Reader small image

You're reading from  Power BI Machine Learning and OpenAI

Product typeBook
Published inMay 2023
Reading LevelIntermediate
PublisherPackt
ISBN-139781837636150
Edition1st Edition
Languages
Concepts
Right arrow
Author (1)
Greg Beaumont
Greg Beaumont
author image
Greg Beaumont

Greg Beaumont is a data architect at Microsoft, where he enjoys identifying and solving complex problems backed by his experience in data architecture and a passion for innovation. Focusing on the healthcare industry, Greg works closely with customers to plan enterprise analytics strategies, evaluate new tools and products, conduct training sessions and hackathons, and architect solutions that improve the quality of care and reduce costs. He strives to be a trusted advisor to his customers and is always seeking new ways to drive progress and help organizations thrive. He is a veteran of the Microsoft data speaker network and has worked with hundreds of customers on their data management and analytics strategies.
Read more about Greg Beaumont

Right arrow

Considerations for ML

Now that you’ve created a preliminary data model that will serve as the basis for analytic reporting in Power BI, you start thinking about a process for creating tables of data to be used with Power BI machine learning. You will need to create a single table of flattened data for each machine learning model that you train, test, and deploy.

Creating tables of data to train a machine learning model entails treating each column as a feature of the algorithm that you will be training and then using to make predictions. For example, if you wanted to create a machine learning algorithm that predicts whether something is an insect, the features (ML terminology for columns on a single table) might be [Six Legs Y/N?], [Life Form Y/N?], [Count of Eyes], and [Weight], and then a column that will be predicted, such as [Insect Y/N?]. Each row would represent something that is being evaluated for a prediction to answer the question, “Is this an insect?”

You decide to take the following approach, in the following order, so that you can do everything within Power BI:

  1. Data exploration and initial data model creation in Power BI Desktop Power Query.
  2. Analytic report created in Power BI.
  3. Feature discovery in Power BI.
  4. Create training data sets in Power Query.
  5. Move training data sets to Power BI dataflows.
  6. Train, test, deploy a Power BI machine learning model in Power BI dataflows.

This process is shown in Figure 1.22.

Figure 1.22 – All of the ETL (extract, transform, load) will happen in Power BI Power Query and Power BI dataflows

Figure 1.22 – All of the ETL (extract, transform, load) will happen in Power BI Power Query and Power BI dataflows

Power BI ML offers three different types of predictive model types. Those types, as defined in the Power BI service, are as follows:

  • A binary prediction model predicts whether an outcome will be achieved. Effectively, a prediction of “Yes” or “No” is returned.
  • General classification models predict more than two possible outcomes such as A, B, C, or D.
  • A regression model will predict a numeric value along a spectrum of possible values. For example, it will predict the costs of an event based on similar past events.

As part of your preliminary planning, you consider how these options could map to the deliverables that were prioritized by your stakeholders:

  • Analytic report: This deliverable will be a Power BI analytic report and could use some Power BI AI features, but it will not be a Power BI ML model. The analytic report will help you explore and identify the right data for Power BI machine learning models.
  • Predict damage: Predicting whether or not damage will result from a wildlife strike is a good match for a binary prediction model since the answer will have two possible outcomes: yes or no.
  • Predict size: Predicting the size of the wildlife that struck an aircraft based upon factors such as damage cost, damage location, height, time of year, and airport location will probably have multiple values that can be predicted such as Large, Medium, and Small. This requirement could be a good fit for a general classification model.
  • Predict height: This deliverable predicts the height at which wildlife strikes will happen and provides that prediction as a numeric value representing height above ground level in feet. It is likely a good fit for a regression model, which predicts numeric values.

There is no way of knowing with certainty whether the FAA Wildlife Strike data will support these specific use cases, but you won’t know until you try! Discovery is a key part of the process. First, you must identify features in the data that might have predictive value, and then train and test the machine learning models in Power BI. Only then will you know what types of predictions might be possible for your project.

Previous PageNext Page
You have been reading a chapter from
Power BI Machine Learning and OpenAI
Published in: May 2023Publisher: PacktISBN-13: 9781837636150
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime

Author (1)

author image
Greg Beaumont

Greg Beaumont is a data architect at Microsoft, where he enjoys identifying and solving complex problems backed by his experience in data architecture and a passion for innovation. Focusing on the healthcare industry, Greg works closely with customers to plan enterprise analytics strategies, evaluate new tools and products, conduct training sessions and hackathons, and architect solutions that improve the quality of care and reduce costs. He strives to be a trusted advisor to his customers and is always seeking new ways to drive progress and help organizations thrive. He is a veteran of the Microsoft data speaker network and has worked with hundreds of customers on their data management and analytics strategies.
Read more about Greg Beaumont