You're reading from  Power BI Machine Learning and OpenAI

Product type: Book
Published in: May 2023
Reading Level: Intermediate
Publisher: Packt
ISBN-13: 9781837636150
Edition: 1st Edition
Author (1)
Greg Beaumont

Greg Beaumont is a data architect at Microsoft, where he enjoys identifying and solving complex problems backed by his experience in data architecture and a passion for innovation. Focusing on the healthcare industry, Greg works closely with customers to plan enterprise analytics strategies, evaluate new tools and products, conduct training sessions and hackathons, and architect solutions that improve the quality of care and reduce costs. He strives to be a trusted advisor to his customers and is always seeking new ways to drive progress and help organizations thrive. He is a veteran of the Microsoft data speaker network and has worked with hundreds of customers on their data management and analytics strategies.

Discovering New Features Using R and Python Visuals

In Chapter 5 of this book, you used Power BI Desktop to discover new columns and features for your queries, which will be migrated to the Power BI cloud service to train and test ML models. While exploring the data, you also expanded the Power BI report that you will deliver to end users for interactive data exploration.

Before migrating your solution to the Power BI cloud service, you will take one last pass over the FAA Wildlife Strike data and look for additional features to add to the ML queries that will be used to build the Power BI ML models. To add more varied capabilities to the analytic report, you’ll use R and Python visuals within Power BI. At the end of this chapter, you will be ready to publish your solution to the Power BI cloud service.

Technical requirements

For this chapter, you’ll need the following resources:

Exploring data with R visuals

Power BI has the capability to run R scripts and display R visuals. R is a powerful language that is often used by data scientists for statistics and ML. You will need to install R on your local machine to use it with Power BI Desktop per the instructions at https://learn.microsoft.com/en-us/power-bi/create-reports/desktop-r-visuals.

There are numerous R visualizations that can be useful for data analysis and finding new features for ML models. The FAA Wildlife Strike data contains several True/False flags related to the portions of the aircraft struck, the location of damage, the ingestion of animals into the engines, and more. These values should fit nicely on an R correlation plot, which will graphically show flags that tend to correlate either positively or negatively. Let’s give it a shot!

You will follow three steps to find new features:

  1. Prepare the data for the R correlation plot.
  2. Build the R correlation plot visualization...
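To make the idea behind a correlation plot concrete, here is a minimal sketch of the underlying calculation. It is written in Python rather than R and is not the visual the chapter builds: it only computes the pairwise Pearson correlation between True/False flags treated as 1/0. The column names and rows are made-up stand-ins, not values from the FAA Wildlife Strike data.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a flag that never varies has no defined correlation
    return cov / sqrt(var_x * var_y)


def flag_correlations(rows, flags):
    """Return {(flag_a, flag_b): correlation} for every pair of flags."""
    # Encode True/False as 1/0 so the flags can be correlated numerically.
    columns = {f: [1 if row[f] else 0 for row in rows] for f in flags}
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(flags, 2)
    }


# Hypothetical flag names and made-up rows, for illustration only.
sample_rows = [
    {"Struck Engine": True, "Ingested": True, "Damaged Wing": False},
    {"Struck Engine": True, "Ingested": True, "Damaged Wing": False},
    {"Struck Engine": False, "Ingested": False, "Damaged Wing": True},
    {"Struck Engine": False, "Ingested": False, "Damaged Wing": True},
]

correlations = flag_correlations(
    sample_rows, ["Struck Engine", "Ingested", "Damaged Wing"]
)
for pair, value in correlations.items():
    print(pair, value)
```

In this toy sample, flags that always occur together score 1.0 and flags that never occur together score -1.0. A correlation plot encodes those same values graphically.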

Exploring data with Python visuals

In addition to R, Power BI also supports Python queries and visuals. Python is a very popular language that is also frequently used by data scientists. Per the requirements at the beginning of this chapter, you’ll need to install Python on your local machine for Power BI Desktop: https://learn.microsoft.com/en-us/power-bi/connect-data/desktop-python-visuals.

In the FAA Wildlife Strike data, Height and Speed are both fields that can be recorded for reports. Height is the distance in feet above the ground at which an incident happened, while Speed is how fast the aircraft was traveling when it was struck by wildlife. You will look at both of these metrics using Python histograms so that you can compare the distribution of those values when selecting different filters.

You will follow these steps:

  1. Prepare the data for the Python histogram.
  2. Build the Python histogram visualization and add it to your report...
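As a rough illustration of what a histogram computes, the sketch below groups values into fixed-width bins using only the Python standard library. It is not the chapter's visual script. In a Power BI Python visual, the selected fields arrive as a pandas DataFrame named `dataset`, and a plotting library would normally draw the chart. The sample values and bin widths here are assumptions chosen for illustration.

```python
def histogram(values, bin_width):
    """Count values into fixed-width bins; returns {bin_start: count}.

    Empty (None) values are skipped rather than counted as zero.
    """
    counts = {}
    for v in values:
        if v is None:
            continue
        start = int(v // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))


def print_histogram(title, counts):
    """Print a simple text histogram, one row per non-empty bin."""
    print(title)
    for start, count in counts.items():
        print(f"{start:>6}+ | {'#' * count}")


# Made-up sample values; None stands in for an empty field.
heights = [0, 50, 120, 400, 450, 1500, None, 30]  # feet
speeds = [120, 140, 135, None, 250, 90]           # units as recorded in the report

height_bins = histogram(heights, 500)
speed_bins = histogram(speeds, 50)

print_histogram("Height", height_bins)
print_histogram("Speed", speed_bins)
```

Rerunning the same binning on a filtered subset of rows and comparing the results is the text equivalent of watching the histograms change as you select different filters in the report.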

Adding new features to the ML queries

So far in this chapter, you have identified numerous new features to be added to the Predict Damage, Predict Size, and Predict Height ML queries for your Power BI ML models. As you did in section three of Chapter 5, Adding New Features to the ML Queries in Power Query, you can add these features to the ML queries in Power Query:

  1. Double-click on Remove Other Columns under Applied Steps.
  2. Add each of the features in Figure 6.9 (also include Speed and Height).

Your screen should look like this while adding the features:

Figure 6.15 – Select the columns to be added to the ML query

After adding the new features, you may note that Height and Speed both contain some empty values. Since these are not categorical fields, there is no simple option for adding a text value such as empty. For example, with Speed, the impact of a collision at 5 knots versus 500 knots should be expected to be very different...
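Before deciding how to treat those gaps, it can help to measure them. The following sketch counts how many rows have an empty Height or Speed. It assumes rows are plain dictionaries with `None` for empty values, and the sample rows are made up. It only profiles the gaps and does not prescribe a way to fill them.

```python
def missing_summary(rows, fields):
    """Return {field: (missing_count, missing_share)}, treating None as empty."""
    total = len(rows)
    summary = {}
    for field in fields:
        # Compare against None explicitly: a Height of 0 is a real value, not a gap.
        missing = sum(1 for row in rows if row.get(field) is None)
        share = missing / total if total else 0.0
        summary[field] = (missing, share)
    return summary


# Made-up rows for illustration only.
sample = [
    {"Height": 100, "Speed": 120},
    {"Height": None, "Speed": 140},
    {"Height": 0, "Speed": None},
    {"Height": None, "Speed": None},
]

summary = missing_summary(sample, ["Height", "Speed"])
for field, (count, share) in summary.items():
    print(f"{field}: {count} empty ({share:.0%})")
```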

Summary

In this chapter, you added R and Python visuals to your Power BI reports to discover new features in the FAA Wildlife Strike data. Using an R correlation plot, you were able to interactively slice and dice several incident flag values for positive and negative correlations. With Python histograms you took a look at the impact of speed and height on the outcomes for your planned Power BI ML models. Finally, you added new features to your Predict Damage, Predict Size, and Predict Height ML queries that will be used for ML in Power BI.

In the next chapter, you will begin migrating content to the Power BI cloud service. After migrating the Power BI dataset and report, you will then migrate the Power Query scripts to dataflows for use with Power BI ML.
