Power BI Machine Learning and OpenAI

Product type: Book
Published: May 2023
Publisher: Packt
ISBN-13: 9781837636150
Pages: 308
Edition: 1st Edition
Author: Greg Beaumont

Table of Contents (21 chapters)

Preface
Part 1: Data Exploration and Preparation
  • Chapter 1: Requirements, Data Modeling, and Planning
  • Chapter 2: Preparing and Ingesting Data with Power Query
  • Chapter 3: Exploring Data Using Power BI and Creating a Semantic Model
  • Chapter 4: Model Data for Machine Learning in Power BI
Part 2: Artificial Intelligence and Machine Learning Visuals and Publishing to the Power BI Service
  • Chapter 5: Discovering Features Using Analytics and AI Visuals
  • Chapter 6: Discovering New Features Using R and Python Visuals
  • Chapter 7: Deploying Data Ingestion and Transformation Components to the Power BI Cloud Service
Part 3: Machine Learning in Power BI
  • Chapter 8: Building Machine Learning Models with Power BI
  • Chapter 9: Evaluating Trained and Tested ML Models
  • Chapter 10: Iterating Power BI ML models
  • Chapter 11: Applying Power BI ML Models
Part 4: Integrating OpenAI with Power BI
  • Chapter 12: Use Cases for OpenAI
  • Chapter 13: Using OpenAI and Azure OpenAI in Power BI Dataflows
  • Chapter 14: Project Review and Looking Forward
Index
Other Books You May Enjoy

Iterating Power BI ML models

In Chapter 8, you trained Power BI ML models using all of the features that you had selected for each of the three ML models – that is, Predict Damage ML, Predict Size ML, and Predict Height ML – using data from the FAA Wildlife Strike database. In Chapter 9, you evaluated the test results of the automated training and testing process that is part of Power BI. The test results helped you understand the strengths and weaknesses of the predictive models, along with details about features that contributed to correct predictions.

This chapter will revisit the findings from Chapter 9 and use them to decide whether you need to modify and retrain the ML models to achieve better results via iterative development. The list of features used to train these ML models can be whittled down, the filter criteria can be adjusted, and the results of the new round of training and testing can be compared to those from Chapter 9.

Technical requirements

The requirements for this chapter are the same as for the preceding chapters:

  • FAA Wildlife Strike data files from either the FAA website or the Packt GitHub site
  • A Power BI Pro license
  • One of the following Power BI licensing options for access to Power BI dataflows:
    • Power BI Premium
    • Power BI Premium Per User
  • One of the following options for getting data into the Power BI cloud service:
    • Microsoft OneDrive (with connectivity to the Power BI cloud service)
    • Microsoft Access and Power BI Gateway
    • Azure Data Lake (with connectivity to the Power BI cloud service)

Considerations for ML model iterations

Numerous books have been written about ML and the reasons why ML models perform well or poorly, including books from Packt Publishing. The purpose of this book is to help you learn Power BI so that you can explore the FAA Wildlife Strike data, analyze that data, and then create SaaS ML models. At this point in the book, you are at a crossroads. Do you continue to iterate these ML models in the SaaS tool? Have you demonstrated enough value to hand an ML model project over to a data science team who will improve upon the model using Azure ML or other advanced tools? Or do you go back to your stakeholders, report your findings, and ask for guidance on the next steps? The following diagram shows a few options for the next steps you could consider:

Figure 10.1 – Possible next steps for your Power BI ML models

Rather than diving into the technicalities of ML theory, you will focus on a few possible causes of inaccuracy that...

Assessing the Predict Damage binary prediction ML model

The Predict Damage ML model that you built and reviewed in the previous two chapters is designed to predict the likelihood that damage was reported due to wildlife striking an aircraft. A few key metrics from the training report for that binary prediction model can be seen in the following table:

Metric name | Metric value | Comments
Area Under the Curve (AUC) | 91% | The AUC indicates the performance of an ML model, with 100% being perfect. 50% would be random guessing, while less than 50% indicates predictions worse than random guessing.
Row Count for Training | 23,356 | The number of rows used to train the ML model.
Row Count for Testing | ... |

Assessing the Predict Size ML classification model

The Predict Size ML model was an attempt at building an ML classification model to predict if the size of a wildlife strike was Small, Medium, or Large. The following table shows some key metrics about the initial version of the ML model:

Metric name | Metric value | Comments
AUC | 60% | The AUC indicates the performance of an ML model, with 100% being perfect. 60% is better than random guessing, but not very good!
Row Count for Training | 11,368 | Number of rows used to train the ML model
Row Count for Testing | 2,841 | Number of rows used to test against the trained ML model

Figure 10.5 – Key metrics...
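
To make the AUC benchmarks in the tables above more concrete (100% is perfect, 50% is random guessing, below 50% is worse than guessing), here is a minimal Python sketch of one common way to compute a binary AUC: the share of positive/negative row pairs in which the positive row receives the higher score. This is an illustration only and is not taken from Power BI, which calculates the metric for you in the training report. The function name auc_from_scores and the sample labels and scores are hypothetical.

```python
def auc_from_scores(labels, scores):
    """Binary AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class (for example, damage reported), 0 otherwise.
    scores: the model's predicted probability for the positive class.
    Ties between a positive and a negative row count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative row")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical data: three rows with damage (1) and three without (0).
labels = [1, 1, 1, 0, 0, 0]

# Every positive row outscores every negative row.
perfect = auc_from_scores(labels, [0.9, 0.8, 0.7, 0.3, 0.2, 0.1])

# Identical scores carry no information about the label.
uninformative = auc_from_scores(labels, [0.5, 0.5, 0.5, 0.5, 0.5, 0.5])

# Scores that rank every negative row above every positive row.
inverted = auc_from_scores(labels, [0.1, 0.2, 0.3, 0.7, 0.8, 0.9])

print(perfect, uninformative, inverted)
```

The three cases correspond to the benchmarks in the Comments column: a perfect ranking, scores no better than guessing, and a ranking that is consistently wrong.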

Assessing the Predict Height ML regression model

The Predict Height ML model is a regression model that’s designed to predict the height at which an aircraft was impacted by wildlife. Based on the features in the report, the regression ML model predicts a numeric value representing the height, in feet from the ground, at which an impact happened. Features such as Speed, Distance, and Phase of Flight were listed as top predictors.

80% of the variation in the testing results is explained by the model. Is 80% good? It depends on the use case and the requirements! If the explained variation (R squared) is 100%, then the ML model will give perfect predictions. 80% could indicate that the predictions are good but that independent and random variables make 100% impossible to reach. Or, maybe a higher value is possible and the data is either missing important features or contains inaccurate measurements.
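
As a rough illustration of what "variation explained" means, the following Python sketch computes R squared with the standard formula, 1 minus the ratio of the residual sum of squares to the total sum of squares. It is a sketch under stated assumptions, not the calculation Power BI performs internally; the function name r_squared and the height values are hypothetical.

```python
def r_squared(actual, predicted):
    """Share of the variation in `actual` that the predictions explain."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")

    mean_actual = sum(actual) / len(actual)
    # Total variation of the observed values around their mean.
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    # Variation left unexplained by the predictions.
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    if ss_total == 0:
        raise ValueError("R squared is undefined when all actual values are equal")
    return 1 - ss_residual / ss_total


# Hypothetical strike heights in feet, not taken from the FAA data.
heights = [0.0, 500.0, 1000.0, 1500.0, 2000.0]

# Predictions that match every observation explain all of the variation.
perfect = r_squared(heights, heights)

# Always predicting the average height explains none of the variation.
mean_only = r_squared(heights, [1000.0] * len(heights))

# Predictions that are close but not exact fall somewhere in between.
noisy = r_squared(heights, [200.0, 400.0, 1100.0, 1300.0, 2000.0])

print(perfect, mean_only, noisy)
```

Read this way, a value of 80% means the predictions remove 80% of the squared error that you would get by always predicting the average height.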

In this use case, common sense dictates that explaining 100% of the variation would be impossible. You...

Summary

In this chapter, you reviewed each of the ML models that you have built. You decided to seek guidance on the next steps for the Predict Damage ML model from either a data science team or your stakeholders. For the Predict Size ML model, you found only slight predictive value and will need to seek guidance for your next course of action. The Predict Height ML model improved when you added new filter criteria and whittled down the feature selection, and the results are promising. At this point, you must either work with a data science team or circle back with your stakeholders for guidance on future plans for the model.

In Chapter 11, you will bring in newly added data from the FAA Wildlife Strike database and run it through your Predict Damage ML model to test the results. In doing so, you will learn how to score new data with your ML model whenever data refreshes in Power BI. You will also explore opportunities to find new value by adding Microsoft OpenAI capabilities to...
