You're reading from  Power BI Machine Learning and OpenAI

Product type: Book
Published in: May 2023
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781837636150
Edition: 1st Edition
Author: Greg Beaumont

Greg Beaumont is a data architect at Microsoft, where he enjoys identifying and solving complex problems backed by his experience in data architecture and a passion for innovation. Focusing on the healthcare industry, Greg works closely with customers to plan enterprise analytics strategies, evaluate new tools and products, conduct training sessions and hackathons, and architect solutions that improve the quality of care and reduce costs. He strives to be a trusted advisor to his customers and is always seeking new ways to drive progress and help organizations thrive. He is a veteran of the Microsoft data speaker network and has worked with hundreds of customers on their data management and analytics strategies.

Deploying Data Ingestion and Transformation Components to the Power BI Cloud Service

In Chapter 6, you finalized the base design for your ML queries, which will be migrated to the Power BI cloud service to train and test ML models. You focused on using R and Python visuals within Power BI Desktop to visualize and evaluate potential features for these ML queries.

This chapter will be an adventure into the Power BI cloud service. You will migrate your work in Power Query to dataflows and publish your Power BI dataset and report to a Power BI workspace. The process of moving these queries is a repetitive but necessary step for your end-to-end project, the workshop that runs in parallel with this book. An experienced Power BI developer can probably move through this chapter quickly by cutting and pasting the M queries from GitHub. By the end of this chapter, your content will be fully migrated to the Power BI cloud service and ready for Power BI ML.

Technical requirements

For this chapter, you’ll need the following resources:

  • Power BI Desktop April 2023 or later (no licenses required)
  • FAA Wildlife Strike data files from either the FAA website or the Packt GitHub site
  • Power BI Pro license
  • One of the following Power BI licensing options for access to Power BI dataflows:
    • Power BI Premium
    • Power BI Premium Per User
  • One of the following options for getting data into the Power BI cloud service:
    • Microsoft OneDrive (with connectivity to the Power BI cloud service)
    • Microsoft Access and Power BI Gateway
    • Azure Data Lake (with connectivity to the Power BI cloud service)

Creating a Power BI workspace

Before we start importing content into the Power BI cloud service, you will need a workspace for the project. A workspace is a way to organize, secure, and govern content in the Power BI cloud service. For this project, you need a workspace that supports the use of both dataflows and ML, which at the time of writing requires either Power BI Premium with a Pro license or a Premium Per User license. If you do not have either of these licenses, you can still follow along with this book for learning purposes and explore the code samples in the Packt GitHub repository.

Workspaces can be extended to include integration with security capabilities, information protection, deployment pipelines for life cycle management, and more. This book will only cover how to create a basic workspace since extensive documentation about workspaces is available online. A tutorial for creating workspaces can be found at https://learn.microsoft.com/en-us/power-bi/collaborate...
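If you prefer to script workspace creation rather than use the browser, the Power BI REST API exposes a Create Group operation. The following is a minimal sketch, not the method used in this book: it only builds the HTTP request and does not send it. The function name and the workspace name are hypothetical, and obtaining a valid Azure AD access token is assumed to happen elsewhere. Creating a workspace this way does not by itself assign Premium or Premium Per User capacity, which remains a separate step.

```python
import json
import urllib.request

API_ROOT = "https://api.powerbi.com/v1.0/myorg"


def build_create_workspace_request(workspace_name, access_token):
    """Build (but do not send) a POST request that creates a workspace.

    Hypothetical helper for illustration; the access token is assumed
    to have been acquired separately.
    """
    body = json.dumps({"name": workspace_name}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_ROOT}/groups?workspaceV2=True",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Example with placeholder values; nothing is sent over the network here.
request = build_create_workspace_request("FAA Wildlife Strike ML", "<access-token>")
print(request.get_method(), request.full_url)
```

To actually create the workspace, you would pass the request to urllib.request.urlopen with a real token in place of the placeholder.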

Publishing your Power BI Desktop dataset and report to the Power BI cloud service

Next, you must import your dataset and report from Power BI Desktop into the Power BI cloud service. Once published to the cloud service, you will be able to share the analytical report with others who are stakeholders in the project. You will also be able to view the reports on the Power BI mobile app if you want to dive into the data while on the go.

The work that you have done up to this point used Power BI Desktop on your local machine. You have two options for migrating this content to the Power BI cloud service:

  • Publish the dataset and report to your workspace directly from Power BI Desktop
  • Import the .pbix file from the Power BI service

Both options are equally simple. For this tutorial, you will import the .pbix file from the Power BI service.

From your newly created Power BI workspace, select Upload | OneDrive for Business, and select...
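For readers who want to automate the upload instead of clicking through the service, the Power BI REST API also offers an import operation for .pbix files. The sketch below only assembles the endpoint URL; the function name, workspace ID, and file name are hypothetical placeholders, and the actual upload (a multipart/form-data POST with a bearer token) is left out.

```python
from urllib.parse import quote, urlencode

API_ROOT = "https://api.powerbi.com/v1.0/myorg"


def build_import_url(workspace_id, dataset_display_name,
                     name_conflict="CreateOrOverwrite"):
    """Return the import endpoint URL for uploading a .pbix to a workspace.

    Hypothetical helper for illustration. The file itself would be sent
    as the multipart/form-data body of a POST to this URL.
    """
    query = urlencode({
        "datasetDisplayName": dataset_display_name,
        "nameConflict": name_conflict,
    })
    return f"{API_ROOT}/groups/{quote(workspace_id, safe='')}/imports?{query}"


# Placeholder workspace ID and file name, for illustration only.
print(build_import_url("00000000-0000-0000-0000-000000000000",
                       "FAA Wildlife Strikes.pbix"))
```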

Creating Power BI dataflows with connections to source data

The Power Query work you did in Power BI Desktop in previous chapters connects to data sources on your local machine. Power BI dataflows work much like Power Query, but connectivity originates from the Power BI cloud service, so when creating your dataflows, you will need to consider how the service will reach the data sources. Earlier in this book, you determined that the sources of data for the FAA Wildlife Strike database were as follows:

  • wildlife.accdb: All of the historical FAA Wildlife Strike reports. This file is an Access database that’s been downloaded in ZIP file format from the FAA website.
  • read_me.xls: Descriptive information about the data in the wildlife.accdb database file. This file is an Excel file that was downloaded within the same ZIP file as the Access database. The file has been converted to the .xlsx format in the Packt GitHub repository and is available in the folder at https://github.com/PacktPublishing...

Adding a dataflow for ML queries

Now that you’ve ingested, cleaned up, and transformed the data from the FAA Wildlife Strike database, you can build out your specialized queries for Power BI ML models. Before you get started, note that Power BI ML is a version of Azure AutoML that has been built into Power BI as a SaaS offering. Data science teams using advanced tools will often apply further transformations to data, such as imputing missing values, normalizing numeric ranges, and weighting features within a model. These advanced feature transformations won’t be covered in this book since AutoML has featurization capabilities to optimize data for ML. The queries you will be creating could probably be improved with advanced featurization techniques, but for this project, we will keep things simple and let the AutoML featurization capabilities in Power BI ML handle some of that work.
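To make two of those transformations concrete, here is a minimal sketch of median imputation and min-max normalization in plain Python. It is not what Power BI ML does internally, and you do not need it for the workshop; it only illustrates the kind of manual featurization a data science team might apply. The function names and the sample values are hypothetical.

```python
from statistics import median


def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill_value = median(observed)
    return [fill_value if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    low, high = min(values), max(values)
    if high == low:
        # A constant column carries no range information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Hypothetical numeric column with two missing entries.
column = [100, None, 300, 500, None]
imputed = impute_missing(column)      # [100, 300, 300, 500, 300]
scaled = min_max_normalize(imputed)   # [0.0, 0.5, 0.5, 1.0, 0.5]
print(imputed, scaled)
```

Median imputation is used here because it is less sensitive to outliers than the mean; other strategies are equally valid depending on the feature.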

Adding the Predict Damage ML query to a dataflow

You will...

Summary

In this chapter, you migrated queries from Power Query in Power BI Desktop to dataflows in the Power BI cloud service. These queries ingest, prep, and create tables designed for your Power BI dataset. Then, you migrated your ML queries from Power Query in Power BI Desktop to dataflows in the Power BI cloud service. In doing so, you created a new dataflow that is populated by the dataflows you created earlier in this chapter. The new ML Queries dataflow was saved and refreshed in your Power BI workspace.

In Chapter 8, you will begin working with Power BI ML in the cloud. You will use the three ML queries you created here to build and test the Binary Prediction, Categorical, and Regression ML models in Power BI.
