You're reading from Power BI Machine Learning and OpenAI

Product type: Book
Published in: May 2023
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781837636150
Edition: 1st Edition
Author: Greg Beaumont

Greg Beaumont is a data architect at Microsoft, where he enjoys identifying and solving complex problems backed by his experience in data architecture and a passion for innovation. Focusing on the healthcare industry, Greg works closely with customers to plan enterprise analytics strategies, evaluate new tools and products, conduct training sessions and hackathons, and architect solutions that improve the quality of care and reduce costs. He strives to be a trusted advisor to his customers and is always seeking new ways to drive progress and help organizations thrive. He is a veteran of the Microsoft data speaker network and has worked with hundreds of customers on their data management and analytics strategies.

Use Cases for OpenAI

In the previous chapter, you scored fresh data with the Power BI ML models and compared the output against the automated testing that Power BI performed during the training phase. The FAA Wildlife Strike database provided fresh real-world data from outside the training and testing datasets. This approach could serve as a framework for scheduling the scoring of new data with a Power BI ML model in combination with dataflows. The newly scored data produced outcomes that were relatively consistent with the results expected from the testing data.

In this chapter, your stakeholders task you with incorporating OpenAI functionality into the solution. OpenAI is attracting a lot of attention in the IT sector, and the project is underway while that interest is high. Although this entails a change in scope, the project’s beneficiaries are fully supportive of and optimistic about this initiative...

Technical requirements

The requirements are slightly different for this chapter:

  • An account with OpenAI (the original service, as distinct from Azure OpenAI): https://openai.com/.
  • Optional – Azure OpenAI in your Azure subscription: https://azure.microsoft.com/en-us/products/cognitive-services/openai-service. The book is written so that this is optional, since Azure OpenAI was not available to everyone at the time of publication.
  • FAA Wildlife Strike data files from either the FAA website or the Packt GitHub site.
  • A Power BI Pro license.
  • One of the following Power BI licensing options for access to Power BI dataflows:
    • Power BI Premium
    • Power BI Premium Per User
  • One of the following options for getting data into the Power BI cloud service:
    • Microsoft OneDrive (with connectivity to the Power BI cloud service)
    • Microsoft Access + Power BI Gateway
    • Azure Data Lake (with connectivity to the Power BI cloud service)

Generating descriptions with OpenAI

Our first step is to identify a suitable use case for using GPT models to generate descriptions of elements of the FAA Wildlife Strike data. The objective is to enrich the data with external knowledge by writing prompts that ask GPT models for detailed information about the elements we are working with. Through this use case, we will explore the value that GPT models can add to data analysis and interpretation.

For example, a description of the FAA Wildlife Strike database by ChatGPT might look like this:

Figure 12.2 – OpenAI ChatGPT description of FAA Wildlife Strike database

Within your solution using the FAA Wildlife Strike database, you have data that could be tied to external data using the GPT models. A few examples include additional information about the following:

  • Airports
  • FAA regions
  • Flight operators
  • Aircraft
  • Aircraft...
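One way to picture this use case is a small prompt builder that produces one description prompt per distinct value in a column. The following is a minimal sketch, not the book's implementation: the function names, the prompt wording, and the sample aircraft names are all assumptions made for illustration, and the sample values are not taken from the FAA data.

```python
# Hypothetical helpers: the names and prompt wording are illustrative assumptions.

def build_description_prompt(category: str, value: str) -> str:
    """Return a prompt asking a GPT model to describe one data element."""
    return (
        f"Write a short, factual description of the {category} "
        f'"{value.strip()}" for readers of an aviation safety report.'
    )


def build_prompts(category: str, values: list[str]) -> dict[str, str]:
    """Build one prompt per distinct, non-empty value of a column."""
    prompts: dict[str, str] = {}
    for value in values:
        cleaned = value.strip()
        # Skip blanks and values already seen, so each element is described once.
        if cleaned and cleaned not in prompts:
            prompts[cleaned] = build_description_prompt(category, cleaned)
    return prompts


# Made-up sample values, including a duplicate and a blank entry.
aircraft_prompts = build_prompts(
    "aircraft", ["Boeing 737-800", "Airbus A320", "Boeing 737-800", " "]
)
for name, prompt in aircraft_prompts.items():
    print(name, "->", prompt)
```

Deduplicating before prompting matters because the same airport, operator, or aircraft appears in many rows, and each model call has a cost.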

Summarizing data with OpenAI

You can also use OpenAI GPT models to summarize data. Numerous databases feature free text fields that comprise entries from a diverse array of sources, including survey results, physician notes, feedback forms, and comments regarding incident reports for the FAA Wildlife Strike database that we have used in this book. These text entry fields represent a wide range of content, from structured data to unstructured data, making it challenging to extract meaning from them without the assistance of sophisticated natural language processing tools.
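As a rough illustration of what a summarization request for a free-text field might look like, the sketch below wraps one entry in a prompt. The function name, the prompt wording, the word limit, and the sample remark are assumptions for illustration only; the sample remark is invented and not drawn from the database.

```python
# Hypothetical helper: the prompt wording and word limit are illustrative assumptions.

def build_summary_prompt(remark: str, max_words: int = 25) -> str:
    """Return a prompt asking a GPT model to summarize one free-text entry."""
    # Collapse runs of whitespace, which are common in hand-entered text.
    cleaned = " ".join(remark.split())
    if not cleaned:
        raise ValueError("remark is empty; nothing to summarize")
    return (
        f"Summarize the following incident remark in at most {max_words} words. "
        "Use only information stated in the remark.\n\n"
        f"Remark: {cleaned}"
    )


# Invented sample text with irregular spacing.
sample_remark = "  BIRD STRIKE ON DEPARTURE.   NO DAMAGE FOUND  DURING INSPECTION. "
summary_prompt = build_summary_prompt(sample_remark)
print(summary_prompt)
```

Instructing the model to use only what the remark states is one simple guard against summaries that add details the original entry never contained.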

The Remarks field of the FAA Wildlife Strike database contains text that was presumably entered by people involved in filling out incident forms about aircraft striking wildlife. A few examples of the remarks for recent entries are shown in Power BI in the following screenshot:

Figure 12.6 – Examples of remarks from the FAA Wildlife Strike database

You will notice that the remarks...

Choosing GPT models for your use cases

OpenAI and Azure OpenAI offer several GPT models that can be called iteratively through an API. At the time of writing, the new GPT-4 models, which are the latest releases, have limited availability. The GPT-3.5 models are available in both OpenAI and Azure OpenAI, with a few different options. The following information was retrieved on March 26, 2023, from the OpenAI website: https://platform.openai.com/docs/models/gpt-4.
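Calling a model iteratively through an API amounts to building one HTTP request per prompt. The sketch below prepares such requests without sending them. It assumes the OpenAI completions endpoint and request fields as documented around the time of the book, and it uses the text-davinci-003 model named in this chapter's summary as a default. The function name and the placeholder key are hypothetical. Model availability changes over time, so the model name is a parameter, and Azure OpenAI uses a different URL scheme and authentication header.

```python
import json
import urllib.request

# Assumed endpoint: OpenAI's completions API as documented when the book was written.
API_URL = "https://api.openai.com/v1/completions"


def build_request(
    prompt: str,
    api_key: str,
    model: str = "text-davinci-003",
    max_tokens: int = 100,
) -> urllib.request.Request:
    """Prepare (but do not send) one completion request for one prompt."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,  # upper bound on generated tokens
        "temperature": 0,          # favor repeatable output
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Illustrative prompts; "YOUR_API_KEY" is a placeholder, not a real key.
prompts = ["Describe the FAA Eastern Region.", "Describe the FAA Western-Pacific Region."]
requests_to_send = [build_request(p, api_key="YOUR_API_KEY") for p in prompts]

# Sending is deliberately left out; with a valid key, each request could be
# passed to urllib.request.urlopen(...) and the JSON response parsed.
print(len(requests_to_send), "requests prepared")
```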

...

Summary

In this chapter, you explored the fundamental concepts of OpenAI and Microsoft Azure OpenAI, and how these platforms can be used to generate and summarize text. You also explored several options for integrating GPT models from both OpenAI and Azure OpenAI into your Power BI solution using FAA Wildlife Strike data. After evaluating those options, you chose the text-davinci-003 GPT model to summarize the remarks in FAA Wildlife Strike data reports and to generate new descriptive information about the aircraft in those reports.

Chapter 13 will be dedicated to the implementation of functions within Power BI dataflows, enabling the seamless calling of OpenAI and Azure OpenAI REST APIs for data. These APIs will facilitate the successful implementation of your summarization and descriptive generation use cases, thereby providing new capabilities for your solution to address the challenges...

| Latest model  | Description | Max tokens   | Training data        |
|---------------|-------------|--------------|----------------------|
| gpt-3.5-turbo | Most capable GPT-3.5 model and optimized for chat at one-tenth the cost of text-davinci-003. Will be updated with our latest model iteration. | 4,096 tokens | Up to September 2021 |
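The maximum token figure affects how long a prompt and its response can be. The sketch below uses the 4,096-token limit from the gpt-3.5-turbo entry above and assumes that the limit is shared between the prompt and the generated completion. The function name is hypothetical, and the prompt token count is supplied by the caller rather than computed here.

```python
# Limit taken from the gpt-3.5-turbo entry above; other models would need their own entries.
MAX_TOKENS = {"gpt-3.5-turbo": 4096}


def completion_budget(model: str, prompt_tokens: int) -> int:
    """Return how many tokens remain for the completion after the prompt.

    Assumes the model's limit covers prompt and completion together.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = MAX_TOKENS[model] - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt alone meets or exceeds the model's token limit")
    return remaining


print(completion_budget("gpt-3.5-turbo", 3500))  # prints 596
```

A check like this is most useful for long free-text fields, where a single entry plus its instructions may leave little room for the response.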