Extending Power BI with Python and R - Second Edition

Extending Power BI with Python and R: Perform advanced analysis using the power of analytical languages, Second Edition

By Luca Zavarella
$15.99 per month
Book Mar 2024 814 pages 2nd Edition
eBook
$43.99 $29.99
Print
$54.99
Subscription
$15.99 Monthly

What do you get with a Packt Subscription?

Free for first 7 days. $15.99 p/m after that. Cancel any time!
  • Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!
  • 50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.
  • Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
  • Thousands of reference materials covering every tech concept you need to stay up to date.

Product Details


Publication date : Mar 29, 2024
Length 814 pages
Edition : 2nd Edition
Language : English
ISBN-13 : 9781837639533

Extending Power BI with Python and R - Second Edition

Where and How to Use R and Python Scripts in Power BI

Power BI is Microsoft’s flagship self-service business intelligence product. It consists of a set of on-premises applications and cloud-based services that help organizations integrate, transform, and analyze data from a wide variety of source systems through a user-friendly interface.

The platform is not limited to data visualization. Its analytics engine (VertiPaq) is the same one used by SQL Server Analysis Services (SSAS), Azure Analysis Services, and Power Pivot in Excel, and it also powers the reports and datasets published to the Power BI service. In addition, Power BI uses Power Query as its data extraction and transformation engine, which is also found in Analysis Services and Excel. Power Query comes with a powerful and versatile formula language (M) and a GUI, with which you can “grind” and shape any type of data into any form.

Moreover, Power BI supports DAX as a data analytics formula language, which can be used for advanced calculations and queries on data that has already been loaded into tabular data models.

Such a versatile and powerful tool is a godsend for anyone who needs to do data ingestion and transformation in order to build dashboards and reports to summarize a company’s business.

Recently, the availability of huge amounts of data, along with the ability to scale the computational power of machines, has made advanced analytics more appealing. New mathematical and statistical tools have therefore become necessary to provide rich insights, which explains the integration of analytical languages such as Python and R within Power BI.

R and Python scripts can be used in Power BI only through specific features. Knowing which Power BI tools let you inject R or Python scripts is key to understanding whether the problem you want to address is achievable with these analytical languages.

This chapter will cover the following topics:

  • Injecting R or Python scripts into Power BI
  • Using R and Python to interact with your data
  • Python and R compatibility across Power BI products

Technical requirements

This chapter requires you to have Power BI Desktop already installed on your machine (you can download it here: https://aka.ms/pbiSingleInstaller). The version used in this chapter is 2.110.1161.0 64-bit (October 2022).

Injecting R or Python scripts into Power BI

In this first section, Power BI Desktop tools that allow you to use Python or R scripts will be presented and described in detail. Specifically, you will see how to add your own code during the data loading, data transforming, and data viewing phases.

Data loading

One of the first steps required to work with data in Power BI Desktop is to import it from external sources.

There are many connectors that allow you to do this, depending on the respective data sources, but you can also do it via scripts in Python and R. In fact, if you click on the Get data icon in the ribbon, not only are the most commonly used connectors shown but you can also select other ones from a more complete list by clicking on More...:

Figure 1.1: Browse more connectors to load your data

In the new Get Data window that pops up, simply start typing the word script into the search box, and immediately the two options for importing data via Python or R appear:

Figure 1.2: Showing R script and Python script in the Get Data window

Reading the contents of the tooltip, obtained by hovering the mouse over the Python script option, two things should immediately jump out at you:

  1. A local installation of Python is required.
  2. What can be imported through Python is a data frame.

The same two observations also apply when selecting R script. The only difference is the data structure involved: with Python you import a pandas DataFrame (a data structure provided by the pandas package), whereas with R you import an R data frame, a two-dimensional, table-like data structure that the language provides by default.

After clicking on the Python script option, a new window will be shown containing a text box for writing the Python code:

Figure 1.3: Window showing the Python script editor

As you can see, it’s definitely a very skimpy editor, but in Chapter 3, Configuring Python with Power BI, you’ll discover how you can utilize your preferred IDE to create your scripts within a more comprehensive and feature-rich editor.

The warning message in the window reminds you that no Python engine has been detected, so one must be installed. If you already have Python installed and configured, you will not see this message. Clicking on the How to install Python link opens a Microsoft Docs web page that explains the steps to install Python.

Microsoft suggests installing the base Python distribution, but in order to follow some best practices on environments (self-contained spaces that allow developers to manage dependencies, libraries, and configurations specific to individual projects), we will install the Miniconda distribution. The details of how to do this and why will be covered in Chapter 3.

If you had clicked on R script instead, a window for entering code in R, similar to the one shown in Figure 1.4, would have appeared:

Figure 1.4: Window showing the R script editor

As with Python, in order to run code in R, you need to install the R engine on your machine. Clicking on the How to install R link will open a Docs page where Microsoft suggests installing either Microsoft R Open or the classic CRAN R. Chapter 2, Configuring R with Power BI, will show you which engine to choose and how to configure your favorite IDE to write code in R.

To import data using Python or R, you write code in the editors shown in Figure 1.3 and Figure 1.4 that assigns a pandas DataFrame or an R data frame, respectively, to a variable. You will see concrete examples throughout this book.
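As a minimal sketch of what such a script could look like in Python, the following assigns a pandas DataFrame to a variable. The variable name sales, its columns, and the sample values are invented for illustration and are not taken from the book; Power BI should then offer the DataFrame variables defined by the script as tables to load.

```python
import pandas as pd

# Hypothetical sample data; in practice it could come from a file,
# an API, or a database that has no native Power BI connector.
sales = pd.DataFrame(
    {
        "region": ["North", "South", "East"],
        "units": [120, 95, 143],
        "unit_price": [9.5, 11.0, 8.75],
    }
)

# A derived column computed before the data reaches Power BI.
sales["revenue"] = sales["units"] * sales["unit_price"]
```

Only tabular results matter here: anything the script leaves in a variable that is not a DataFrame is not a candidate for import.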

Next, let’s look at transforming data.

Data transformation

It is possible to apply a transformation to data already imported or being imported, using scripts in R or Python. Should you want to test this on the fly, you can import the following CSV file directly from the web: http://bit.ly/iriscsv. Follow these steps:

  1. Simply click on Get data and then Web to import data directly from a web page:

Figure 1.5: Select the Web connector to import data from a web page

  2. You can now enter the previously mentioned URL in the window that pops up:

Figure 1.6: Import the Iris data from the web

  3. Right after clicking OK, a window will pop up with a preview of the data to be imported.

    In this case, instead of importing the data as is, click on Transform Data in order to access the Power Query data transformation window:

    Figure 1.7: Imported data preview

    It is at this point that you can add a transformation step using a Python or R script by selecting the Transform tab in Power Query Editor:

    Figure 1.8: R and Python script tools in Power Query Editor

    By clicking on Run Python script, you’ll cause a window, similar to the one you’ve already seen in the data import phase, to pop up:

    Figure 1.9: The Run Python script editor

    If you carefully read the comment in the text box, you will see that the dataset variable is already initialized and contains the data present at that moment in Power Query Editor, including any transformations already applied. At this point, you can insert your Python code in the text box to transform the data into the desired form.

    A similar window will open if you click on Run R script:

    Figure 1.10: The Run R script editor

In this case too, the dataset variable is already initialized and contains the data present at that moment in Power Query Editor. You can then add your own R code and reference the dataset variable to transform your data in the most appropriate way.
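To make the role of the dataset variable more concrete, here is a minimal Python sketch of a transformation step. The column names (sepal_length, sepal_width, species) are assumptions about the Iris CSV, and the stand-in DataFrame at the top exists only so that the sketch runs outside Power BI; in the Run Python script editor you would omit it, because Power BI initializes dataset for you.

```python
import pandas as pd

# Stand-in for the `dataset` variable that Power BI initializes.
# Omit this block inside Power Query Editor; column names are assumed.
dataset = pd.DataFrame(
    {
        "sepal_length": [5.1, 7.0, 6.3],
        "sepal_width": [3.5, 3.2, 3.3],
        "species": ["setosa", "versicolor", "virginica"],
    }
)

# The transformation itself: work on a copy and add a derived column.
transformed = dataset.copy()
transformed["sepal_ratio"] = transformed["sepal_length"] / transformed["sepal_width"]

# Normalize the text column to upper case.
transformed["species"] = transformed["species"].str.upper()
```

Writing the result to a new variable (here, the hypothetical transformed) keeps the incoming data untouched, which makes it easier to compare the step's input and output.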

Next, let’s look at visualizing data.

Data visualization

Finally, your own Python or R scripts can be added to Power BI to create new visualizations, in addition to those already present in the tool out of the box:

  1. Assuming we resume the data import activity started in the previous section, once the Iris dataset is loaded, simply click Cancel in the Run R script window, and then click Close & Apply in the Home tab of Power Query Editor:

Figure 1.11: Click Close & Apply to import the Iris data

  2. After the data import is complete, you can select either the R visual or Python visual option in the Visualizations pane of Power BI:

    Figure 1.12: The R and Python script visuals

    If you click on Python visual, a window pops up asking for permission to enable script code execution, as there may be security or privacy risks:

    Figure 1.13: Enable the script code execution

  3. After enabling code execution, in Power BI Desktop, you can see a placeholder for the Python visual image on the report canvas and a Python script editor pane at the bottom:

Figure 1.14: The Python visual layout

Once you drag the fields you want to use in your Python script into the Values area, you can write your own custom code into the Python script editor and run it to generate a Python visualization.

An almost identical layout appears when you select R visual.
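As an illustration of the kind of code you might place in the Python script editor, the following sketch draws a scatter plot with matplotlib. It assumes that the fields dragged into the Values area are exposed to the script as a DataFrame named dataset, and that the columns are called sepal_length, sepal_width, and species; the stand-in data and the explicit backend selection are there only so that the sketch runs outside Power BI.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, needed only outside Power BI

import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for the `dataset` DataFrame built from the Values fields.
# Column names and values are assumed for illustration.
dataset = pd.DataFrame(
    {
        "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
        "sepal_width": [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
        "species": ["setosa", "setosa", "versicolor",
                    "versicolor", "virginica", "virginica"],
    }
)

# One scatter series per species so that each gets its own color.
fig, ax = plt.subplots(figsize=(6, 4))
for species, group in dataset.groupby("species"):
    ax.scatter(group["sepal_length"], group["sepal_width"], label=species)

ax.set_xlabel("sepal_length")
ax.set_ylabel("sepal_width")
ax.legend(title="species")

# Render the figure; the visual displays the resulting static image.
plt.show()
```

Because the visual shows a rendered image, the script's job is to finish with a complete figure rather than to return a value.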

Using R and Python to interact with your data

In the previous section, you saw all the ways you can interact with your data in Power BI via R or Python scripts. Beyond knowing how and where to inject your code into Power BI, it is very important to know how your code will interact with that data. It’s here that we see a big difference between the effect of scripts injected via Power Query Editor and scripts used in visuals:

  • Scripts via Power Query Editor: This type of script will transform the data and persist transformations in the model. This means that it will always be possible to retrieve the transformed data from any object within Power BI. Also, once the scripts have been executed and have taken effect, they will not be re-executed unless the data is refreshed. Therefore, it is recommended to inject code in R or Python via Power Query Editor when you intend to use the resulting insights in other visuals, or in the data model.
  • Scripts in visuals: The scripts used within the R and Python script visuals extract particular insights from the data and only make them evident to the user through visualization. Like all the other visuals on a report page, the R and Python script visuals are also interconnected with the other visuals. This means that the script visuals are subject to cross-filtering and therefore, they are refreshed every time you interact with other visuals in the report. That said, it is not possible to persist the results obtained from the script visuals in the data model.
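The difference between the two behaviors can be sketched with a toy model in plain Python. Nothing here is a Power BI API: the ToyReport class, its methods, and the sample rows are hypothetical and only mimic when each kind of script runs and what happens to its result.

```python
class ToyReport:
    """Toy model of script execution timing; not a Power BI API."""

    def __init__(self, source_rows):
        self.source_rows = source_rows
        self.model_rows = []   # what the data model persists
        self.query_runs = 0    # executions of the Power Query script
        self.visual_runs = 0   # executions of the script visual

    def refresh(self):
        # Power Query script: runs on refresh, result stored in the model.
        self.query_runs += 1
        self.model_rows = [
            dict(row, doubled=row["value"] * 2) for row in self.source_rows
        ]

    def cross_filter(self, category):
        # Script visual: runs on every interaction, result only displayed.
        self.visual_runs += 1
        visible = [r for r in self.model_rows if r["category"] == category]
        return sum(r["doubled"] for r in visible)


report = ToyReport(
    [
        {"category": "a", "value": 1},
        {"category": "b", "value": 2},
        {"category": "a", "value": 3},
    ]
)
report.refresh()                    # the query script runs once
total_a = report.cross_filter("a")  # the visual script runs
total_b = report.cross_filter("b")  # and runs again on the next selection
```

In this model the doubled column survives in model_rows and is available to every later step, while total_a and total_b exist only as return values of the visual's script, which mirrors the persistence rule described above.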

TIP

Because cross-filtering makes R and Python script visuals interactive, you can inject code that extracts real-time insights from data. Keep in mind that, as previously stated, you can only visualize such information or, at most, write it to external repositories (as you will see in Chapter 8, Logging Data from Power BI to External Sources). Also, although a script visual can access resources on the internet while you develop in Power BI Desktop, it can no longer do so once the report is published to the Power BI service (you will see what this is about in the next section), for security reasons. This restriction doesn’t exist for scripts used in Power Query.

In the final section of this chapter, let’s look at the limitations of using R and Python when it comes to various Power BI products.

Python and R compatibility across Power BI products

The first question once you are clear on where to inject R and Python scripts in Power BI could be: “Is the use of R and Python code allowed in all Power BI products?” In order to cover that, let’s briefly recap the various Power BI products and their usage in general. Here is a concise list:

  • Power BI service: This is sometimes called Power BI Online, and it’s the Software as a Service (SaaS) version of Power BI. It was created to facilitate the sharing of visual analysis between users through dashboards and reports.
  • Power BI Report Server: This is the on-premises version of Power BI and it extends the capabilities of SQL Server Reporting Services, enabling the sharing of reports created in Power BI Desktop (for Report Server) and Power BI Report Builder for Power BI paginated reports.
  • Power BI Embedded: A Microsoft Azure service that allows dashboards and reports to be embedded in an application for users who do not have a Power BI account.
  • Power BI Desktop: A free desktop application for Windows that allows you to use almost all of the features that Power BI offers. It is not the right tool for sharing results between users, but it allows you to share them on the Power BI service and Power BI Report Server. The desktop versions that allow publishing on the two mentioned services are distinct and support slightly different sets of features. They are named Power BI Desktop and Power BI Desktop for Power BI Report Server, respectively.
  • Power BI mobile: A mobile application, available on Windows, Android, and iOS, that allows secure access to the Power BI service and Power BI Report Server, and that allows you to browse and share dashboards and reports, but not edit them.
  • Power BI Report Builder: A free desktop application for Windows that allows you to create paginated reports. These can then be published and shared in the Power BI service and Power BI Report Server.

Apart from the licenses, which we will not go into here, a summary figure of the relationships between the previously mentioned products follows:

Figure 1.15: Interactions between Power BI products

Unfortunately, of all these products, only the Power BI service, Power BI Embedded, and Power BI Desktop allow you to enrich data via code in R and Python:

Figure 1.16: Power BI products, compatibility with R and Python

IMPORTANT NOTE

From here on out, when we talk about the Power BI service in terms of compatibility with analytical languages, what we say will also apply to Power BI Embedded.

So, if you need to develop reports using advanced analytics through R and Python, make sure the target platform supports them.

Summary

This chapter has given a detailed overview of all the ways in which you can use R and Python scripts in Power BI Desktop. During the data ingestion and data transformation phases, Power Query Editor allows you to add steps containing R or Python code. You can also make use of these analytical languages during the data visualization phase thanks to the R and Python script visuals provided by Power BI Desktop.

It is also very important to know how the R and Python code will interact with the data already loaded or being loaded in Power BI. If you use Power Query Editor, both when loading and transforming data, the result of script processing will be persisted in the data model. Also, if you want to run the same scripts again, you have to refresh the data. On the other hand, if you use the R and Python script visuals, the code results can only be displayed and are not persisted in the data model. In this case, script execution occurs whenever cross-filtering is triggered via the other visuals in the report.

Unfortunately, at the time of writing, you cannot run R and Python scripts in every Power BI product. The only ones that provide for running analytics scripts are Power BI Desktop and the Power BI service.

In the next chapter, we will see how best to configure the R engine and RStudio to integrate with Power BI Desktop.

Test your knowledge

  1. At what stages of Power BI report development can scripts in Python or R be used?
  2. Is it possible to use a Python dictionary as a data source for a Power BI report?
  3. Is it possible to use an R list as a data source for a Power BI report?
  4. When you insert a Python or R script step immediately after other transformation steps that return a specific result set, what is the name of the variable you will need to use in your script to access the data obtained in the step just before?
  5. After adding a Python or R visual script to your canvas, what do you need to do to enable the script editor to take a script that generates a plot?
  6. How can you force the re-execution of Python or R scripts added via Power Query?
  7. How can you force the re-execution of Python or R scripts added in a script visual?
  8. In what case is it not possible to access the internet from a Python or R script in Power BI?
  9. In which Power BI products can Python or R scripts be used?

Learn more on Discord

To join the Discord community for this book, where you can share feedback, ask the author questions, and learn about new releases, use the link below:

https://discord.gg/MKww5g45EB


Key benefits

  • Discover best practices for using Python and R in Power BI by implementing non-trivial code
  • Enrich your Power BI dashboards using external APIs and machine learning models
  • Create any visualization, as complex as you want, using Python and R scripts

Description

The latest edition of this book delves deep into advanced analytics, focusing on enhancing Python and R proficiency within Power BI. New chapters cover optimizing Python and R settings, utilizing Intel's Math Kernel Library (MKL) for performance boosts, and addressing integration challenges. Techniques for managing large datasets beyond available RAM, employing the Parquet data format, and advanced fuzzy matching algorithms are explored. Additionally, it discusses leveraging SQL Server Language Extensions to overcome traditional Python and R limitations in Power BI. It also helps in crafting sophisticated visualizations using the Grammar of Graphics in both R and Python. This Power BI book will help you master data validation with regular expressions, import data from diverse sources, and apply advanced algorithms for transformation. You'll learn how to safeguard personal data in Power BI with techniques like pseudonymization, anonymization, and data masking. You'll also get to grips with the key statistical features of datasets by plotting multiple visual graphs in the process of building a machine learning model. The book will guide you on utilizing external APIs for enrichment, enhancing I/O performance, and leveraging Python and R for analysis. You'll reinforce your learning with questions at the end of each chapter.

What you will learn

  • Configure optimal integration of Python and R with Power BI
  • Perform complex data manipulations not possible by default in Power BI
  • Boost Power BI logging and loading large datasets
  • Extract insights from your data using algorithms like linear optimization
  • Calculate string distances and learn how to use them for probabilistic fuzzy matching
  • Handle outliers and missing values for multivariate and time-series data
  • Apply Exploratory Data Analysis in Power BI with R
  • Learn to use Grammar of Graphics in Python


Table of Contents

27 Chapters
Preface
Where and How to Use R and Python Scripts in Power BI
Configuring R with Power BI
Configuring Python with Power BI
Solving Common Issues When Using Python and R in Power BI
Importing Unhandled Data Objects
Using Regular Expressions in Power BI
Anonymizing and Pseudonymizing Your Data in Power BI
Logging Data from Power BI to External Sources
Loading Large Datasets Beyond the Available RAM in Power BI
Boosting Data Loading Speed in Power BI with Parquet Format
Calling External APIs to Enrich Your Data
Calculating Columns Using Complex Algorithms: Distances
Calculating Columns Using Complex Algorithms: Fuzzy Matching
Calculating Columns Using Complex Algorithms: Optimization Problems
Adding Statistical Insights: Associations
Adding Statistical Insights: Outliers and Missing Values
Using Machine Learning without Premium or Embedded Capacity
Using SQL Server External Languages for Advanced Analytics and ML Integration in Power BI
Exploratory Data Analysis
Using the Grammar of Graphics in Python with plotnine
Advanced Visualizations
Interactive R Custom Visuals
Other Books You May Enjoy
Index
Appendix 1: Answers
Appendix 2: Glossary


FAQs

What is included in a Packt subscription?

A subscription provides you with full access to view all Packt and licensed content online, including exclusive access to Early Access titles. Depending on the tier chosen, you can also earn credits and discounts to use for owning content.

How can I cancel my subscription?

To cancel your subscription, simply go to the account page, found in the top right of the page or at https://subscription.packtpub.com/my-account/subscription. From there, you will see the ‘cancel subscription’ button in the grey box containing your subscription information.

What are credits?

Credits can be earned from reading 40 sections of any title within the payment cycle (a month starting from the day of subscription payment). You also earn a credit every month if you subscribe to our annual or 18-month plans. Credits can be used to buy books DRM-free, the same way that you would pay for a book. Your credits can be found on the subscription homepage (subscription.packtpub.com) by clicking on the ‘my library’ dropdown and selecting ‘credits’.

What happens if an Early Access Course is cancelled?

Projects are rarely cancelled, but sometimes it's unavoidable. If an Early Access course is cancelled or excessively delayed, you can exchange your purchase for another course. For further details, please contact us.

Where can I send feedback about an Early Access title?

If you have any feedback about the product you're reading, or Early Access in general, then please fill out a contact form and we'll make sure the feedback gets to the right team.

Can I download the code files for Early Access titles?

We try to ensure that all books in Early Access have code available to use, download, and fork on GitHub. This helps us be more agile in the development of the book, and helps keep the often changing code base of new versions and new technologies as up to date as possible. Unfortunately, however, there will be rare cases when it is not possible for us to have downloadable code samples available until publication.

When we publish the book, the code files will also be available to download from the Packt website.

How accurate is the publication date?

The publication date is as accurate as we can be at any point in the project. Unfortunately, delays can happen. Often those delays are out of our control, such as changes to the technology code base or delays in the tech release. We do our best to give you an accurate estimate of the publication date at any given time, and as more chapters are delivered, the more accurate the delivery date will become.

How will I know when new chapters are ready?

We'll let you know every time there has been an update to a course that you've bought in Early Access. You'll get an email to let you know there has been a new chapter, or a change to a previous chapter. The new chapters are automatically added to your account, so you can also check back there any time you're ready and download or read them online.

I am a Packt subscriber, do I get Early Access?

Yes, all Early Access content is fully available through your subscription. You will need a paid-for or active trial subscription in order to access all titles.

How is Early Access delivered?

Early Access is currently only available as a PDF or through our online reader. As we make changes or add new chapters, the files in your Packt account will be updated so you can download them again or view them online immediately.

How do I buy Early Access content?

Early Access is a way of us getting our content to you quicker, but the method of buying the Early Access course is still the same. Just find the course you want to buy, go through the check-out steps, and you’ll get a confirmation email from us with information and a link to the relevant Early Access courses.

What is Early Access?

Keeping up to date with the latest technology is difficult, with new versions, new frameworks, and new techniques appearing all the time. This feature gives you a head start on our content as it's being created. With Early Access, you'll receive each chapter as it's written and get regular updates throughout the product's development, as well as the final course as soon as it's ready. We created Early Access as a means of giving you the information you need as soon as it's available. As we go through the process of developing a course, 99% of it can be ready, but we can't publish until that last 1% falls into place. Early Access helps to unlock the potential of our content early, to help you start your learning when you need it most. You not only get access to every chapter as it's delivered, edited, and updated, but you'll also get the finalized, DRM-free product to download in any format you want when it's published. As a member of Packt, you'll also be eligible for our exclusive offers, including a free course every day, and discounts on new and popular titles.