Extending Power BI with Python and R - Second Edition


Product type: Book
Published: Mar 2024
Publisher: Packt
ISBN-13: 9781837639533
Pages: 814
Edition: 2nd Edition
Author: Luca Zavarella

Table of Contents (27 chapters)

Preface
Where and How to Use R and Python Scripts in Power BI
Configuring R with Power BI
Configuring Python with Power BI
Solving Common Issues When Using Python and R in Power BI
Importing Unhandled Data Objects
Using Regular Expressions in Power BI
Anonymizing and Pseudonymizing Your Data in Power BI
Logging Data from Power BI to External Sources
Loading Large Datasets Beyond the Available RAM in Power BI
Boosting Data Loading Speed in Power BI with Parquet Format
Calling External APIs to Enrich Your Data
Calculating Columns Using Complex Algorithms: Distances
Calculating Columns Using Complex Algorithms: Fuzzy Matching
Calculating Columns Using Complex Algorithms: Optimization Problems
Adding Statistical Insights: Associations
Adding Statistical Insights: Outliers and Missing Values
Using Machine Learning without Premium or Embedded Capacity
Using SQL Server External Languages for Advanced Analytics and ML Integration in Power BI
Exploratory Data Analysis
Using the Grammar of Graphics in Python with plotnine
Advanced Visualizations
Interactive R Custom Visuals
Other Books You May Enjoy
Index
Appendix 1: Answers
Appendix 2: Glossary

Implementing outlier detection algorithms

The first thing you’ll do is implement what you’ve just learned in Python.

Implementing outlier detection in Python

In this section, we will use the Wine Quality dataset created by Paulo Cortez et al. (https://archive.ics.uci.edu/ml/datasets/wine+quality) to show how to detect outliers in Python. Each observation in the dataset is a different red wine, described by the physicochemical properties measured by the variables. The exception is the quality variable, which grades the product on a discrete scale from 0 to 10.

You’ll find the code used in this section in the Python\01-detect-outliers-in-python.py file in the Chapter 16 folder.

After loading the data from the winequality-red.csv file directly from the web into the df variable, start by examining the sulphates variable and checking whether it contains any outliers...
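This excerpt does not show which detection rule the chapter applies to sulphates. As a minimal sketch only, assuming a univariate check based on Tukey's fences (the interquartile range rule) is acceptable, a single numeric variable could be screened as follows. The function name iqr_outliers, the multiplier k, and the sample values are hypothetical and are not taken from the book's code or from the dataset.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying outside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper for illustration; k=1.5 is the conventional default.
    """
    # method="inclusive" interpolates linearly between the observed data points
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # Preserve the original order of the flagged values
    return [v for v in values if v < lower or v > upper]


# Made-up sample values, NOT taken from the Wine Quality dataset
sample = [0.56, 0.68, 0.65, 0.58, 0.62, 0.60, 0.57, 2.00]
outliers = iqr_outliers(sample)
print(outliers)
```

If df is a pandas DataFrame loaded as described above, the same helper could presumably be applied with iqr_outliers(df["sulphates"].tolist()). A rule like this flags only values that are extreme on one variable at a time, so it is a first screening step rather than a complete outlier analysis.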
