Reader small image

You're reading from  Extending Power BI with Python and R - Second Edition

Product typeBook
Published inMar 2024
Reading LevelIntermediate
PublisherPackt
ISBN-139781837639533
Edition2nd Edition
Languages
Right arrow
Author (1)
Luca Zavarella
Luca Zavarella
author image
Luca Zavarella

Luca Zavarella has a rich background as an Azure Data Scientist Associate and Microsoft MVP, with a Computer Engineering degree from the University of L'Aquila. His decade-plus experience spans the Microsoft Data Platform, starting as a T-SQL developer on SQL Server 2000 and 2005, then mastering the full suite of Microsoft Business Intelligence tools (SSIS, SSAS, SSRS), and advancing into data warehousing. Recently, his focus has shifted to advanced analytics, data science, and AI, contributing to the community as a speaker and blogger, especially on Medium. Currently, he leads the Data & AI division at iCubed, and he also holds an honors degree in classical piano from the "Alfredo Casella" Conservatory in L'Aquila.
Read more about Luca Zavarella

Right arrow

The causes of outliers

Before considering any action to be taken on the outliers of a variable, it is necessary to consider what may have caused them. Once the cause is identified, it may be possible to correct the outliers immediately. Here is a possible categorization of the causes of outliers:

  • Data entry errors: There may be an analyst collecting the data who made a mistake in compiling the data. For example, if the analyst is collecting the birth dates of a group of people, the analyst may write 177 instead of 1977. If the dates collected are in the 1900-2100 range, it is easy to correct the outlier created by the data entry error. Other times, it is not possible to recover the correct value.
  • Intentional outliers: Very often, the introduction of “errors” is intentional on the part of the individuals to whom the measurements apply. For example, adolescents typically do not accurately report the amount of alcohol they consume.
  • Data processing...
lock icon
The rest of the page is locked
Previous PageNext Page
You have been reading a chapter from
Extending Power BI with Python and R - Second Edition
Published in: Mar 2024Publisher: PacktISBN-13: 9781837639533

Author (1)

author image
Luca Zavarella

Luca Zavarella has a rich background as an Azure Data Scientist Associate and Microsoft MVP, with a Computer Engineering degree from the University of L'Aquila. His decade-plus experience spans the Microsoft Data Platform, starting as a T-SQL developer on SQL Server 2000 and 2005, then mastering the full suite of Microsoft Business Intelligence tools (SSIS, SSAS, SSRS), and advancing into data warehousing. Recently, his focus has shifted to advanced analytics, data science, and AI, contributing to the community as a speaker and blogger, especially on Medium. Currently, he leads the Data & AI division at iCubed, and he also holds an honors degree in classical piano from the "Alfredo Casella" Conservatory in L'Aquila.
Read more about Luca Zavarella