Reader small image

You're reading from  Data-Centric Machine Learning with Python

Product typeBook
Published inFeb 2024
PublisherPackt
ISBN-139781804618127
Edition1st Edition
Right arrow
Authors (3):
Jonas Christensen
Jonas Christensen
author image
Jonas Christensen

Jonas Christensen has spent his career leading data science functions across multiple industries. He is an international keynote speaker, postgraduate educator, and advisor in the fields of data science, analytics leadership, and machine learning and host of the Leaders of Analytics podcast.
Read more about Jonas Christensen

Nakul Bajaj
Nakul Bajaj
author image
Nakul Bajaj

Nakul Bajaj is a data scientist, MLOps engineer, educator and mentor, helping students and junior engineers navigate their data journey. He has a strong passion for MLOps, with a focus on reducing complexity and delivering value from machine learning use-cases in business and healthcare.
Read more about Nakul Bajaj

Manmohan Gosada
Manmohan Gosada
author image
Manmohan Gosada

Manmohan Gosada is a seasoned professional with a proven track record in the dynamic field of data science. With a comprehensive background spanning various data science functions and industries, Manmohan has emerged as a leader in driving innovation and delivering impactful solutions. He has successfully led large-scale data science projects, leveraging cutting-edge technologies to implement transformative products. With a postgraduate degree, he is not only well-versed in the theoretical foundations of data science but is also passionate about sharing insights and knowledge. A captivating speaker, he engages audiences with a blend of expertise and enthusiasm, demystifying complex concepts in the world of data science.
Read more about Manmohan Gosada

View More author details
Right arrow

Understanding data-centric ML

Data-centric ML is the discipline of systematically engineering the data used to build ML and artificial intelligence (AI) systems1.

The data-centric AI and ML movement is grounded in the philosophy that data quality is more important than data volume when it comes to building highly informative models. Put another way, it is possible to achieve more with a small but high-quality dataset than with a large but noisy dataset. For most ML use cases, it is not feasible to build models based on very large datasets, say millions of observations, simply because the volume of data doesn’t exist. In other words, the potential use of ML as a tool to solve certain problems is often ignored on the basis that the available dataset is too small.

But what if we can use ML to solve problems based on much smaller datasets, even down to less than 100 observations? This is one challenge the data-centric movement is attempting to solve through systematic data...

lock icon
The rest of the page is locked
Previous PageNext Page
You have been reading a chapter from
Data-Centric Machine Learning with Python
Published in: Feb 2024Publisher: PacktISBN-13: 9781804618127

Authors (3)

author image
Jonas Christensen

Jonas Christensen has spent his career leading data science functions across multiple industries. He is an international keynote speaker, postgraduate educator, and advisor in the fields of data science, analytics leadership, and machine learning and host of the Leaders of Analytics podcast.
Read more about Jonas Christensen

author image
Nakul Bajaj

Nakul Bajaj is a data scientist, MLOps engineer, educator and mentor, helping students and junior engineers navigate their data journey. He has a strong passion for MLOps, with a focus on reducing complexity and delivering value from machine learning use-cases in business and healthcare.
Read more about Nakul Bajaj

author image
Manmohan Gosada

Manmohan Gosada is a seasoned professional with a proven track record in the dynamic field of data science. With a comprehensive background spanning various data science functions and industries, Manmohan has emerged as a leader in driving innovation and delivering impactful solutions. He has successfully led large-scale data science projects, leveraging cutting-edge technologies to implement transformative products. With a postgraduate degree, he is not only well-versed in the theoretical foundations of data science but is also passionate about sharing insights and knowledge. A captivating speaker, he engages audiences with a blend of expertise and enthusiasm, demystifying complex concepts in the world of data science.
Read more about Manmohan Gosada