Data-Centric Machine Learning with Python

Product type Book
Published in Feb 2024
Publisher Packt
ISBN-13 9781804618127
Pages 378 pages
Edition 1st Edition
Authors (3): Jonas Christensen, Nakul Bajaj, Manmohan Gosada

Table of Contents (17 chapters)

Preface
Part 1: What Data-Centric Machine Learning Is and Why We Need It
  Chapter 1: Exploring Data-Centric Machine Learning
  Chapter 2: From Model-Centric to Data-Centric – ML’s Evolution
Part 2: The Building Blocks of Data-Centric ML
  Chapter 3: Principles of Data-Centric ML
  Chapter 4: Data Labeling Is a Collaborative Process
Part 3: Technical Approaches to Better Data
  Chapter 5: Techniques for Data Cleaning
  Chapter 6: Techniques for Programmatic Labeling in Machine Learning
  Chapter 7: Using Synthetic Data in Data-Centric Machine Learning
  Chapter 8: Techniques for Identifying and Removing Bias
  Chapter 9: Dealing with Edge Cases and Rare Events in Machine Learning
Part 4: Getting Started with Data-Centric ML
  Chapter 10: Kick-Starting Your Journey in Data-Centric Machine Learning
Index
Other Books You May Enjoy

Principle 1 – data should be the center of ML development

As we discussed in Chapter 2, From Model-Centric to Data-Centric – ML’s Evolution, the predominant model-centric approach is reaching its limits for several reasons: computing and storage have been commoditized, algorithms have become practically automated and highly data-dependent, models are accessible but less malleable, and deep learning and AutoML tools are available everywhere. But the data? Well, that’s still the wildcard.

Rather than relying on powerful computing and storage environments and on sophisticated algorithms that demand excessive amounts of data for an incremental uplift in model accuracy, a better approach is to be driven by data – specifically, by the data that is available and relevant to the problem at hand.

Data is unique to every company, problem, and situation, and the data-centric paradigm recognizes this by putting the spotlight and development efforts on the data before...
