You're reading from Active Machine Learning with Python

Product type: Book
Published in: Mar 2024
Publisher: Packt
ISBN-13: 9781835464946
Edition: 1st Edition
Author: Margaux Masson-Forsythe

Margaux Masson-Forsythe is a skilled machine learning engineer and advocate for advancements in surgical data science and climate AI. As the Director of Machine Learning at Surgical Data Science Collective, she builds computer vision models to detect surgical tools in videos and track procedural motions. Masson-Forsythe manages a multidisciplinary team and oversees model implementation, data pipelines, infrastructure, and product delivery. With a background in computer science and expertise in machine learning, computer vision, and geospatial analytics, she has worked on projects related to reforestation, deforestation monitoring, and crop yield prediction.

Preface

Welcome to Active Machine Learning with Python, a comprehensive guide designed to introduce you to the power of active machine learning. This book is written with the conviction that while data is plentiful, its quality and relevance hold the key to building models that are not only efficient but also robust and insightful.

Active machine learning is an approach in which the learning algorithm can query an oracle, typically a human annotator, to label new data points with the desired outputs. It stands at the crossroads of optimization and human-computer interaction, enabling machines to learn more effectively with less data. This is particularly valuable in scenarios where data labeling is costly, time-consuming, or requires expert knowledge.
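
To make the querying idea concrete, here is a minimal sketch of a pool-based loop on a toy one-dimensional problem. It is not taken from the book: the `oracle` function, the hidden threshold of 0.62, and the midpoint classifier are all hypothetical, chosen only so the example runs with the standard library. At each step the learner asks the oracle to label the pool point closest to its current decision boundary, which is the point it is least certain about.

```python
import random


def oracle(x, true_threshold=0.62):
    """Hypothetical labeler: returns 1 if x is at or above a hidden threshold."""
    return int(x >= true_threshold)


def fit_threshold(labeled):
    """Place the boundary midway between the largest 0 and the smallest 1."""
    zeros = [x for x, y in labeled if y == 0]
    ones = [x for x, y in labeled if y == 1]
    lo = max(zeros) if zeros else 0.0
    hi = min(ones) if ones else 1.0
    return (lo + hi) / 2


def active_learning(pool, n_queries):
    """Query the oracle n_queries times, always for the least certain point."""
    pool = list(pool)
    labeled = []
    threshold = 0.5  # initial guess before any label is available
    for _ in range(n_queries):
        if not pool:
            break
        # Uncertainty sampling: the point nearest the boundary is least certain.
        query = min(pool, key=lambda x: abs(x - threshold))
        pool.remove(query)
        labeled.append((query, oracle(query)))
        threshold = fit_threshold(labeled)
    return threshold, labeled


rng = random.Random(0)
pool = [rng.random() for _ in range(1000)]
threshold, labeled = active_learning(pool, n_queries=15)
print(f"Estimated threshold after {len(labeled)} labels: {threshold:.3f}")
```

Only 15 of the 1,000 pool points are labeled here, yet the estimate lands close to the hidden threshold, because each query is chosen where the current model is least sure.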

Throughout this book, we leverage Python, a leading programming language in the field of data science and machine learning, known for its simplicity and powerful libraries. Python serves as an excellent medium for exploring the concepts of active machine learning, providing both beginners and experienced practitioners with the tools needed to implement sophisticated models.

Who this book is for

This book is intended for data scientists, machine learning engineers, researchers, and anyone curious about optimizing machine learning workflows. Whether you are new to active machine learning or looking to enhance your current models, this book provides insights into making the most of your data through strategic querying and learning techniques.

What this book covers

Chapter 1, Introducing Active Machine Learning, explores the fundamental principles of active machine learning, a highly effective approach that significantly differs from passive methods. This chapter also offers insights into its distinctive strategies and advantages.

Chapter 2, Designing Query Strategy Frameworks, presents a comprehensive exploration of the most effective and widely utilized query strategy frameworks in active machine learning and covers uncertainty sampling, query-by-committee, expected model change, expected error reduction, and density-weighted methods.
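
As a rough illustration of uncertainty sampling, the first strategy listed above, the sketch below computes three common uncertainty scores from class probabilities. The sample names and probability values are made up for the example, and the functions are a plain-Python sketch rather than the book's implementation.

```python
import math


def least_confidence(probs):
    """Higher means more uncertain: 1 minus the top class probability."""
    return 1.0 - max(probs)


def margin(probs):
    """Lower means more uncertain: gap between the two most likely classes."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def entropy(probs):
    """Higher means more uncertain: Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical predicted class probabilities for three unlabeled samples.
predictions = {
    "sample_a": [0.90, 0.05, 0.05],
    "sample_b": [0.40, 0.30, 0.30],
    "sample_c": [0.50, 0.49, 0.01],
}

pick_least_confidence = max(predictions, key=lambda k: least_confidence(predictions[k]))
pick_margin = min(predictions, key=lambda k: margin(predictions[k]))
pick_entropy = max(predictions, key=lambda k: entropy(predictions[k]))

print("Least confidence picks:", pick_least_confidence)
print("Smallest margin picks:", pick_margin)
print("Highest entropy picks:", pick_entropy)
```

On these values, least confidence and entropy both select `sample_b`, while margin sampling selects `sample_c`, which shows that the choice of score can change which sample is sent for labeling.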

Chapter 3, Managing the Human in the Loop, discusses the best practices and techniques for the design of interactive active machine learning systems, with an emphasis on optimizing human-in-the-loop labeling. Aspects such as labeling interface design, the crafting of effective workflows, strategies for resolving model-label disagreements, the selection of suitable labelers, and their efficient management are covered.

Chapter 4, Applying Active Learning to Computer Vision, covers various techniques for harnessing the power of active machine learning to enhance computer vision model performance in tasks such as image classification, object detection, and semantic segmentation, also addressing the challenges in their application.

Chapter 5, Leveraging Active Learning for Big Data, explores active machine learning techniques for managing big data such as videos. It acknowledges the challenges of developing video analysis models, which stem from the large size of videos and the frequent duplication of data across frames at typical frames-per-second rates, and demonstrates an active machine learning method for selecting the most informative frames for labeling.

Chapter 6, Evaluating and Enhancing Efficiency, details the evaluation of active machine learning systems, encompassing metrics, automation, efficient labeling, testing, monitoring, and stopping criteria, aiming for accurate evaluations and insights into system efficiency, guiding informed improvements in the field.

Chapter 7, Utilizing Tools and Packages for Active ML, discusses the Python libraries, frameworks, and tools commonly used for active learning, highlighting their value in implementing various active learning techniques and offering an overview suitable for both beginners and experienced programmers.

To get the most out of this book

You should possess proficiency in Python coding and familiarity with Google Colab, alongside a foundational understanding of machine learning and deep learning principles. You also need to be familiar with machine learning frameworks such as PyTorch.

This book is for individuals who possess a fundamental understanding of machine learning and deep learning and who aim to acquire knowledge about active learning in order to optimize the annotation process of their machine learning datasets. This optimization will enable them to train the most effective models possible.

Software covered in the book

Python packages: scikit-learn, matplotlib, numpy, datasets, transformers, huggingface_hub, torch, pandas, torchvision, roboflow, tqdm, glob, pyyaml, opencv-python, ultralytics, lightly, docker, encord, clearml, pymongo, and modAL-python

Jupyter or Google Colab notebook (with Python version 3.10.12 and above)

You will need to create accounts for several tools: Encord, Roboflow, and Lightly. You will also need access to an AWS EC2 instance for Chapter 6, Evaluating and Enhancing Efficiency.
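
Before starting, it may help to check which of the listed packages are already available in your notebook environment. The following is a small sketch using only the standard library. The import names in `IMPORT_NAMES` are assumptions, since several differ from the pip package names (for example, scikit-learn is imported as `sklearn`), and `check_environment` is a hypothetical helper, not part of the book's code.

```python
import sys
from importlib.util import find_spec

# Assumed mapping from the package names listed above to their import names.
IMPORT_NAMES = {
    "scikit-learn": "sklearn",
    "matplotlib": "matplotlib",
    "numpy": "numpy",
    "datasets": "datasets",
    "transformers": "transformers",
    "huggingface_hub": "huggingface_hub",
    "torch": "torch",
    "pandas": "pandas",
    "torchvision": "torchvision",
    "roboflow": "roboflow",
    "tqdm": "tqdm",
    "glob": "glob",  # part of the Python standard library
    "pyyaml": "yaml",
    "opencv-python": "cv2",
    "ultralytics": "ultralytics",
    "lightly": "lightly",
    "docker": "docker",
    "encord": "encord",
    "clearml": "clearml",
    "pymongo": "pymongo",
    "modAL-python": "modAL",
}


def check_environment(import_names=IMPORT_NAMES):
    """Return a dict mapping each package name to True if it can be imported."""
    return {
        package: find_spec(module) is not None
        for package, module in import_names.items()
    }


status = check_environment()
missing = sorted(package for package, ok in status.items() if not ok)
print("Python 3.10.12 or above:", sys.version_info >= (3, 10, 12))
print("Missing packages:", ", ".join(missing) or "none")
```

Any package reported as missing can then be installed in the notebook before running the corresponding chapter.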

If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

Download the example code files

You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Active-Machine-Learning-with-Python. If there’s an update to the code, it will be updated in the GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “We define x_true and y_true.”

A block of code is set as follows:

y_true = np.array(small_dataset['label'])
x_true = np.array(small_dataset['text'])

Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: “Anomaly detection is another domain where active learning proves to be highly effective.”

Tips or important notes

Appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, email us at customercare@packtpub.com and mention the book title in the subject of your message.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packt.com with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Share Your Thoughts

Once you’ve read Active Machine Learning with Python, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.

Download a free PDF copy of this book

Thanks for purchasing this book!

Do you like to read on the go but are unable to carry your print books everywhere?

Is your eBook purchase not compatible with the device of your choice?

Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.

Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application.

The perks don’t stop there. You can also get exclusive access to discounts, newsletters, and great free content in your inbox daily.

Follow these simple steps to get the benefits:

  1. Scan the QR code or visit the link below:

https://packt.link/free-ebook/9781835464946

  2. Submit your proof of purchase
  3. That’s it! We’ll send your free PDF and other benefits to your email directly