Practical Guide to Applied Conformal Prediction in Python

Product type: Book
Published in: Dec 2023
Publisher: Packt
ISBN-13: 9781805122760
Edition: 1st Edition

Author

Valery Manokhin

Valeriy Manokhin is the leading expert in the field of machine learning and Conformal Prediction. He holds a Ph.D. in Machine Learning from Royal Holloway, University of London. His doctoral work was supervised by the creator of Conformal Prediction, Vladimir Vovk, and focused on developing new methods for quantifying uncertainty in machine learning models. Valeriy has published extensively in leading machine learning journals, and his Ph.D. dissertation, "Machine Learning for Probabilistic Prediction", is read by thousands of people across the world. He is also the creator of "Awesome Conformal Prediction", the most popular resource and GitHub repository for all things Conformal Prediction.

Preface

Embark on an insightful journey with “Practical Guide to Applied Conformal Prediction in Python,” your comprehensive guide to mastering uncertainty quantification in machine learning. This book unfolds the complexities of Conformal Prediction, focusing on practical applications that span classification, regression, forecasting, computer vision, and natural language processing. It also delves into sophisticated techniques for addressing imbalanced datasets and multi-class classification challenges, presenting case studies that bridge theory with real-world practice.

This resource is meticulously crafted for a diverse readership, including data scientists, machine learning engineers, industry professionals, researchers, academics, and students interested in mastering uncertainty quantification and conformal prediction within their respective fields.

Whether you’re starting your journey in data science or looking to deepen your existing expertise, this book provides the foundational knowledge and advanced strategies necessary to navigate uncertainty quantification in machine learning confidently.

With “Practical Guide to Applied Conformal Prediction in Python,” you gain more than knowledge; you gain the power to apply cutting-edge techniques to industry applications, enhancing the precision and reliability of your predictive models. Embrace this opportunity to elevate your career in machine learning by harnessing the potential of Conformal Prediction.

Who is this book for?

This publication is a must-read for those fascinated by Conformal Prediction, catering to a broad spectrum of professionals and learners. It is specifically designed for data scientists, machine learning engineers, educators and scholars, research professionals, application developers, students with a zest for data, analytical experts, and statisticians dedicated to expanding their knowledge.

What this book covers

Chapter 1, Introducing Conformal Prediction, serves as a fundamental introduction to the book's core theme: Conformal Prediction. It lays the foundation by explaining Conformal Prediction's purpose as a robust framework for quantifying prediction uncertainty and enhancing trust in machine learning models.

Within this chapter, we journey through the historical evolution and growing acclaim of this transformative framework. Key concepts and principles that underpin Conformal Prediction are explored, shedding light on its manifold advantages. The chapter underscores how Conformal Prediction stands apart from conventional machine learning techniques: it furnishes prediction regions and confidence measures backed by finite-sample validity guarantees, all without restrictive distributional assumptions.

Chapter 2, Overview of Conformal Prediction, takes a comprehensive look at Conformal Prediction, focusing on its pivotal role in quantifying prediction uncertainty.

This chapter commences by addressing the crucial need for quantifying uncertainty in predictions and introduces the concepts of aleatoric and epistemic uncertainty. It emphasizes the distinct advantages offered by Conformal Prediction in comparison to conventional statistical, Bayesian, and fuzzy logic methods. These advantages include the assurance of coverage, freedom from distributional constraints, and compatibility with a wide array of machine learning models.

A significant portion of the chapter is devoted to elucidating how Conformal Prediction operates in a classification context. It unveils the intricate process of using nonconformity scores to gauge the alignment between predictions and the training data distribution. These scores are then transformed into p-values and confidence levels, forming the foundation for constructing prediction sets.
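
To make this pipeline concrete, here is a minimal sketch of an inductive-style classification conformal predictor on synthetic data; the scikit-learn model, the hinge-style nonconformity score, and the 0.1 significance level are illustrative assumptions rather than the chapter's exact code.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Illustrative data, split into proper training, calibration, and test sets.
X, y = make_classification(n_samples=1000, n_classes=3, n_informative=6, random_state=0)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_calib, X_test, y_calib, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Nonconformity score: 1 minus the probability assigned to the true class.
calib_scores = 1 - model.predict_proba(X_calib)[np.arange(len(y_calib)), y_calib]

def p_value(x, label):
    # Nonconformity of the test object under the candidate label (classes are 0..K-1 here).
    test_score = 1 - model.predict_proba(x.reshape(1, -1))[0, label]
    # Fraction of calibration scores at least as nonconforming as the test score.
    return (np.sum(calib_scores >= test_score) + 1) / (len(calib_scores) + 1)

def prediction_set(x, significance=0.1):
    # Keep every candidate label whose p-value exceeds the significance level.
    return [label for label in range(len(model.classes_)) if p_value(x, label) > significance]

print(prediction_set(X_test[0]), "true label:", y_test[0])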

Chapter 2 provides readers with a deep understanding of Conformal Prediction’s principles and its profound significance in quantifying uncertainty. This knowledge proves particularly invaluable in critical applications where dependable confidence estimates must accompany predictions, enhancing the overall trustworthiness of the outcomes.

Chapter 3, Fundamentals of Conformal Prediction, dives into the fundamentals and mathematical foundations underlying Conformal Prediction. It explains the basic components like nonconformity measures, calibration sets, and the prediction process.

It covers different types of nonconformity measures for classification and regression, explaining their strengths and weaknesses. Popular choices like hinge loss, margin, and normalized error are discussed.

The chapter illustrates how to compute nonconformity scores, p-values, confidence levels, and credibility levels with examples. It also explains the role of calibration sets, online vs offline conformal prediction, and unconditional vs conditional coverage.
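
To give a flavour of these measures, the short sketch below defines hinge-style and margin-style nonconformity scores for classification, plus a normalized absolute error for regression; the array shapes and names are assumptions made purely for illustration.

import numpy as np

def hinge_nonconformity(probs, y):
    # Hinge: 1 minus the probability the model assigns to the true class.
    return 1 - probs[np.arange(len(y)), y]

def margin_nonconformity(probs, y):
    # Margin: best probability among the other classes minus the true-class probability.
    true_p = probs[np.arange(len(y)), y]
    others = probs.copy()
    others[np.arange(len(y)), y] = -np.inf
    return others.max(axis=1) - true_p

def normalized_error_nonconformity(y_true, y_pred, sigma):
    # Regression: absolute error scaled by a per-example difficulty estimate sigma.
    return np.abs(y_true - y_pred) / sigma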

Overall, Chapter 3 equips readers with a strong grasp of the core concepts and mathematical workings of Conformal Prediction. By mastering these foundations, practitioners can apply Conformal Prediction to enhance the reliability of predictions across various machine learning tasks.

Chapter 4, Validity and Efficiency of Conformal Prediction, extends the concepts introduced in the previous chapter and delves into the crucial notions of validity and efficiency. Through practical examples, readers will discover the importance of accurate (unbiased) prediction models.

This chapter will delve into the definitions, metrics, and real-world instances of valid and efficient models. We’ll also explore the inherent validity guarantees offered by conformal prediction. By the chapter’s conclusion, you’ll possess the knowledge needed to evaluate and enhance the validity and efficiency of your predictive models, opening doors to more dependable and impactful applications in your respective fields.

Chapter 5, Types of Conformal Predictors, explores the various types of conformal predictors and their unique attributes. Key topics encompass the foundational principles of conformal prediction and its relevance in machine learning. The chapter explains both classical transductive and inductive conformal predictors, guiding readers in selecting the type of conformal predictor best tailored to their specific problem. Practical use cases of conformal predictors in binary classification, multiclass classification, and regression are also presented.

Chapter 6, Conformal Prediction for Classification, explores the different types of conformal predictors for quantifying uncertainty in machine learning predictions. It covers the foundations of classical Transductive Conformal Prediction (TCP) and the more computationally efficient Inductive Conformal Prediction (ICP). TCP leverages the full dataset for training but requires model retraining for each new prediction, whereas ICP splits the data into training and calibration sets, achieving a computational speedup by training only once. Tradeoffs between the two variants are discussed.

The chapter provides algorithmic descriptions for applying TCP and ICP to classification and regression problems. It steps through calculating nonconformity scores, p-values, and prediction regions in detail using code examples.
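
As a small preview of that workflow, here is a hedged sketch using the nonconformist library's inductive wrapper (the same wrapper mentioned later in the Conventions used section); the base model, the split proportions, and the significance level are illustrative, and the exact API may vary between library versions.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from nonconformist.cp import IcpClassifier
from nonconformist.nc import NcFactory

# Illustrative data, split into proper training, calibration, and test sets.
X, y = make_classification(n_samples=1000, n_classes=3, n_informative=6, random_state=0)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_calib, X_test, y_calib, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# Wrap an underlying scikit-learn model in an inductive conformal classifier.
nc = NcFactory.create_nc(RandomForestClassifier(n_estimators=100, random_state=0))
icp = IcpClassifier(nc)
icp.fit(X_train, y_train)                  # the underlying model is trained only once
icp.calibrate(X_calib, y_calib)            # nonconformity scores on the calibration set
prediction_sets = icp.predict(X_test, significance=0.1)  # one Boolean column per class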

Guidelines are given on choosing the right conformal predictor based on factors like data size, real-time requirements, and computational constraints. Example use cases illustrate when TCP or ICP would be preferable.

We also introduce specialized techniques within conformal prediction called Venn-ABERS predictors.

Overall, the chapter offers readers a solid grasp of the different types of conformal predictors available and how to select the optimal approach based on the problem context.

Chapter 7, Conformal Prediction for Regression, provides a comprehensive guide to uncertainty quantification for regression problems using Conformal Prediction. It covers the need for uncertainty quantification, techniques for generating prediction intervals, Conformal Prediction frameworks tailored for regression, and advanced methods such as Conformalized Quantile Regression, Jackknife+, and Conformal Predictive Distributions. Readers will learn the theory and practical application of Conformal Prediction for producing valid, calibrated prediction intervals and distributions. The chapter includes detailed explanations and code illustrations using real-world housing price data and Python libraries to give hands-on experience applying these methods. Overall, readers will gain the knowledge to reliably quantify uncertainty and construct well-calibrated prediction regions for regression problems.
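
Before getting to the chapter's own examples, the following minimal sketch illustrates the split-conformal idea behind such intervals using absolute residuals on a calibration set; the synthetic data, the gradient boosting regressor, and the 10% miscoverage level are assumptions for this illustration only.

import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Illustrative data, split into proper training, calibration, and test sets.
X, y = make_regression(n_samples=1000, noise=10.0, random_state=0)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_calib, X_test, y_calib, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)

# Nonconformity scores: absolute residuals on the calibration set.
scores = np.abs(y_calib - model.predict(X_calib))

# Conformal quantile of the calibration scores for miscoverage level alpha.
alpha = 0.1
n = len(scores)
rank = min(int(np.ceil((n + 1) * (1 - alpha))), n)
q_hat = np.sort(scores)[rank - 1]

# Symmetric prediction intervals around the point predictions.
y_pred = model.predict(X_test)
lower, upper = y_pred - q_hat, y_pred + q_hat
print("empirical coverage:", np.mean((y_test >= lower) & (y_test <= upper)))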

Chapter 8, Conformal Prediction for Time Series and Forecasting, is dedicated to the application of Conformal Prediction in time series forecasting.

The chapter begins with an exploration of the significance of uncertainty quantification in forecasting, emphasizing the concept of prediction intervals. It covers diverse approaches for generating prediction intervals, encompassing parametric methods, non-parametric techniques such as bootstrapping, Bayesian approaches, and Conformal Prediction.

Practical implementations of Conformal Prediction for time series are showcased using libraries such as Amazon Fortuna (EnbPI method), Nixtla (statsforecast package), and NeuralProphet. Code examples are provided, illustrating the generation of prediction intervals and the evaluation of validity.

In essence, Chapter 8 equips readers with practical tools and techniques to leverage Conformal Prediction for creating reliable and well-calibrated prediction intervals in time series forecasting models. By incorporating these methods, forecasters can effectively quantify uncertainty and bolster the robustness of their forecasts.

Chapter 9, Conformal Prediction for Computer Vision, takes us on a journey through the application of Conformal Prediction in the realm of computer vision.

The chapter commences by underscoring the paramount significance of uncertainty quantification in vision tasks, particularly in domains with safety-critical implications like medical imaging and autonomous driving. It addresses a common challenge in modern deep learning—overconfident and miscalibrated predictions.

Diverse uncertainty quantification methods are explored before highlighting the unique advantages of Conformal Prediction, including its distribution-free guarantees.

Practical applications of Conformal Prediction in image classification are vividly demonstrated, with a focus on the RAPS algorithm, renowned for generating compact and stable prediction sets. The chapter provides code examples illustrating the construction of classifiers with well-calibrated prediction sets on ImageNet data, employing various Conformal Prediction approaches.
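
For orientation, here is a minimal adaptive-prediction-sets sketch that works directly from softmax probabilities; it follows the APS idea that RAPS extends with a regularization term, and the randomly generated probabilities, labels, and coverage level stand in for a real image classifier purely for illustration.

import numpy as np

rng = np.random.default_rng(0)
alpha = 0.1                                      # target miscoverage level

def fake_softmax(n, k=10):
    # Stand-in for softmax outputs of an image classifier with k classes.
    logits = rng.normal(size=(n, k))
    return np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

calib_probs, calib_labels = fake_softmax(500), rng.integers(0, 10, 500)
test_probs = fake_softmax(3)

# APS-style calibration score: probability mass needed to reach the true class.
order = np.argsort(-calib_probs, axis=1)
cum = np.take_along_axis(calib_probs, order, axis=1).cumsum(axis=1)
true_rank = np.where(order == calib_labels[:, None])[1]
scores = cum[np.arange(len(calib_labels)), true_rank]

n = len(scores)
q_hat = np.sort(scores)[min(int(np.ceil((n + 1) * (1 - alpha))), n) - 1]

def prediction_set(probs):
    # Include classes in decreasing probability until their mass reaches q_hat.
    idx = np.argsort(-probs)
    csum = probs[idx].cumsum()
    cutoff = np.searchsorted(csum, q_hat) + 1
    return idx[:cutoff]

for p in test_probs:
    print(prediction_set(p))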

In essence, Chapter 9 equips readers with an understanding of the value of uncertainty quantification in computer vision systems. It offers hands-on experience in harnessing Conformal Prediction to craft dependable image classifiers complete with valid confidence estimates.

Chapter 10, Conformal Prediction for Natural Language Processing, ventures into the realm of uncertainty quantification in Natural Language Processing (NLP), leveraging the power of Conformal Prediction.

The chapter commences by delving into the inherent ambiguity that characterizes language and the repercussions of miscalibrated predictions stemming from intricate deep learning models.

Various approaches to uncertainty quantification, such as Bayesian methods, bootstrapping, and out-of-distribution detection, are thoughtfully compared. The mechanics of applying conformal prediction to NLP are demystified, encompassing the computation of nonconformity scores and p-values.

The advantages of adopting conformal prediction for NLP are eloquently outlined, including distribution-free guarantees, interpretability, and adaptivity. The chapter also delves into contemporary research, highlighting how conformal prediction enhances reliability, safety, and trust in large language models.

Chapter 11, Handling Imbalanced Data, explores solutions to the common machine learning challenge of imbalanced data, where one class heavily outweighs the others. It explains why this skewed distribution poses complex problems for predictive modeling.

The chapter compares various traditional approaches like oversampling and SMOTE, noting their pitfalls regarding poor model calibration. It then introduces Conformal Prediction as an innovative method to handle imbalanced data without compromising reliability.

Through code examples on a real-world credit card fraud detection dataset, the chapter demonstrates applying conformal prediction for probability calibration even with highly skewed data. Readers will learn best practices for tackling imbalance issues while ensuring decision-ready probabilistic forecasting.
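
As a small companion to this discussion, the sketch below shows one way to inspect probability calibration on a heavily skewed binary problem using scikit-learn's reliability-curve utility; it is a generic diagnostic on synthetic data, not the chapter's credit card fraud example or its conformal calibration method.

from sklearn.calibration import calibration_curve
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic, heavily imbalanced binary data (roughly 2% positives).
X, y = make_classification(n_samples=20000, weights=[0.98, 0.02], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
probs = model.predict_proba(X_test)[:, 1]

# Reliability curve: mean predicted probability versus observed positive fraction per bin.
frac_pos, mean_pred = calibration_curve(y_test, probs, n_bins=10, strategy="quantile")
for mp, fp in zip(mean_pred, frac_pos):
    print(f"predicted {mp:.3f}  observed {fp:.3f}")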

Chapter 12, Multi-Class Conformal Prediction, the final chapter, explores multi-class classification and how conformal prediction can be applied to problems with more than two outcome categories. It covers evaluation metrics such as precision, recall, F1 score, log loss, and the Brier score for assessing model performance.

The chapter explains techniques to extend binary classification algorithms like support vector machines or neural networks to multi-class contexts using one-vs-all and one-vs-one strategies.
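
For readers new to these strategies, here is a brief scikit-learn sketch of both wrappers on synthetic data; the linear SVM base estimator and the dataset are arbitrary choices made for illustration.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import LinearSVC

# Synthetic three-class problem.
X, y = make_classification(n_samples=1000, n_classes=3, n_informative=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# One-vs-rest fits one binary classifier per class; one-vs-one fits one per pair of classes.
ovr = OneVsRestClassifier(LinearSVC()).fit(X_train, y_train)
ovo = OneVsOneClassifier(LinearSVC()).fit(X_train, y_train)

print("one-vs-rest accuracy:", ovr.score(X_test, y_test))
print("one-vs-one accuracy:", ovo.score(X_test, y_test))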

The chapter then demonstrates how conformal prediction can provide prediction sets or intervals for each class with validity guarantees. Advanced methods such as Venn-ABERS predictors for multi-class probability estimation are also introduced.

Through code examples, the chapter shows how to implement inductive conformal prediction on multi-class problems, outputting predictions with credibility and confidence scores. Readers will learn best practices for applying conformal prediction to classification tasks with multiple potential classes.
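
To make these two scores concrete, the small sketch below computes credibility and confidence from a matrix of per-class conformal p-values; the tiny p-value matrix is invented for illustration and would normally come from an inductive conformal predictor such as the one sketched earlier.

import numpy as np

def credibility_and_confidence(p_values):
    # p_values: array of shape (n_test, n_classes), one conformal p-value per class.
    sorted_p = np.sort(p_values, axis=1)
    credibility = sorted_p[:, -1]          # largest p-value: how plausible the best label is
    confidence = 1 - sorted_p[:, -2]       # 1 minus the second-largest p-value
    point_prediction = np.argmax(p_values, axis=1)
    return point_prediction, credibility, confidence

# Tiny illustrative p-value matrix for three test objects and three classes.
p = np.array([[0.70, 0.05, 0.02],
              [0.20, 0.18, 0.03],
              [0.40, 0.02, 0.35]])
print(credibility_and_confidence(p))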

To get the most out of this book

You will need a working Python environment on your computer. We recommend using Python 3.6 or later.

Ensure that you have essential libraries, such as scikit-learn, NumPy, and Matplotlib, installed. If not, you can easily install them using Conda or pip.

The notebooks can be run either locally or using Google Colab (https://colab.research.google.com).

Software/hardware covered in the book        Operating system requirements

Python                                       Windows, macOS, or Linux
Colab (to run notebooks in Google Cloud)     Windows, macOS, or Linux
MAPIE                                        Windows, macOS, or Linux
Amazon Fortuna                               Windows, macOS, or Linux
Nixtla statsforecast                         Windows, macOS, or Linux
NeuralProphet                                Windows, macOS, or Linux

If you are using the digital version of this book, we advise you to access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

Download the example code files

You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Practical-Guide-to-Applied-Conformal-Prediction. If there’s an update to the code, it will be updated in the GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “First, we must create ICP classifiers by using a wrapper from nonconformist.”

A block of code is set as follows:

y_pred_calib = model.predict(X_calib)
y_pred_score_calib = model.predict_proba(X_calib)
y_pred_test = model.predict(X_test)
y_pred_score_test = model.predict_proba(X_test)

Tips or important notes

Appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, email us at customercare@packtpub.com and mention the book title in the subject of your message.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packt.com with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Share Your Thoughts

Once you’ve read Practical Guide to Applied Conformal Prediction in Python, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.

Download a free PDF copy of this book

Thanks for purchasing this book!

Do you like to read on the go but are unable to carry your print books everywhere? Is your eBook purchase not compatible with the device of your choice?

Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.

Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application.

The perks don’t stop there: you can get exclusive access to discounts, newsletters, and great free content in your inbox daily.

Follow these simple steps to get the benefits:

  1. Scan the QR code or visit the link below:

https://packt.link/free-ebook/9781805122760

  2. Submit your proof of purchase
  3. That’s it! We’ll send your free PDF and other benefits to your email directly