Introduction to Cross-Validation
Cross-validation is a cornerstone technique for assessing how an ML model will perform on unseen data. Instead of relying on a single train-test split, we divide the dataset into multiple subsets, training and validating the model several times to get a better estimate of its generalization ability. In this recipe, we’ll explore different types of cross-validation, including k-fold and stratified k-fold cross-validation, and walk through how to implement them using scikit-learn.
k-fold cross-validation
The k in k-fold refers to the number of folds, or subsets, into which we split the dataset. The term k is used much as it is in k-means clustering, which we saw in the previous chapter.
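As a minimal sketch of the idea, the example below uses scikit-learn's KFold to split a tiny array into k = 5 folds; the toy data and the choice of k are illustrative, not part of the recipe itself:

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(20).reshape(10, 2)  # 10 toy samples, 2 features

kf = KFold(n_splits=5)  # k = 5 folds

# Each iteration holds out a different fold (2 of the 10 samples)
# for validation and trains on the remaining 8.
for fold, (train_idx, val_idx) in enumerate(kf.split(X)):
    print(f"Fold {fold}: train={train_idx}, validation={val_idx}")
```

Every sample appears in a validation fold exactly once, so with k = 5 the model is trained and evaluated five times.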
Getting ready
We begin by loading the libraries and dataset we'll use to demonstrate cross-validation strategies. We'll use a classification dataset generated by make_classification.
Load the libraries:
import numpy as np
from sklearn.datasets import make_classification
from...
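With the imports in place, a dataset can be generated along these lines; the specific arguments below (sample count, feature counts, random seed) are assumptions for illustration and may differ from the recipe's actual call:

```python
import numpy as np
from sklearn.datasets import make_classification

# Hypothetical parameters chosen for illustration only.
X, y = make_classification(
    n_samples=1000,     # number of samples
    n_features=20,      # total features
    n_informative=10,   # features that actually carry signal
    n_classes=2,        # binary classification target
    random_state=42,    # reproducible data
)

print(X.shape, y.shape)  # feature matrix and label vector
```

The resulting X is a (1000, 20) feature matrix and y a vector of 0/1 labels, which is what the cross-validation splitters below will partition.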