You're reading from Machine Learning with scikit-learn Quick Start Guide

Product type: Book
Published in: Oct 2018
Reading Level: Intermediate
Publisher: Packt
ISBN-13: 9781789343700
Edition: 1st Edition
Author (1)
Kevin Jolly

Kevin Jolly is a formally educated data scientist with a master's degree in data science from the prestigious King's College London. Kevin works as a statistical analyst with a digital healthcare start-up, Connido Limited, in London, where he is primarily involved in leading the data science projects that the company undertakes. He has built machine learning pipelines for small and big data, with a focus on scaling such pipelines into production for the products that the company has built. Kevin is also the author of a book titled Hands-On Data Visualization with Bokeh, published by Packt. He is the editor-in-chief of Linear, a weekly online publication on data science software and products.

Preface

The fundamental aim of this book is to help its readers quickly deploy, optimize, and evaluate every kind of machine learning algorithm that scikit-learn provides, in an agile manner.

Readers will learn how to deploy supervised machine learning algorithms, such as logistic regression, k-nearest neighbors, linear regression, Support Vector Machines, Naive Bayes, and tree-based algorithms, in order to solve classification and regression machine learning problems.

Readers will also learn how to deploy unsupervised machine learning algorithms such as the k-means algorithm in order to cluster unlabeled data into groups.

Finally, readers will be provided with different techniques to visually interpret and evaluate the performance of the algorithms that they build.

Who this book is for

This book is for data scientists, software engineers, and people interested in machine learning with a background in Python who would like to understand, implement, and evaluate a wide range of machine learning algorithms using the scikit-learn framework.

What this book covers

Chapter 1, Introducing Machine Learning with scikit-learn, is a brief introduction to the different types of machine learning and its applications.

Chapter 2, Predicting Categories with K-Nearest Neighbors, covers working with and implementing the k-nearest neighbors algorithm to solve classification problems in scikit-learn.

Chapter 3, Predicting Categories with Logistic Regression, explains the workings and implementation of the logistic regression algorithm when solving classification problems in scikit-learn.

Chapter 4, Predicting Categories with Naive Bayes and SVMs, explains the workings and implementation of the Naive Bayes and the Linear Support Vector Machines algorithms when solving classification problems in scikit-learn.

Chapter 5, Predicting Numeric Outcomes with Linear Regression, explains the workings and implementation of the linear regression algorithm when solving regression problems in scikit-learn.

Chapter 6, Classification and Regression with Trees, explains the workings and implementation of tree-based algorithms such as decision trees, random forests, and the boosting and ensemble algorithms when solving classification and regression problems in scikit-learn.

Chapter 7, Clustering Data with Unsupervised Machine Learning, explains the workings and implementation of the k-means algorithm when solving unsupervised problems in scikit-learn.

Chapter 8, Performance Evaluation Methods, contains visual performance evaluation techniques for supervised and unsupervised machine learning algorithms.

To get the most out of this book

To get the most out of this book:

  • A basic working knowledge of Python is assumed.
  • Jupyter Notebook is the preferred development environment, but it is not required.

Download the example code files

You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support and register to have the files emailed directly to you.

You can download the code files by following these steps:

  1. Log in or register at www.packt.com.
  2. Select the SUPPORT tab.
  3. Click on Code Downloads and Errata.
  4. Enter the name of the book in the Search box and follow the onscreen instructions.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

  • WinRAR/7-Zip for Windows
  • Zipeg/iZip/UnRarX for Mac
  • 7-Zip/PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Machine-Learning-with-scikit-learn-Quick-Start-Guide. If the code is updated, the changes will be made in the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Conventions used

There are a number of text conventions used throughout this book.

CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "Mount the downloaded WebStorm-10*.dmg disk image file as another disk in your system."

A block of code is set as follows:

from sklearn.naive_bayes import GaussianNB

# Initializing an NB classifier
nb_classifier = GaussianNB()

# Fitting the classifier to the training data
nb_classifier.fit(X_train, y_train)

# Extracting the accuracy score from the NB classifier
nb_classifier.score(X_test, y_test)

Warnings or important notes appear like this.

Tips and tricks appear like this.
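
The Naive Bayes snippet shown in the conventions above relies on X_train, X_test, y_train, and y_test, which it does not define. The following is a minimal, self-contained sketch of the same fit-and-score pattern. The synthetic dataset, the 70/30 split, and the random_state values are assumptions made purely for illustration; they are not taken from the book's examples.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic, illustrative data (an assumption; not the book's dataset)
X, y = make_classification(n_samples=500, n_features=10, random_state=42)

# Hold out 30% of the rows as a test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Initialize the classifier and fit it to the training data
nb_classifier = GaussianNB()
nb_classifier.fit(X_train, y_train)

# score() returns the mean accuracy on the held-out data
accuracy = nb_classifier.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.3f}")
```

Splitting the data before fitting keeps the reported accuracy an estimate of performance on unseen rows rather than on the rows the classifier was trained on.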

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at customercare@packtpub.com.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report it to us. Please visit www.packt.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.

Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packt.com with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packt.com.
