scikit-learn Cookbook

Learn
  • Address algorithms of various levels of complexity and learn how to analyze data at the same time
  • Handle common data problems such as feature extraction and missing data
  • Understand how to evaluate your models against themselves and any other model
  • Discover just enough math to think about the connections between various algorithms
  • Customize the machine learning algorithm to fit your problem, and learn how to modify it when the situation calls for it
  • Incorporate other packages from the Python ecosystem to munge and visualize your dataset
About

Python is quickly becoming the go-to language for analysts and data scientists due to its simplicity and flexibility, and within the Python data space, scikit-learn is the unequivocal choice for machine learning. Its consistent API and plethora of features help you tackle a wide range of machine learning problems.

The book starts by walking through different methods to prepare your data, whether it is a dataset with missing values or one with text columns whose categories need to be turned into indicator variables. After the data is ready, you'll learn different techniques aligned with different objectives, from modeling a dataset with known outcomes, such as sales by state, to more complicated problems such as clustering similar customers. Finally, you'll learn how to polish your algorithm to ensure that it's both accurate and resilient to new datasets.
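As a rough illustration of the data preparation described above, the sketch below fills a missing value and turns a text column into indicator variables. It is not taken from the book. The toy data and variable names are hypothetical, and it assumes a recent scikit-learn release that provides `SimpleImputer` and `OneHotEncoder`.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: a numeric column with one missing value
# and a text column holding a category (a state code).
sales = np.array([[10.0], [np.nan], [30.0]])
states = np.array([["KS"], ["WA"], ["KS"]])

# Replace the missing value with the mean of the observed values.
imputer = SimpleImputer(strategy="mean")
sales_filled = imputer.fit_transform(sales)

# Turn each category into its own 0/1 indicator column.
# The encoder returns a sparse matrix by default, so convert it to dense.
encoder = OneHotEncoder()
state_indicators = encoder.fit_transform(states).toarray()

# Combine the prepared columns into one feature matrix.
prepared = np.hstack([sales_filled, state_indicators])
print(prepared)
```

Both objects follow the same `fit` / `transform` pattern, which is the consistent API the description refers to.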

Features
  • Learn how to handle a variety of tasks with scikit-learn through interesting recipes that show you how the library really works
  • Use scikit-learn to simplify the programming side of data work so you can focus on thinking
  • Discover how to apply algorithms in a variety of situations
Page Count 214
Course Length 6 hours 25 minutes
ISBN 9781783989485
Date Of Publication 4 Nov 2014

Authors

Trent Hauck

Trent Hauck is a data scientist living and working in the Seattle area. He grew up in Wichita, Kansas, and received his undergraduate and graduate degrees from the University of Kansas. He is the author of Instant Data Intensive Apps with pandas How-to (Packt Publishing), a book that can get you up to speed quickly with pandas and other associated technologies.