Test-Driven Machine Learning

More Information
Learn
  • Get started with an introduction to test-driven development and familiarize yourself with how to apply these concepts to machine learning
  • Build and test a neural network deterministically, and learn to look for niche cases that cause odd model behaviour
  • Learn to use the multi-armed bandit algorithm to make optimal choices in the face of an enormous amount of uncertainty
  • Generate complex and simple random data to create a wide variety of test cases that can be codified into tests
  • Develop models iteratively, even when using a third-party library
  • Quantify model quality to enable collaboration and rapid iteration
  • Adopt simpler approaches to common machine learning algorithms
  • Use behaviour-driven development principles to articulate test intent
About

Machine learning is the process of teaching machines to remember data patterns, use them to predict future outcomes, and offer choices that would appeal to individuals based on their past preferences.

Machine learning is applicable to a lot of what you do every day. As a result, you can’t take forever to deliver your first iteration of software. Learning to build machine learning algorithms within a controlled test framework will speed up your time to deliver, quantify quality expectations with your clients, and enable rapid iteration and collaboration.

This book will show you how to quantifiably test machine learning algorithms. The very different, foundational approach of this book starts every example algorithm with the simplest thing that could possibly work. With this approach, seasoned veterans will find simpler approaches to beginning a machine learning algorithm. You will learn how to iterate on these algorithms to enable rapid delivery and improve performance expectations.
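As a rough illustration of "the simplest thing that could possibly work", the sketch below writes a quantified quality test against a baseline model that always predicts the training mean. It is not taken from the book: the function names, the synthetic data, and the error threshold of 1.0 are all assumptions made for this example.

```python
import random


def mean_baseline_fit(ys):
    """Hypothetical simplest model: always predict the mean of the targets."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


def mean_absolute_error(model, xs, ys):
    """Average absolute difference between predictions and targets."""
    return sum(abs(model(x) - y) for x, y in zip(xs, ys)) / len(ys)


def test_baseline_meets_quality_threshold():
    # A fixed seed keeps the generated data, and so the test, deterministic.
    rng = random.Random(0)
    xs = [rng.uniform(0, 10) for _ in range(200)]
    # Assumed data: targets scattered within +/-1 of a constant 5.0.
    ys = [5.0 + rng.uniform(-1, 1) for _ in xs]

    model = mean_baseline_fit(ys)

    # The quality expectation is stated as a number, so any later,
    # more sophisticated model can be held to the same (or a tighter) bar.
    assert mean_absolute_error(model, xs, ys) < 1.0


test_baseline_meets_quality_threshold()
```

Once a test like this passes, the threshold can be tightened and the baseline swapped for a richer model, one small iteration at a time.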

The book begins with an introduction to test-driving machine learning and quantifying model quality. From there, you will test a neural network, predict values with regression, and build upon regression techniques with logistic regression. You will discover how to test different approaches to Naïve Bayes and compare them quantitatively, along with how to apply object-oriented programming (OOP) and OOP patterns to test-driven code, leveraging scikit-learn.

Finally, you will walk through the development of an algorithm which maximizes the expected value of profit for a marketing campaign by combining one of the classifiers covered with the multiple regression example in the book.
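One way to picture that combination is the following minimal sketch. It assumes the classifier supplies a response probability per customer and the regression supplies a predicted revenue. The function names, dictionary keys, and the flat contact cost are hypothetical, and the book's actual algorithm may differ.

```python
def expected_profit(p_response, predicted_revenue, contact_cost):
    """Expected profit of contacting one customer.

    p_response:        assumed output of a classifier (probability of responding)
    predicted_revenue: assumed output of a regression model (revenue if they respond)
    contact_cost:      assumed flat cost of contacting the customer
    """
    return p_response * predicted_revenue - contact_cost


def choose_targets(customers, contact_cost):
    """Return the ids of customers whose expected profit is positive."""
    return [
        c["id"]
        for c in customers
        if expected_profit(c["p_response"], c["predicted_revenue"], contact_cost) > 0
    ]


# Illustrative, made-up inputs.
customers = [
    {"id": "a", "p_response": 0.5, "predicted_revenue": 10.0},
    {"id": "b", "p_response": 0.1, "predicted_revenue": 10.0},
]
targets = choose_targets(customers, contact_cost=2.0)
```

With these made-up numbers, only customer "a" has a positive expected profit, so only "a" would be included in the campaign.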

Features
  • Build smart extensions to pre-existing features at work that can help maximize their value
  • Quantify your models to drive real improvement
  • Take your knowledge of basic concepts, such as linear regression and Naïve Bayes classification, to the next level and productionize your models
  • Play what-if games with your models and techniques by following the test-driven exploration process
Page Count 190
Course Length 5 hours 42 minutes
ISBN 9781784399085
Date Of Publication 27 Nov 2015

Authors

Justin Bozonier

Justin Bozonier is a data scientist living in Chicago. He is currently a Senior Data Scientist at GrubHub. He has led the development of their custom analytics platform and of their first real-time split test analysis platform, which used Bayesian statistics. In addition, he has developed machine learning models for data mining as well as for prototyping product enhancements. Justin's software development expertise has earned him acknowledgements in the books Parallel Programming with Microsoft® .NET and Flow-Based Programming, Second Edition. He has also taught a workshop at PyData titled Simplified Statistics through Simulation.

His previous work experience includes being an Actuarial Systems Developer at Milliman, Inc., contracting as a Software Development Engineer II at Microsoft, and working as a Sr. Data Analyst and Lead Developer at Cheezburger Network, among other roles.