Test Driven Machine Learning: Control your machine learning algorithms using test-driven development to achieve quantifiable milestones

Justin Bozonier

€15.29 ~~€16.99~~

3 (3 Ratings)

eBook Nov 2015 190 pages 1st Edition

What do you get with eBook?

Instant access to your Digital eBook purchase

Download this book in EPUB and PDF formats

Access this title in our online reader with advanced features

DRM FREE - Read whenever, wherever and however you want

Description

Machine learning is the process of teaching machines to remember data patterns, use them to predict future outcomes, and offer choices that appeal to individuals based on their past preferences. It applies to much of what you do every day, so you can't take forever to deliver your first iteration of software. Learning to build machine learning algorithms within a controlled test framework will speed up your time to deliver, quantify quality expectations with your clients, and enable rapid iteration and collaboration.

This book will show you how to quantifiably test machine learning algorithms. Its foundational approach starts every example algorithm with the simplest thing that could possibly work. Even seasoned veterans will find simpler ways to begin a machine learning algorithm. You will learn how to iterate on these algorithms to enable rapid delivery and improve performance expectations.

The book begins with an introduction to test driving machine learning and quantifying model quality. From there, you will test a neural network, predict values with regression, and build upon regression techniques with logistic regression. You will discover how to test different approaches to Naïve Bayes and compare them quantitatively, along with how to apply object-oriented programming (OOP) and OOP patterns to test-driven code, leveraging scikit-learn. Finally, you will walk through the development of an algorithm that maximizes the expected value of profit for a marketing campaign by combining one of the classifiers covered with the multiple regression example in the book.
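As a rough illustration of what "starting with the simplest thing that could possibly work" and then quantifying improvement can look like, here is a minimal sketch. It is not taken from the book: the data generator, the `random_guesser` baseline, the `threshold_model`, and the 0.2 accuracy margin are all hypothetical choices made for this example, and only the Python standard library is used.

```python
import random


def make_data(n, seed):
    """Hypothetical synthetic data: label is 1 when x > 0.5, with ~10% label noise."""
    rng = random.Random(seed)  # fixed seed keeps the test deterministic
    rows = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < 0.1:  # flip roughly one label in ten
            label = 1 - label
        rows.append((x, label))
    return rows


def accuracy(predict, rows):
    """Fraction of rows where the prediction matches the label."""
    return sum(predict(x) == y for x, y in rows) / len(rows)


def random_guesser(seed):
    """Baseline 'model': ignores the input and guesses 0 or 1."""
    rng = random.Random(seed)
    return lambda x: rng.randint(0, 1)


def threshold_model(x):
    """Candidate model (an assumption for this sketch): a fixed cut at 0.5."""
    return int(x > 0.5)


def test_model_beats_random_baseline():
    rows = make_data(1000, seed=42)
    baseline = accuracy(random_guesser(seed=7), rows)
    candidate = accuracy(threshold_model, rows)
    # The quantified milestone: the candidate must beat the baseline by a clear margin.
    assert candidate > baseline + 0.2
```

A test runner such as pytest would pick up `test_model_beats_random_baseline` automatically. In a test-first workflow, the assertion would be written before `threshold_model` exists, and the margin would be tightened as later iterations of the model improve.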

Who is this book for?

This book is intended for data technologists (scientists, analysts, or developers) with previous machine learning experience who are also comfortable reading code in Python. This book is ideal for those looking for a way to deliver results quickly to enable rapid iteration and improvement.

What you will learn

Get started with an introduction to test-driven development and familiarize yourself with how to apply these concepts to machine learning
Build and test a neural network deterministically, and learn to look for niche cases that cause odd model behaviour
Learn to use the multi-armed bandit algorithm to make optimal choices in the face of an enormous amount of uncertainty
Generate complex and simple random data to create a wide variety of test cases that can be codified into tests
Develop models iteratively, even when using a third-party library
Quantify model quality to enable collaboration and rapid iteration
Adopt simpler approaches to common machine learning algorithms
Use behaviour-driven development principles to articulate test intent

Julian Cook Jan 27, 2016

This is a totally out-of-left-field book on machine learning. Not only is it novel, but I think the ideas will become more important over time, as more models move into production scenarios and need to be iteratively changed and improved, whilst not _breaking_ the model that already works.

The main idea, as you may have guessed, is to change the model development process from exploring data, followed by trying different models, to a more formal approach of wrapping your data in a test framework and then proceeding to develop the model(s).

The initial model (or failing test) can just be a random guesser. After this, you continually refactor your code to improve on the random result or the previous iteration of the model (at which point the test passes).

This will seem weird to a statistician, who would insist on emphasizing significance tests or (at least) looking at an ROC curve, which this book also does. The point made by the author is that once you enter the real world of running prediction models in production, you need to move to a process of iterative development, where you have some formal guarantee that the new model is better than the last and that you have not somehow fooled yourself into using something that is not much better than random.

There are a couple of drawbacks to the whole approach though. The first is the pain of setting up the tests in the first place, especially when you have a lot of data. I thought about setting up some synthetic data to test a new model, but it was honestly quicker to step through an example with the real data to figure out whether the code was working or not.

The other issue is with interactive environments like R and IPython: you can write your code in short sections at the console, then test each section, or model iteration, as you go. This is very efficient and doesn't necessarily work with a 'think up the tests before you do anything' approach.

Even if you don't like the ideas, the book itself is still a very good way to approach statistical model development and should be widely read, especially in organizations where multiple people might touch the ML code.

Amazon Verified review

Bill Jan 17, 2017

I have no disagreements whatsoever with the main point of this book--creating a comprehensive suite of tests is a critical step of building quality ML tools. However, when describing this process for various ML models, the tests described by the author are incredibly trivial. If you know nothing about building ML models, then this book could be valuable for you. If you are even a moderately experienced practitioner, this book is a waste of your time. In my opinion, the entire text could be boiled down to the following 3 sentences:

1. You should have tests for your ML tools.
2. You should have tests to make sure your tools perform as you expect they would, at a very basic level (as you should do with any code you write).
3. You should have tests to make sure your tools meet appropriate performance benchmarks on a handful of appropriate toy problems.

Nothing proposed by the author is a bad idea, it's just trivial. I learned nothing.

Martin Aug 03, 2020

This book attempts to introduce test-driven development to data scientists. Because of this I had hoped to find answers to questions specific to machine learning which other books about test-driven development in general do not consider. My questions (in particular, how to deal with statistical uncertainty in tests and how to keep the test suite fast) were touched superficially at best.

The book could have been a real gem, sitting at the sweet spot where the topics test-driven development and machine learning intersect. Instead, it tries to be an introductory text in both fields and falls short. There are better introductory books specifically for either topic.

The quality of the text and code examples is low. The text contains ambiguous formulations, discrepancies, and topics that are introduced but not further discussed. The code examples are carelessly formatted, contain obvious copy&paste mistakes, and syntax errors. The occasional mistake in a technical book is usually no big deal, but their sheer amount made reading this book an annoying experience.

All that said, this book could be interesting for a reader who has had prior exposure to the topics being covered and is committed to work through them in detail to elaborate on what the examples show and what is left unsaid.
