Data Science Algorithms in a Week - Second Edition

More Information
Learn
  • Understand how to identify a data science problem correctly
  • Implement well-known machine learning algorithms efficiently using Python
  • Classify your datasets using Naive Bayes, decision trees, and random forest with accuracy
  • Devise an appropriate prediction solution using regression
  • Work with time series data to identify relevant data events and trends
  • Cluster your data using the k-means algorithm
About

Machine learning applications are highly automated and self-modifying, and they continue to improve over time with minimal human intervention as they learn from training data. Specialized machine learning algorithms have been developed to address the complexity of real-world data problems. Through algorithmic and statistical analysis, these models can also be used to gain new knowledge from existing data.

Data Science Algorithms in a Week addresses problems of accurate and efficient data classification and prediction. Over the course of seven days, you will be introduced to seven algorithms, along with exercises that will help you understand different aspects of machine learning. You will see how to pre-cluster your data to optimize the classification of large datasets. The book also guides you in making predictions based on existing trends in your dataset. It covers k-nearest neighbors, Naive Bayes, decision trees, random forest, k-means, regression, and time-series analysis.

By the end of this book, you will understand how to choose machine learning algorithms for clustering, classification, and regression, and know which is best suited to your problem.

Features
  • Use Python and its wide array of machine learning libraries to build predictive models 
  • Learn the basics of the 7 most widely used machine learning algorithms within a week
  • Know when and where to apply data science algorithms using this guide
Page Count 214
Course Length 6 hours 25 minutes
ISBN 9781789806076
Date Of Publication 31 Oct 2018
Mary and her temperature preferences
Implementation of the k-nearest neighbors algorithm
Map of Italy example – choosing the value of k
House ownership – data rescaling
Text classification – using non-Euclidean distances
Text classification – k-NN in higher dimensions
Summary
Problems
Medical tests – basic application of Bayes' theorem
Bayes' theorem and its extension
Playing chess – independent events
Implementation of a Naive Bayes classifier
Playing chess – dependent events
Gender classification – Bayes for continuous random variables
Summary
Problems
Swim preference – representing data using a decision tree
Information theory
ID3 algorithm – decision tree construction
Classifying with a decision tree
Playing chess – analysis with a decision tree
Going shopping – dealing with data inconsistencies
Summary
Problems
Introduction to the random forest algorithm
Swim preference – analysis involving a random forest
Implementation of the random forest algorithm
Playing chess example
Going shopping – overcoming data inconsistencies with randomness and measuring the level of confidence
Summary
Problems
Household incomes – clustering into k clusters
Gender classification – clustering to classify
Implementation of the k-means clustering algorithm
House ownership – choosing the number of clusters
Document clustering – understanding the number of k clusters in a semantic context
Summary
Problems
Fahrenheit and Celsius conversion – linear regression on perfect data
Weight prediction from height – linear regression on real-world data
Gradient descent algorithm and its implementation
Flight time duration prediction based on distance
Ballistic flight analysis – non-linear model
Summary
Problems
Business profits – analyzing trends
Electronics shop's sales – analyzing seasonality
Summary
Problems

Authors

Dávid Natingga

Dávid Natingga graduated with a master's in engineering in 2014 from Imperial College London, specializing in artificial intelligence. In 2011, he worked at Infosys Labs in Bangalore, India, undertaking research into the optimization of machine learning algorithms. In 2012 and 2013, while at Palantir Technologies in the USA, he developed algorithms for big data. In 2014, while working as a data scientist at Pact Coffee, London, he created an algorithm suggesting products based on the taste preferences of customers and the structures of the coffees. He is a PhD candidate in Computability Theory at the University of Leeds, UK, where he aims to use pure mathematics to advance the field of AI. In 2016, he spent 8 months at Japan's Advanced Institute of Science and Technology as a research visitor.