Scala Machine Learning Projects


Product type Book
Published in Jan 2018
Publisher Packt
ISBN-13 9781788479042
Pages 470 pages
Edition 1st Edition


Machine learning and learning workflow


Machine learning (ML) uses a set of statistical and mathematical algorithms to perform tasks such as concept learning, predictive modeling, clustering, and mining useful patterns. The ultimate goal is to make the learning automatic, so that human interaction is no longer needed or is at least reduced as much as possible.

We now refer to a famous definition of ML by Tom M. Mitchell (Machine Learning, Tom Mitchell, McGraw Hill, 1997), where he explained what learning really means from a computer science perspective:

"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."

Based on the preceding definition, we can conclude that a computer program or machine can do the following:

  • Learn from data and histories
  • Be improved with experience
  • Iteratively enhance a model that can be used to predict an outcome

A typical ML function can be formulated as a convex optimization problem: find a minimizer of a convex function f that depends on a variable vector w (the weights), which has d entries. Formally, we can write this as the following optimization problem:

$$\min_{w \in \mathbb{R}^d} f(w)$$

Here, the objective function is of the form:

$$f(w) := \lambda\, R(w) + \frac{1}{n} \sum_{i=1}^{n} L(w; x_i, y_i)$$

Here, the vectors $x_i \in \mathbb{R}^d$ are the training data points for 1≤i≤n, and $y_i \in \mathbb{R}$ are their corresponding labels that we want to predict eventually. We call the method linear if L(w;x,y) can be expressed as a function of $w^T x$ and y.

The objective function f has two components:

  • A regularizer R(w) that controls the complexity of the model
  • The loss L that measures the error of the model on the training data

The loss function L(w; ·) is typically a convex function in w. The fixed regularization parameter λ≥0 defines the trade-off between the two goals of minimizing the loss (that is, the training error) and minimizing model complexity (that is, avoiding overfitting). Throughout the chapters, we will learn in detail about different learning types and algorithms.
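To make the two components concrete, here is a minimal Scala sketch that evaluates a regularized objective of the form f(w) = λ·R(w) + (1/n)·Σ L(w; x_i, y_i). The choice of squared loss and an L2 regularizer is an assumption made for illustration, and the object and method names (`RegularizedObjective`, `squaredLoss`, `l2`, `objective`) are hypothetical; they are not taken from the book or from any library.

```scala
object RegularizedObjective {
  // Dot product w^T x of two equal-length vectors.
  def dot(w: Array[Double], x: Array[Double]): Double =
    w.zip(x).map { case (a, b) => a * b }.sum

  // Assumed loss: squared loss L(w; x, y) = 1/2 * (w^T x - y)^2.
  // It depends only on w^T x and y, so the method is linear.
  def squaredLoss(w: Array[Double], x: Array[Double], y: Double): Double = {
    val residual = dot(w, x) - y
    0.5 * residual * residual
  }

  // Assumed regularizer: L2, R(w) = 1/2 * ||w||^2.
  def l2(w: Array[Double]): Double = 0.5 * dot(w, w)

  // f(w) = lambda * R(w) + (1/n) * sum of the per-example losses.
  def objective(w: Array[Double],
                data: Seq[(Array[Double], Double)],
                lambda: Double): Double = {
    require(data.nonEmpty, "need at least one training example")
    val meanLoss = data.map { case (x, y) => squaredLoss(w, x, y) }.sum / data.size
    lambda * l2(w) + meanLoss
  }
}
```

With λ = 0 the function returns the mean training loss only; increasing λ penalizes large weights more heavily, which is the trade-off described above.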

On the other hand, deep neural networks (DNNs) form the core of deep learning (DL): they provide algorithms that model complex, high-level abstractions in data and can better exploit large-scale datasets to build complex models.

There are some widely used deep learning architectures based on artificial neural networks: DNNs, capsule networks, restricted Boltzmann machines, deep belief networks, factorization machines, and recurrent neural networks.

These architectures have been widely used in computer vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation, bioinformatics, and drug design. Throughout the chapters, we will see several real-life examples using these architectures to achieve state-of-the-art predictive accuracy.

Typical machine learning workflow

A typical ML application involves several processing steps, from the input to the output, forming a scientific workflow as shown in Figure 1, ML workflow. The following steps are involved in a typical ML application:

  1. Load the data
  2. Parse the data into the input format for the algorithm
  3. Pre-process the data and handle the missing values
  4. Split the data into three sets: two for training and validating the model (training set and validation set, respectively) and one for testing the model (test set)
  5. Run the algorithm to build and train your ML model
  6. Make predictions with the training data and observe the results
  7. Test and evaluate the model with the test data or, alternatively, validate the model with a cross-validation technique using the third dataset, called the validation dataset
  8. Tune the model for better performance and accuracy
  9. Scale up the model so that it can handle massive datasets in the future
  10. Deploy the ML model in production:

Figure 1: ML workflow
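The data-splitting step of the workflow can be sketched in plain Scala as follows. This is only an illustration under stated assumptions: the `DataSplit` object, the `split` method, and the fraction and seed parameters are hypothetical names, and a real project would more likely use the splitting utilities of its ML library.

```scala
import scala.util.Random

object DataSplit {
  // Shuffle the data with a fixed seed (for reproducibility), then cut it
  // into training, validation, and test sets by the given fractions.
  // Whatever is left after the first two cuts becomes the test set.
  def split[A](data: Seq[A],
               trainFrac: Double,
               validFrac: Double,
               seed: Long): (Vector[A], Vector[A], Vector[A]) = {
    require(trainFrac > 0 && validFrac >= 0 && trainFrac + validFrac < 1.0,
      "fractions must leave room for a test set")
    val shuffled = new Random(seed).shuffle(data.toVector)
    val nTrain = math.round(data.size * trainFrac).toInt
    val nValid = math.round(data.size * validFrac).toInt
    val (train, rest) = shuffled.splitAt(nTrain)
    val (valid, test) = rest.splitAt(nValid)
    (train, valid, test)
  }
}
```

For example, `DataSplit.split(1 to 100, 0.6, 0.2, seed = 42L)` yields sets of 60, 20, and 20 elements. Fixing the seed keeps the split identical across runs, so that evaluation results are comparable.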

The preceding workflow represents a few steps for solving ML problems. ML tasks can be broadly categorized into supervised, unsupervised, semi-supervised, and reinforcement learning, as well as recommendation systems. Figure 2, Supervised learning in action, shows a schematic diagram of supervised learning. After the algorithm has found the required patterns in labeled training data, those patterns can be used to make predictions for unlabeled test data:

Figure 2: Supervised learning in action

Classification and regression are the two main kinds of supervised learning problems, and the predictive models built for them are the basis of predictive analytics. Throughout the upcoming chapters, we will provide several examples of supervised learning, such as linear regression (LR), logistic regression, random forest, decision trees, Naive Bayes, multilayer perceptron, and so on.

A regression algorithm is meant to produce continuous output. The input is allowed to be either discrete or continuous:

Figure 3: A regression algorithm is meant to produce continuous output
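As a small illustration of continuous output, the following sketch fits a one-variable linear regression by ordinary least squares using the closed-form solution. The object and method names (`SimpleLinearRegression`, `fit`, `predict`) are hypothetical, and this is not the Spark-based implementation the book develops later; it only shows what "continuous output" means in code.

```scala
object SimpleLinearRegression {
  // Fit y ≈ intercept + slope * x by ordinary least squares.
  // Returns the pair (intercept, slope).
  def fit(xs: Seq[Double], ys: Seq[Double]): (Double, Double) = {
    require(xs.size == ys.size && xs.size >= 2, "need at least two (x, y) pairs")
    val meanX = xs.sum / xs.size
    val meanY = ys.sum / ys.size
    // Sum of cross-deviations and sum of squared deviations of x.
    val sxy = xs.zip(ys).map { case (x, y) => (x - meanX) * (y - meanY) }.sum
    val sxx = xs.map(x => (x - meanX) * (x - meanX)).sum
    require(sxx > 0, "x values must not all be identical")
    val slope = sxy / sxx
    (meanY - slope * meanX, slope)
  }

  // The prediction is a real number, that is, a continuous output.
  def predict(model: (Double, Double), x: Double): Double =
    model._1 + model._2 * x
}
```

Fitting the points (1, 3), (2, 5), (3, 7), (4, 9) recovers the line y = 1 + 2x, so the prediction for x = 2.5 is 6.0, a value that does not appear among the training labels.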

A classification algorithm, on the other hand, is meant to produce discrete output from an input of a set of discrete or continuous values. This distinction is important to know because discrete-valued outputs are handled better by classification, which will be discussed in upcoming chapters:

Figure 4: A classification algorithm is meant to produce discrete output
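By contrast, a classifier maps its input to one of a fixed set of labels. The sketch below uses a nearest-centroid classifier, chosen here only because it is short; it is not one of the algorithms the chapter lists, and the names `NearestCentroid`, `fit`, and `predict` are hypothetical.

```scala
object NearestCentroid {
  type Point = Array[Double]

  // "Training": compute the mean point (centroid) of each labeled class.
  def fit(data: Seq[(Point, String)]): Map[String, Point] =
    data.groupBy(_._2).map { case (label, rows) =>
      val dim = rows.head._1.length
      val centroid = Array.tabulate(dim)(j => rows.map(_._1(j)).sum / rows.size)
      label -> centroid
    }

  // Squared Euclidean distance between two points of equal dimension.
  def sqDist(a: Point, b: Point): Double =
    a.zip(b).map { case (p, q) => (p - q) * (p - q) }.sum

  // The prediction is the label of the closest centroid: a discrete output,
  // even though the input features are continuous.
  def predict(model: Map[String, Point], x: Point): String =
    model.minBy { case (_, centroid) => sqDist(centroid, x) }._1
}
```

However the continuous input varies, `predict` can only ever return one of the labels seen during training, which is the distinction from regression drawn above.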

In this chapter, we will mainly focus on supervised regression algorithms. We will start by describing the problem statement and then move on to the very simple LR algorithm. The performance of these ML models is often optimized using hyperparameter tuning and cross-validation techniques, so a brief understanding of both is necessary before we use them in future chapters.
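The idea behind k-fold cross-validation combined with a simple grid search can be sketched as follows. This is a library-independent illustration under assumptions: the `KFold` object and its methods are hypothetical names, the folds are assigned round-robin by index rather than randomly, and the `error` function that trains and scores a model is left to the caller.

```scala
object KFold {
  // Partition the indices 0 until n into k folds (round-robin by index).
  // Each element of the result is (training indices, validation indices);
  // every index is used for validation exactly once.
  def folds(n: Int, k: Int): Seq[(Seq[Int], Seq[Int])] = {
    require(k >= 2 && k <= n, "need 2 <= k <= n")
    (0 until k).map { f =>
      val (valid, train) = (0 until n).partition(_ % k == f)
      (train, valid)
    }
  }

  // Average a caller-supplied score over the k train/validation splits.
  def crossValidate(n: Int, k: Int)(score: (Seq[Int], Seq[Int]) => Double): Double = {
    val scores = folds(n, k).map { case (train, valid) => score(train, valid) }
    scores.sum / k
  }

  // Grid search: return the hyperparameter value whose mean validation
  // error across the k folds is lowest.
  def bestParam[P](grid: Seq[P], n: Int, k: Int)(
      error: (P, Seq[Int], Seq[Int]) => Double): P =
    grid.minBy(p => crossValidate(n, k)((train, valid) => error(p, train, valid)))
}
```

In practice, `error` would train a model on the training indices with the candidate hyperparameter (for example, a regularization parameter λ) and return its error on the validation indices.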
