Chapter 1: Scala and Data Science

Programming in data science

Chapter 2: Manipulating Data with Breeze

An example – logistic regression

Chapter 3: Plotting with breeze-viz

Customizing the line type

More advanced scatter plots

Multi-plot example – scatterplot matrix plots

Managing without documentation

Data visualization beyond breeze-viz

Chapter 4: Parallel Collections and Futures

Chapter 5: Scala and SQL through JDBC

Functional wrappers for JDBC

Safer JDBC connections with the loan pattern

Enriching JDBC statements with the "pimp my library" pattern

Wrapping result sets in a stream

Looser coupling with type classes

Creating a data access layer

Chapter 6: Slick – A Functional Interface for SQL

Aggregations with "Group by"

Accessing database metadata

Chapter 7: Web APIs

JSON in Scala – an exercise in pattern matching

Extraction using case classes

Concurrency and exception handling with futures

Authentication – adding HTTP headers

Chapter 8: Scala and MongoDB

Connecting to MongoDB with Casbah

Extracting objects from the database

Custom type serialization

Chapter 9: Concurrency with Akka

Message passing between actors

Queue control and the pull pattern

Accessing the sender of a message

Custom supervisor strategies

What we have not talked about

Chapter 10: Distributed Batch Processing with Spark

Acquiring the example data

Resilient distributed datasets

Building and running standalone programs

Data shuffling and partitions

Chapter 11: Spark SQL and DataFrames

DataFrames – a whirlwind introduction

Joining DataFrames together

Custom functions on DataFrames

DataFrame immutability and persistence

SQL statements on DataFrames

Complex data types – arrays, maps, and structs

Interacting with data sources

Chapter 12: Distributed Machine Learning with MLlib

Introducing MLlib – Spam classification

Regularization in logistic regression

Cross-validation and model selection

Beyond logistic regression

Chapter 13: Web APIs with Play

Client-server applications

Introduction to web frameworks

Model-View-Controller architecture

Querying external APIs and consuming JSON

Creating APIs with Play: a summary

Chapter 14: Visualization with D3 and the Play Framework

JavaScript dependencies through web-jars

Towards a web application: HTML templates

Modular JavaScript through RequireJS

Bootstrapping the applications

Client-side program architecture

Chapter 15: Getting Started

Mathematical notation for the curious

Taxonomy of machine learning algorithms

Don't reinvent the wheel!

Chapter 16: Hello World!

Monadic data transformation

A workflow computational model

Chapter 17: Data Preprocessing

The discrete Kalman filter

Alternative preprocessing techniques

Chapter 18: Unsupervised Learning

Performance considerations

Chapter 19: Naïve Bayes Classifiers

Probabilistic graphical models

The Multivariate Bernoulli classification

Naïve Bayes and text mining

Chapter 20: Regression and Regularization

Chapter 21: Sequential Data Models

Markov decision processes

Conditional random fields

Regularized CRFs and text analytics

Performance considerations

Chapter 22: Kernel Models and Support Vector Machines

Support vector classifiers – SVC

Anomaly detection with one-class SVC

Support vector regression

Performance considerations

Chapter 23: Artificial Neural Networks

Feed-forward neural networks

The multilayer perceptron

Convolution neural networks

Chapter 24: Genetic Algorithms

Genetic algorithms and machine learning

Genetic algorithm components

GA for trading strategies

Advantages and risks of genetic algorithms

Chapter 25: Reinforcement Learning

Learning classifier systems

Chapter 26: Scalable Frameworks

Chapter 27: Exploratory Data Analysis

Getting started with Scala

Distinct values of a categorical field

Summarization of a numeric field

Basic, stratified, and consistent sampling

Working with Scala and Spark Notebooks

Chapter 28: Data Pipelines and Modeling

Sequential trials and dealing with risk

Exploration and exploitation

Basic components of a data-driven system

Optimization and interactivity

Chapter 29: Working with Spark and MLlib

Understanding Spark architecture

Chapter 30: Supervised and Unsupervised Learning

Records and supervised learning

Chapter 31: Regression and Classification

What does regression stand for?

Continuous space and metrics

Generalization error and overfitting

Chapter 32: Working with Unstructured Data

Other serialization formats

Working with pattern matching

Other uses of unstructured data

Chapter 33: Working with Graph Algorithms

A quick introduction to graphs

Chapter 34: Integrating Scala with R and Python

Chapter 35: NLP in Scala

MLlib algorithms in Spark

Segmentation, annotation, and chunking

Using word2vec to find word relationships

Chapter 36: Advanced Model Monitoring