Advanced Data Analysis with Haskell [Video]

James Church

Learn advanced data analysis techniques to gain insights into real-world data sets using Haskell

Video Details

ISBN 13: 9781785287237
Course Length: 4 hours 4 minutes

Video Description

Every business and organization that collects data can tap into that data to gain insights on how to improve. Haskell is a purely functional and lazy programming language that is well suited to handling large data analysis problems. This video picks up where Beginning Haskell Data Analysis leaves off. The series takes you through the more difficult problems of data analysis in a conversational style.

You will be guided through finding correlations in data, including cases that involve multiple variables. You will get a theoretical overview of the types of regression and see how to install the LAPACK and HMatrix libraries. By the end of the first part, you’ll be familiar with applying N-grams and TF-IDF.
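As a rough illustration of the correlation step mentioned above, here is a minimal sketch of the Pearson correlation coefficient in plain Haskell. It uses only the Prelude rather than HMatrix, and the names `mean` and `pearson` are illustrative assumptions, not code taken from the course.

```haskell
-- Arithmetic mean of a non-empty list.
mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

-- Pearson correlation coefficient of two equal-length samples.
-- Returns Nothing when the lengths differ, when fewer than two points
-- are given, or when either sample has zero variance.
pearson :: [Double] -> [Double] -> Maybe Double
pearson xs ys
  | length xs /= length ys = Nothing
  | length xs < 2          = Nothing
  | sxx == 0 || syy == 0   = Nothing
  | otherwise              = Just (sxy / sqrt (sxx * syy))
  where
    mx  = mean xs
    my  = mean ys
    dxs = map (subtract mx) xs          -- deviations from the mean of xs
    dys = map (subtract my) ys          -- deviations from the mean of ys
    sxy = sum (zipWith (*) dxs dys)     -- co-deviation sum
    sxx = sum (map (\d -> d * d) dxs)   -- squared deviations of xs
    syy = sum (map (\d -> d * d) dys)   -- squared deviations of ys

main :: IO ()
main = do
  print (pearson [1, 2, 3, 4] [2, 4, 6, 8])  -- perfectly increasing relationship
  print (pearson [1, 2, 3, 4] [8, 6, 4, 2])  -- perfectly decreasing relationship
```

A result near 1 suggests a strong positive linear relationship, a result near -1 a strong negative one, and a result near 0 little linear relationship.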

Once you’ve learned how to analyze data, the next step is organizing that data with the help of machine learning algorithms. You will be briefed on the underlying mathematics and statistics, such as Bayes’ law and its application, as well as eigenvalues and eigenvectors computed with HMatrix.
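To make Bayes’ law concrete, the following is a small sketch in plain Haskell. It computes P(A|B) = P(B|A) * P(A) / P(B), where P(B) = P(B|A) * P(A) + P(B|not A) * (1 - P(A)). The function name `posterior` and the example probabilities are assumptions chosen for illustration, not material from the course.

```haskell
-- Posterior probability P(A|B) from Bayes' law.
--   prior         = P(A)
--   likelihood    = P(B|A)
--   falsePositive = P(B|not A)
-- Returns Nothing if any input is outside [0, 1] or if P(B) is zero.
posterior :: Double -> Double -> Double -> Maybe Double
posterior prior likelihood falsePositive
  | any outOfRange [prior, likelihood, falsePositive] = Nothing
  | evidence == 0                                     = Nothing
  | otherwise = Just (likelihood * prior / evidence)
  where
    outOfRange p = p < 0 || p > 1
    -- Total probability of observing B.
    evidence = likelihood * prior + falsePositive * (1 - prior)

main :: IO ()
main =
  -- Hypothetical numbers: a 1% prior, a 90% hit rate, a 5% false-positive rate.
  print (posterior 0.01 0.9 0.05)
```

With these hypothetical inputs the posterior comes out at roughly 0.15, which shows how a low prior keeps the posterior modest even when the likelihood is high.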

By the end of this course, you’ll have an understanding of data analysis, different ways to analyze data, and the various clustering algorithms available. You’ll also understand Haskell and will be ready to write code with it.

Style and Approach

Each video guides you through the journey of a problem: a mathematical definition of the problem, an algorithmic approach to solving it, and the detailed Haskell approach to solving it. Each video builds a little on the one before it, at a conversational pace. We use the Jupyter notebook system, which allows us to easily create and share notebooks of our analysis work. You can download the same notebooks that were created in each of our videos.

Table of Contents

Brushing up on the Basics
  • The Course Overview
  • CSV Files to SQLite3
  • Regular Expressions
  • Visualizations
  • Kernel Density Estimation

Regression Analysis
  • Linear Regression
  • Correlation Coefficients
  • Drawbacks of Linear Regression
  • Logarithmic Regression
  • Polynomial Regression

Multiple Regression
  • Creating Matrices in HMatrix
  • Performing Multivariate Regression
  • Calculating the Adjusted R^2
  • Improving the Adjusted R^2 Score

Text Analysis
  • Preparing Our Text
  • Finding the Set of N-Grams
  • Cosine Similarity
  • Overview of TF-IDF
  • Applying TF-IDF

Clustering
  • Clustering: An Overview
  • Random Cluster Generation
  • Distances between Clusters
  • Performing K-Means Clustering
  • Performing Hierarchical Clustering

Naïve Bayes Classification
  • Bayes: A Discussion
  • Bayes: The Code
  • Bayes on Full Documents

Principal Component Analysis
  • PCA: A Discussion
  • Preparing Our Dataset
  • Eigendecomposition
  • Dimensionality Reduction
  • Recommendation Engine

What You Will Learn

  • Get to know the basics of data analysis: SQLite3 basics, regular expressions, and visualization
  • Understand the process involved in linear regression and its pitfalls
  • Study a corpus of text to discover interesting features using TF-IDF analysis
  • Determine the likelihood of an event using Naïve Bayesian Classification
  • Reduce the size of data without affecting the data’s effectiveness using Principal Component Analysis
  • Generate eigenvalues and eigenvectors using HMatrix
  • Untangle the different varieties of clusters
  • Master the techniques necessary to perform multivariate regression using Haskell code
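
For a sense of what the N-gram and TF-IDF items in the list above involve, here is a minimal sketch in plain Haskell. The helper names (`tokenize`, `ngrams`, `tf`, `idf`, `tfIdf`), the sample sentences, and the use of the natural logarithm with no smoothing are all assumptions made for illustration; the course may define these differently.

```haskell
import Data.Char (isAlpha, toLower)
import Data.List (tails)

-- Lowercase a string and split it into alphabetic words.
tokenize :: String -> [String]
tokenize = words . map (\c -> if isAlpha c then toLower c else ' ')

-- All contiguous runs of n items, in order of appearance.
ngrams :: Int -> [a] -> [[a]]
ngrams n xs
  | n <= 0    = []
  | otherwise = [g | t <- tails xs, let g = take n t, length g == n]

-- Term frequency: the share of a document's tokens equal to the term.
tf :: String -> [String] -> Double
tf term doc
  | null doc  = 0
  | otherwise = fromIntegral hits / fromIntegral (length doc)
  where
    hits = length (filter (== term) doc)

-- Inverse document frequency: log (N / number of documents containing the term).
-- Returns 0 when no document contains the term.
idf :: String -> [[String]] -> Double
idf term docs
  | containing == 0 = 0
  | otherwise       = log (fromIntegral (length docs) / fromIntegral containing)
  where
    containing = length (filter (term `elem`) docs)

-- TF-IDF score of a term for one document within a corpus.
tfIdf :: String -> [String] -> [[String]] -> Double
tfIdf term doc docs = tf term doc * idf term docs

main :: IO ()
main = do
  let docs = map tokenize ["Haskell is lazy", "Python is eager"]
  print (ngrams 2 (head docs))          -- bigrams of the first document
  print (tfIdf "haskell" (head docs) docs)
  print (tfIdf "is" (head docs) docs)   -- appears in every document
```

A term that appears in every document, such as "is" here, gets an IDF of zero, so its TF-IDF score vanishes, while a term unique to one document scores higher.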
