# Advanced Data Analysis with Haskell [Video]

## Learn

- Get to know the basics of data analysis: SQLite3 basics, regular expressions, and visualization
- Understand the process involved in linear regression and its pitfalls
- Study a corpus of text to discover interesting features using TF-IDF analysis
- Determine the likelihood of an event using Naïve Bayesian Classification
- Reduce the size of data without affecting the data's effectiveness using Principal Component Analysis
- Generate eigenvalues and eigenvectors using HMatrix
- Untangle the different varieties of clusters
- Master the techniques necessary to perform multivariate regression using Haskell code

Every business and organization that collects data can tap into that data to gain insights on how to improve. Haskell is a purely functional and lazy programming language that is well suited to handling large data analysis problems. This video picks up where Beginning Haskell Data Analysis leaves off.

This video series will take you through the more difficult problems of data analysis in a conversational style. You will be guided through finding correlations in data and working with multiple dependent variables. You will be given a theoretical overview of the types of regression, and we'll show you how to install the LAPACK and HMatrix libraries. By the end of the first part, you'll be familiar with the application of N-grams and TF-IDF.

Once you've learned how to analyze data, the next step is organizing that data with the help of machine learning algorithms. You will be briefed on the mathematics and statistical theorems involved, such as Bayes' law and its application, as well as eigenvalues and eigenvectors using HMatrix. By the end of this course, you'll have an understanding of data analysis, different ways to analyze data, and the various clustering algorithms available. You'll also understand Haskell and will be ready to write code with it.

## Style and Approach

Each video guides you through the journey of a problem: a mathematical definition of the problem, an algorithmic approach to solving it, and the detailed Haskell approach to solving it. Each video builds a little on the one before it, at a conversational pace. We use the Jupyter notebook system, which allows us to easily create and share notebooks of our analysis work. You can download the same notebooks that were created in each of our videos.

- Visualize and harvest information from data
- Understand regression analysis, perform multivariate regression, and untangle different varieties of clusters
- Explore the power of non-strict semantics, strong static typing, and control constructs to make data analysis simpler

4 hours 4 minutes | 9781785287237 | 26 Dec 2016

The Course Overview; CSV Files to SQLite3; Regular Expressions; Visualizations; Kernel Density Estimation

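
To give a flavour of the last topic in this section, here is a minimal sketch of a Gaussian kernel density estimate in plain Haskell. It is not taken from the course notebooks: the names `gaussian` and `kde` are illustrative, and the Gaussian kernel with a hand-picked bandwidth is an assumption made for the example.

```haskell
-- Standard normal probability density, used here as the kernel.
gaussian :: Double -> Double
gaussian u = exp (-0.5 * u * u) / sqrt (2 * pi)

-- Kernel density estimate at point x.
-- Assumes a positive bandwidth h and a non-empty sample.
kde :: Double -> [Double] -> Double -> Double
kde h samples x = sum [gaussian ((x - s) / h) | s <- samples] / (n * h)
  where
    n = fromIntegral (length samples)
```

Loaded in GHCi or a notebook cell, `map (kde 0.5 [0, 1, 2]) [-1, -0.9 .. 3]` yields a smoothed density curve that can be passed to a plotting library. A smaller bandwidth follows the sample more closely, while a larger one smooths it out.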
Linear Regression; Correlation Coefficients; Drawbacks of Linear Regression; Logarithmic Regression; Polynomial Regression

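
The first two topics of this section can be sketched with nothing more than the Prelude. The following is one possible formulation, not the course's own code: a least-squares line and the Pearson correlation coefficient are both expressed through a shared covariance helper. All function names are illustrative, and the inputs are assumed to be non-empty lists of equal length with some variation in each.

```haskell
-- Arithmetic mean of a non-empty list.
mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

-- Population covariance of two equally long samples.
covariance :: [Double] -> [Double] -> Double
covariance xs ys = mean (zipWith (*) dxs dys)
  where
    dxs = map (subtract (mean xs)) xs
    dys = map (subtract (mean ys)) ys

-- Slope and intercept of the least-squares line y = slope * x + intercept.
linearFit :: [Double] -> [Double] -> (Double, Double)
linearFit xs ys = (slope, intercept)
  where
    slope     = covariance xs ys / covariance xs xs
    intercept = mean ys - slope * mean xs

-- Pearson correlation coefficient, between -1 and 1.
pearson :: [Double] -> [Double] -> Double
pearson xs ys =
  covariance xs ys / sqrt (covariance xs xs * covariance ys ys)
```

For points that lie exactly on y = 2x + 1, such as `linearFit [1, 2, 3, 4] [3, 5, 7, 9]`, the fit recovers a slope of 2 and an intercept of 1, and `pearson` on the same data is 1. A coefficient near zero is a warning that a straight line describes the data poorly.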
Preparing Our Text; Finding the Set of N-Grams; Cosine Similarity; Overview of TF-IDF; Applying TF-IDF

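
As a rough illustration of how the pieces in this section fit together, the sketch below tokenizes text, builds N-grams, weights terms with TF-IDF, and compares documents by cosine similarity. It is an assumption-laden example rather than the course's implementation. It uses one common TF-IDF variant (relative term frequency times log(N / document frequency)), relies on `Data.Map.Strict` from the `containers` package, and every function name is illustrative.

```haskell
import           Data.Char (isAlpha, toLower)
import qualified Data.Map.Strict as M

type Doc = [String]

-- Lower-case the text and split it into alphabetic words.
tokenize :: String -> Doc
tokenize = words . map (\c -> if isAlpha c then toLower c else ' ')

-- All contiguous runs of n tokens.
ngrams :: Int -> [a] -> [[a]]
ngrams n xs
  | n <= 0 || length xs < n = []
  | otherwise               = take n xs : ngrams n (tail xs)

-- Term frequency: count of each term divided by document length.
termFreq :: Doc -> M.Map String Double
termFreq doc = M.map (/ total) counts
  where
    counts = M.fromListWith (+) [(t, 1) | t <- doc]
    total  = fromIntegral (length doc)

-- Inverse document frequency: log (N / documents containing the term).
idf :: [Doc] -> M.Map String Double
idf docs = M.map (\df -> log (n / df)) docFreq
  where
    n       = fromIntegral (length docs)
    docFreq = M.fromListWith (+) [(t, 1) | d <- docs, t <- M.keys (termFreq d)]

-- One TF-IDF vector per document, keyed by term.
tfidf :: [Doc] -> [M.Map String Double]
tfidf docs = [M.intersectionWith (*) (termFreq d) weights | d <- docs]
  where
    weights = idf docs

-- Cosine of the angle between two sparse vectors (0 if either is all zero).
cosine :: M.Map String Double -> M.Map String Double -> Double
cosine a b
  | na == 0 || nb == 0 = 0
  | otherwise          = dotAB / (na * nb)
  where
    dotAB  = sum (M.elems (M.intersectionWith (*) a b))
    na     = norm a
    nb     = norm b
    norm v = sqrt (sum [x * x | x <- M.elems v])
```

A term that appears in every document receives a weight of zero under this variant, which is why common words contribute nothing to the similarity score. Swapping `tokenize` output for `map unwords (ngrams 2 tokens)` would apply the same weighting to bigrams instead of single words.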
PCA: A Discussion; Preparing Our Dataset; Eigendecomposition; Dimensionality Reduction; Recommendation Engine
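
The course performs eigendecomposition with HMatrix. To keep the example free of external dependencies, the sketch below swaps in power iteration over plain lists, which finds only the dominant eigenvector of the covariance matrix, that is, the first principal component. This is a stand-in technique chosen for illustration, not the HMatrix approach, and all names are hypothetical. It assumes at least two rows of equal length and a covariance matrix with a single largest eigenvalue.

```haskell
import Data.List (transpose)

type Vector = [Double]
type Matrix = [[Double]]  -- a list of rows

dot :: Vector -> Vector -> Double
dot u v = sum (zipWith (*) u v)

matVec :: Matrix -> Vector -> Vector
matVec m v = map (`dot` v) m

normalize :: Vector -> Vector
normalize v = map (/ sqrt (dot v v)) v

-- Subtract each column's mean so every feature is centred on zero.
center :: Matrix -> Matrix
center rows = map (\r -> zipWith (-) r means) rows
  where
    means = map (\col -> sum col / fromIntegral (length col)) (transpose rows)

-- Sample covariance matrix of the columns (divides by n - 1).
covarianceMatrix :: Matrix -> Matrix
covarianceMatrix rows = [[dot ci cj / (n - 1) | cj <- cols] | ci <- cols]
  where
    cols = transpose (center rows)
    n    = fromIntegral (length rows)

-- Power iteration: repeatedly apply the matrix and renormalise.
-- The start vector [1, 2, ..] is an arbitrary choice; the method fails if
-- the start vector happens to be orthogonal to the dominant eigenvector.
dominantEigen :: Int -> Matrix -> (Double, Vector)
dominantEigen iterations m = (dot v (matVec m v), v)
  where
    start = normalize (map fromIntegral [1 .. length m])
    v     = iterate (normalize . matVec m) start !! iterations

-- Reduce each row to one number: its coordinate along the first component.
projectOntoFirstPC :: Matrix -> [Double]
projectOntoFirstPC rows = map (`dot` pc) (center rows)
  where
    (_, pc) = dominantEigen 100 (covarianceMatrix rows)
```

The returned eigenvalue is the variance captured by the first component, so comparing it with the sum of the covariance matrix's diagonal indicates how much information a one-dimensional reduction keeps. A full decomposition, as HMatrix provides, is needed when more than one component is wanted.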