R Machine Learning solutions [Video]
-
Free Chapter: Getting Started with R
-
Data Exploration with RMS Titanic
- Reading a Titanic Dataset from a CSV File
- Converting Types on Character Variables
- Detecting Missing Values
- Imputing Missing Values
- Exploring and Visualizing Data
- Predicting Passenger Survival with a Decision Tree
- Validating the Power of Prediction with a Confusion Matrix
- Assessing Performance with the ROC Curve
-
R and Statistics
- Understanding Data Sampling in R
- Operating a Probability Distribution in R
- Working with Univariate Descriptive Statistics in R
- Performing Correlations and Multivariate Analysis
- Operating Linear Regression and Multivariate Analysis
- Conducting an Exact Binomial Test
- Performing Student's t-test
- Performing the Kolmogorov-Smirnov Test
- Understanding the Wilcoxon Rank Sum and Signed Rank Test
- Working with Pearson's Chi-Squared Test
- Conducting a One-Way ANOVA
- Performing a Two-Way ANOVA
-
Understanding Regression Analysis
- Fitting a Linear Regression Model with lm
- Summarizing Linear Model Fits
- Using Linear Regression to Predict Unknown Values
- Generating a Diagnostic Plot of a Fitted Model
- Fitting a Polynomial Regression Model with lm
- Fitting a Robust Linear Regression Model with rlm
- Studying a Case of Linear Regression on SLID Data
- Applying the Gaussian Model for Generalized Linear Regression
- Applying the Poisson Model for Generalized Linear Regression
- Applying the Binomial Model for Generalized Linear Regression
- Fitting a Generalized Additive Model to Data
- Visualizing a Generalized Additive Model
- Diagnosing a Generalized Additive Model
-
Classification – Tree, Lazy, and Probabilistic
- Preparing the Training and Testing Datasets
- Building a Classification Model with Recursive Partitioning Trees
- Visualizing a Recursive Partitioning Tree
- Measuring the Prediction Performance of a Recursive Partitioning Tree
- Pruning a Recursive Partitioning Tree
- Building a Classification Model with a Conditional Inference Tree
- Visualizing a Conditional Inference Tree
- Measuring the Prediction Performance of a Conditional Inference Tree
- Classifying Data with the K-Nearest Neighbor Classifier
- Classifying Data with Logistic Regression
- Classifying Data with the Naïve Bayes Classifier
-
Neural Network and SVM
- Classifying Data with a Support Vector Machine
- Choosing the Cost of an SVM
- Visualizing an SVM Fit
- Predicting Labels Based on a Model Trained by an SVM
- Tuning an SVM
- Training a Neural Network with neuralnet
- Visualizing a Neural Network Trained by neuralnet
- Predicting Labels Based on a Model Trained by neuralnet
- Training a Neural Network with nnet
- Predicting Labels Based on a Model Trained by nnet
-
Model Evaluation
- Estimating Model Performance with k-fold Cross Validation
- Performing Cross Validation with the e1071 Package
- Performing Cross Validation with the caret Package
- Ranking the Variable Importance with the caret Package
- Ranking the Variable Importance with the rminer Package
- Finding Highly Correlated Features with the caret Package
- Selecting Features Using the caret Package
- Measuring the Performance of the Regression Model
- Measuring Prediction Performance with a Confusion Matrix
- Measuring Prediction Performance Using ROCR
- Comparing an ROC Curve Using the caret Package
- Measuring Performance Differences between Models with the caret Package
-
Ensemble Learning
- Classifying Data with the Bagging Method
- Performing Cross Validation with the Bagging Method
- Classifying Data with the Boosting Method
- Performing Cross Validation with the Boosting Method
- Classifying Data with Gradient Boosting
- Calculating the Margins of a Classifier
- Calculating the Error Evolution of the Ensemble Method
- Classifying Data with Random Forest
- Estimating the Prediction Errors of Different Classifiers
-
Clustering
- Clustering Data with Hierarchical Clustering
- Cutting Trees into Clusters
- Clustering Data with the k-Means Method
- Drawing a Bivariate Cluster Plot
- Comparing Clustering Methods
- Extracting Silhouette Information from Clustering
- Obtaining the Optimum Number of Clusters for k-Means
- Clustering Data with the Density-Based Method
- Clustering Data with the Model-Based Method
- Visualizing a Dissimilarity Matrix
- Validating Clusters Externally
-
Association Analysis and Sequence Mining
- Transforming Data into Transactions
- Displaying Transactions and Associations
- Mining Associations with the Apriori Rule
- Pruning Redundant Rules
- Visualizing Association Rules
- Mining Frequent Itemsets with Eclat
- Creating Transactions with Temporal Information
- Mining Frequent Sequential Patterns with cSPADE
-
Dimension Reduction
- Performing Feature Selection with FSelector
- Performing Dimension Reduction with PCA
- Determining the Number of Principal Components Using the Scree Test
- Determining the Number of Principal Components Using the Kaiser Method
- Visualizing Multivariate Data Using biplot
- Performing Dimension Reduction with MDS
- Reducing Dimensions with SVD
- Compressing Images with SVD
- Performing Nonlinear Dimension Reduction with ISOMAP
- Performing Nonlinear Dimension Reduction with Local Linear Embedding
-
Big Data Analysis with R and Hadoop
- Preparing the RHadoop Environment
- Installing rmr2
- Installing rhdfs
- Operating HDFS with rhdfs
- Implementing a Word Count Problem with RHadoop
- Comparing the Performance between an R MapReduce Program and a Standard R Program
- Testing and Debugging the rmr2 Program
- Installing plyrmr
- Manipulating Data with plyrmr
- Conducting Machine Learning with RHadoop
- Configuring RHadoop Clusters on Amazon EMR
R is a statistical programming language that provides impressive tools to analyze data and create high-level graphics. This video course will take you from the very basics of R to creating insightful machine learning models with R. You will start by setting up the environment and then perform data ETL in R.
The data exploration examples demonstrate how powerful data visualization and machine learning are in discovering hidden relationships. You will then dive into important machine learning topics, including data classification, regression, clustering, association rule mining, and dimensionality reduction.
Style and Approach
This easy-to-follow guide is full of hands-on examples of data analysis with R. Each topic begins with the core concepts, continues with step-by-step practical examples, and concludes with detailed explanations of each concept used.
- Publication date: November 2016
- Publisher: Packt
- Duration: 8 hours 20 minutes
- ISBN: 9781787282063