R: Recipes for Analysis, Visualization and Machine Learning

Get savvy with the R language and build projects aimed at analysis, visualization, and machine learning

Viswa Viswanathan et al.

4 customer reviews

Mapt Subscription: FREE, then $29.99/m after trial
eBook: $50.40 (RRP $71.99, save 29%)
Print + eBook: $89.99 (RRP $89.99)
What do I get with a Mapt Pro subscription?
  • Unlimited access to all Packt’s 5,000+ eBooks and Videos
  • Early Access content, Progress Tracking, and Assessments
  • 1 Free eBook or Video to download and keep every month after trial
What do I get with an eBook?
  • Download this book in EPUB, PDF, MOBI formats
  • DRM FREE - read and interact with your content when you want, where you want, and how you want
  • Access this title in the Mapt reader
What do I get with Print & eBook?
  • Get a paperback copy of the book delivered to you
  • Download this book in EPUB, PDF, MOBI formats
  • DRM FREE - read and interact with your content when you want, where you want, and how you want
  • Access this title in the Mapt reader
What do I get with a Video?
  • Download this Video course in MP4 format
  • DRM FREE - read and interact with your content when you want, where you want, and how you want
  • Access this title in the Mapt reader

Book Details

ISBN-13: 9781787289598
Paperback: 976 pages

Book Description

The R language is a powerful, open-source, functional programming language. At its core, R is a statistical programming language that provides impressive tools for analyzing data and creating high-level graphics. This Learning Path is chock-full of recipes. Literally! It aims to excite you with awesome projects focused on analysis, visualization, and machine learning. We’ll start off with data analysis, which shows you ways to use R to generate professional analysis reports. We’ll then move on to visualizing our data, which gives you the guidance you need to get comfortable with data visualization in R. Finally, we’ll move into the world of machine learning, which introduces you to data classification, regression, clustering, association rule mining, and dimension reduction.

This Learning Path combines some of the best that Packt has to offer in one complete, curated package. It includes content from the following Packt products:

Table of Contents

Chapter 1: A Simple Guide to R
Installing packages and getting help in R
Data types in R
Special values in R
Matrices in R
Editing a matrix in R
Data frames in R
Editing a data frame in R
Importing data in R
Exporting data in R
Writing a function in R
Writing if else statements in R
Basic loops in R
Nested loops in R
The apply, lapply, sapply, and tapply functions
Using par to beautify a plot in R
Saving plots
Chapter 2: Practical Machine Learning with R
Introduction
Downloading and installing R
Downloading and installing RStudio
Installing and loading packages
Reading and writing data
Using R to manipulate data
Applying basic statistics
Visualizing data
Getting a dataset for machine learning
Chapter 3: Acquire and Prepare the Ingredients – Your Data
Introduction
Reading data from CSV files
Reading XML data
Reading JSON data
Reading data from fixed-width formatted files
Reading data from R files and R libraries
Removing cases with missing values
Replacing missing values with the mean
Removing duplicate cases
Rescaling a variable to [0,1]
Normalizing or standardizing data in a data frame
Binning numerical data
Creating dummies for categorical variables
Chapter 4: What's in There? – Exploratory Data Analysis
Introduction
Creating standard data summaries
Extracting a subset of a dataset
Splitting a dataset
Creating random data partitions
Generating standard plots such as histograms, boxplots, and scatterplots
Generating multiple plots on a grid
Selecting a graphics device
Creating plots with the lattice package
Creating plots with the ggplot2 package
Creating charts that facilitate comparisons
Creating charts that help visualize a possible causality
Creating multivariate plots
Chapter 5: Where Does It Belong? – Classification
Introduction
Generating error/classification-confusion matrices
Generating ROC charts
Building, plotting, and evaluating – classification trees
Using random forest models for classification
Classifying using Support Vector Machine
Classifying using the Naïve Bayes approach
Classifying using the KNN approach
Using neural networks for classification
Classifying using linear discriminant function analysis
Classifying using logistic regression
Using AdaBoost to combine classification tree models
Chapter 6: Give Me a Number – Regression
Introduction
Computing the root mean squared error
Building KNN models for regression
Performing linear regression
Performing variable selection in linear regression
Building regression trees
Building random forest models for regression
Using neural networks for regression
Performing k-fold cross-validation
Performing leave-one-out-cross-validation to limit overfitting
Chapter 7: Can You Simplify That? – Data Reduction Techniques
Introduction
Performing cluster analysis using K-means clustering
Performing cluster analysis using hierarchical clustering
Reducing dimensionality with principal component analysis
Chapter 8: Lessons from History – Time Series Analysis
Introduction
Creating and examining date objects
Operating on date objects
Performing preliminary analyses on time series data
Using time series objects
Decomposing time series
Filtering time series data
Smoothing and forecasting using the Holt-Winters method
Building an automated ARIMA model
Chapter 9: It's All About Your Connections – Social Network Analysis
Introduction
Downloading social network data using public APIs
Creating adjacency matrices and edge lists
Plotting social network data
Computing important network metrics
Chapter 10: Put Your Best Foot Forward – Document and Present Your Analysis
Introduction
Generating reports of your data analysis with R Markdown and knitr
Creating interactive web applications with shiny
Creating PDF presentations of your analysis with R Presentation
Chapter 11: Work Smarter, Not Harder – Efficient and Elegant R Code
Introduction
Exploiting vectorized operations
Processing entire rows or columns using the apply function
Applying a function to all elements of a collection with lapply and sapply
Applying functions to subsets of a vector
Using the split-apply-combine strategy with plyr
Slicing, dicing, and combining data with data tables
Chapter 12: Where in the World? – Geospatial Analysis
Introduction
Downloading and plotting a Google map of an area
Overlaying data on the downloaded Google map
Importing ESRI shape files into R
Using the sp package to plot geographic data
Getting maps from the maps package
Creating spatial data frames from regular data frames containing spatial and other data
Creating spatial data frames by combining regular data frames with spatial objects
Adding variables to an existing spatial data frame
Chapter 13: Playing Nice – Connecting to Other Systems
Introduction
Using Java objects in R
Using JRI to call R functions from Java
Using Rserve to call R functions from Java
Executing R scripts from Java
Using the xlsx package to connect to Excel
Reading data from relational databases – MySQL
Reading data from NoSQL databases – MongoDB
Chapter 14: Basic and Interactive Plots
Introduction
Introducing a scatter plot
Scatter plots with texts, labels, and lines
Connecting points in a scatter plot
Generating an interactive scatter plot
A simple bar plot
An interactive bar plot
A simple line plot
Line plot to tell an effective story
Generating an interactive Gantt/timeline chart in R
Merging histograms
Making an interactive bubble plot
Constructing a waterfall plot in R
Chapter 15: Heat Maps and Dendrograms
Introduction
Constructing a simple dendrogram
Creating dendrograms with colors and labels
Creating a heat map
Generating a heat map with customized colors
Generating an integrated dendrogram and a heat map
Creating a three-dimensional heat map and a stereo map
Constructing a tree map in R
Chapter 16: Maps
Introduction
Introducing regional maps
Introducing choropleth maps
A guide to contour maps
Constructing maps with bubbles
Integrating text with maps
Introducing shapefiles
Creating cartograms
Chapter 17: The Pie Chart and Its Alternatives
Introduction
Generating a simple pie chart
Constructing pie charts with labels
Creating donut plots and interactive plots
Generating a slope chart
Constructing a fan plot
Chapter 18: Adding the Third Dimension
Introduction
Constructing a 3D scatter plot
Generating a 3D scatter plot with text
A simple 3D pie chart
A simple 3D histogram
Generating a 3D contour plot
Integrating a 3D contour and a surface plot
Animating a 3D surface plot
Chapter 19: Data in Higher Dimensions
Introduction
Constructing a sunflower plot
Creating a hexbin plot
Generating interactive calendar maps
Creating Chernoff faces in R
Constructing a coxcomb plot in R
Constructing network plots
Constructing a radial plot
Generating a very basic pyramid plot
Chapter 20: Visualizing Continuous Data
Introduction
Generating a candlestick plot
Generating interactive candlestick plots
Generating a decomposed time series
Plotting a regression line
Constructing a box and whiskers plot
Generating a violin plot
Generating a quantile-quantile plot (QQ plot)
Generating a density plot
Generating a simple correlation plot
Chapter 21: Visualizing Text and XKCD-style Plots
Introduction
Generating a word cloud
Constructing a word cloud from a document
Generating a comparison cloud
Constructing a correlation plot and a phrase tree
Generating plots with custom fonts
Generating an XKCD-style plot
Chapter 22: Creating Applications in R
Introduction
Creating animated plots in R
Creating a presentation in R
A basic introduction to API and XML
Constructing a bar plot using XML in R
Creating a very simple shiny app in R
Chapter 23: Data Exploration with RMS Titanic
Introduction
Reading a Titanic dataset from a CSV file
Converting types on character variables
Detecting missing values
Imputing missing values
Exploring and visualizing data
Predicting passenger survival with a decision tree
Validating the power of prediction with a confusion matrix
Assessing performance with the ROC curve
Chapter 24: R and Statistics
Introduction
Understanding data sampling in R
Operating a probability distribution in R
Working with univariate descriptive statistics in R
Performing correlations and multivariate analysis
Operating linear regression and multivariate analysis
Conducting an exact binomial test
Performing Student's t-test
Performing the Kolmogorov-Smirnov test
Understanding the Wilcoxon Rank Sum and Signed Rank test
Working with Pearson's Chi-squared test
Conducting a one-way ANOVA
Performing a two-way ANOVA
Chapter 25: Understanding Regression Analysis
Introduction
Fitting a linear regression model with lm
Summarizing linear model fits
Using linear regression to predict unknown values
Generating a diagnostic plot of a fitted model
Fitting a polynomial regression model with lm
Fitting a robust linear regression model with rlm
Studying a case of linear regression on SLID data
Applying the Gaussian model for generalized linear regression
Applying the Poisson model for generalized linear regression
Applying the Binomial model for generalized linear regression
Fitting a generalized additive model to data
Visualizing a generalized additive model
Diagnosing a generalized additive model
Chapter 26: Classification (I) – Tree, Lazy, and Probabilistic
Introduction
Preparing the training and testing datasets
Building a classification model with recursive partitioning trees
Visualizing a recursive partitioning tree
Measuring the prediction performance of a recursive partitioning tree
Pruning a recursive partitioning tree
Building a classification model with a conditional inference tree
Visualizing a conditional inference tree
Measuring the prediction performance of a conditional inference tree
Classifying data with the k-nearest neighbor classifier
Classifying data with logistic regression
Classifying data with the Naïve Bayes classifier
Chapter 27: Classification (II) – Neural Network and SVM
Introduction
Classifying data with a support vector machine
Choosing the cost of a support vector machine
Visualizing an SVM fit
Predicting labels based on a model trained by a support vector machine
Tuning a support vector machine
Training a neural network with neuralnet
Visualizing a neural network trained by neuralnet
Predicting labels based on a model trained by neuralnet
Training a neural network with nnet
Predicting labels based on a model trained by nnet
Chapter 28: Model Evaluation
Introduction
Estimating model performance with k-fold cross-validation
Performing cross-validation with the e1071 package
Performing cross-validation with the caret package
Ranking the variable importance with the caret package
Ranking the variable importance with the rminer package
Finding highly correlated features with the caret package
Selecting features using the caret package
Measuring the performance of the regression model
Measuring prediction performance with a confusion matrix
Measuring prediction performance using ROCR
Comparing an ROC curve using the caret package
Measuring performance differences between models with the caret package
Chapter 29: Ensemble Learning
Introduction
Classifying data with the bagging method
Performing cross-validation with the bagging method
Classifying data with the boosting method
Performing cross-validation with the boosting method
Classifying data with gradient boosting
Calculating the margins of a classifier
Calculating the error evolution of the ensemble method
Classifying data with random forest
Estimating the prediction errors of different classifiers
Chapter 30: Clustering
Introduction
Clustering data with hierarchical clustering
Cutting trees into clusters
Clustering data with the k-means method
Drawing a bivariate cluster plot
Comparing clustering methods
Extracting silhouette information from clustering
Obtaining the optimum number of clusters for k-means
Clustering data with the density-based method
Clustering data with the model-based method
Visualizing a dissimilarity matrix
Validating clusters externally
Chapter 31: Association Analysis and Sequence Mining
Introduction
Transforming data into transactions
Displaying transactions and associations
Mining associations with the Apriori rule
Pruning redundant rules
Visualizing association rules
Mining frequent itemsets with Eclat
Creating transactions with temporal information
Mining frequent sequential patterns with cSPADE
Chapter 32: Dimension Reduction
Introduction
Performing feature selection with FSelector
Performing dimension reduction with PCA
Determining the number of principal components using the scree test
Determining the number of principal components using the Kaiser method
Visualizing multivariate data using biplot
Performing dimension reduction with MDS
Reducing dimensions with SVD
Compressing images with SVD
Performing nonlinear dimension reduction with ISOMAP
Performing nonlinear dimension reduction with Local Linear Embedding
Chapter 33: Big Data Analysis (R and Hadoop)
Introduction
Preparing the RHadoop environment
Installing rmr2
Installing rhdfs
Operating HDFS with rhdfs
Implementing a word count problem with RHadoop
Comparing the performance between an R MapReduce program and a standard R program
Testing and debugging the rmr2 program
Installing plyrmr
Manipulating data with plyrmr
Conducting machine learning with RHadoop
Configuring RHadoop clusters on Amazon EMR

What You Will Learn

  • Get data into your R environment and prepare it for analysis
  • Perform exploratory data analyses and generate meaningful visualizations of the data
  • Generate various plots in R using the basic R plotting techniques
  • Create presentations and learn the basics of creating apps in R for your audience
  • Create and inspect the transaction dataset, performing association analysis with the Apriori algorithm
  • Visualize associations in various graph formats and find frequent itemsets using the Eclat algorithm
  • Build, tune, and evaluate predictive models with different machine learning packages
  • Incorporate R and Hadoop to solve machine learning problems on big data
