R Data Analysis Cookbook - Second Edition

Over 80 recipes to help you breeze through your data analysis projects using R

Kuntal Ganguly

Book Details

ISBN 13: 9781787124479
Paperback: 560 pages

Book Description

Data analytics with R has become an important focus for organizations of all kinds. R lets even those with only an intuitive grasp of the underlying concepts, and no deep mathematical background, carry out powerful and detailed examinations of their data.

This book shows you how to put your data analysis skills in R to practical use, with recipes covering both basic and advanced data analysis tasks. From acquiring your data and preparing it for analysis to more complex techniques, the book shows how to implement each technique effectively. You will also visualize your data using popular R packages such as ggplot2 and draw insights from it. Starting with basic concepts such as handling your data and creating basic plots, you will progress to more advanced techniques such as performing cluster analysis and generating effective analysis reports and visualizations. Throughout the book, you will learn about the common problems and obstacles you might encounter while implementing each technique in R, along with ways to overcome them.

By the end of this book, you will have the knowledge you need to become an expert in data analysis with R and to put your skills to the test in real-world scenarios.

Table of Contents

Chapter 1: Acquire and Prepare the Ingredients - Your Data
Introduction
Working with data
Reading data from CSV files
Reading XML data
Reading JSON data
Reading data from fixed-width formatted files
Reading data from R files and R libraries
Removing cases with missing values
Replacing missing values with the mean
Removing duplicate cases
Rescaling a variable to specified min-max range
Normalizing or standardizing data in a data frame
Binning numerical data
Creating dummies for categorical variables
Handling missing data
Correcting data
Imputing data
Detecting outliers
Chapter 2: What's in There - Exploratory Data Analysis
Introduction
Creating standard data summaries
Extracting a subset of a dataset
Splitting a dataset
Creating random data partitions
Generating standard plots, such as histograms, boxplots, and scatterplots
Generating multiple plots on a grid
Creating plots with the lattice package
Creating charts that facilitate comparisons
Creating charts that help to visualize possible causality
Chapter 3: Where Does It Belong? Classification
Introduction
Generating error/classification confusion matrices
Principal Component Analysis
Generating receiver operating characteristic charts
Building, plotting, and evaluating with classification trees
Using random forest models for classification
Classifying using the support vector machine approach
Classifying using the Naive Bayes approach
Classifying using the KNN approach
Using neural networks for classification
Classifying using linear discriminant function analysis
Classifying using logistic regression
Text classification for sentiment analysis
Chapter 4: Give Me a Number - Regression
Introduction
Computing the root-mean-square error
Building KNN models for regression
Performing linear regression
Performing variable selection in linear regression
Building regression trees
Building random forest models for regression
Using neural networks for regression
Performing k-fold cross-validation
Performing leave-one-out cross-validation to limit overfitting
Chapter 5: Can you Simplify That? Data Reduction Techniques
Introduction
Performing cluster analysis using hierarchical clustering
Performing cluster analysis using partitioning clustering
Image segmentation using mini-batch K-means
Partitioning around medoids
Clustering large application
Performing cluster validation
Performing advanced clustering
Model-based clustering with the EM algorithm
Reducing dimensionality with principal component analysis
Chapter 6: Lessons from History - Time Series Analysis
Introduction
Exploring finance datasets
Creating and examining date objects
Operating on date objects
Performing preliminary analyses on time series data
Using time series objects
Decomposing time series
Filtering time series data
Smoothing and forecasting using the Holt-Winters method
Building an automated ARIMA model
Chapter 7: How does it look? - Advanced data visualization
Introduction
Creating scatter plots
Creating line graphs
Creating bar graphs
Making distributions plots
Creating mosaic graphs
Making treemaps
Plotting a correlations matrix
Creating heatmaps
Plotting network graphs
Labeling and legends
Coloring and themes
Creating multivariate plots
Creating 3D graphs and animation
Selecting a graphics device
Chapter 8: This may also interest you - Building Recommendations
Introduction
Building collaborative filtering systems
Performing content-based systems
Building hybrid systems
Performing similarity measures
Application of ML algorithms - image recognition system
Evaluating models and optimization
A practical example - fraud detection system
Chapter 9: It's All About Your Connections - Social Network Analysis
Introduction
Downloading social network data using public APIs
Creating adjacency matrices and edge lists
Plotting social network data
Computing important network metrics
Cluster analysis
Force layout
YiFan Hu layout
Chapter 10: Put Your Best Foot Forward - Document and Present Your Analysis
Introduction
Generating reports of your data analysis with R Markdown and knitr
Creating interactive web applications with shiny
Creating PDF presentations of your analysis with R presentation
Generating dynamic reports
Chapter 11: Work Smarter, Not Harder - Efficient and Elegant R Code
Introduction
Exploiting vectorized operations
Processing entire rows or columns using the apply function
Applying a function to all elements of a collection with lapply and sapply
Applying functions to subsets of a vector
Using the split-apply-combine strategy with plyr
Slicing, dicing, and combining data with data tables
Chapter 12: Where in the World? Geospatial Analysis
Introduction
Downloading and plotting a Google map of an area
Overlaying data on the downloaded Google map
Importing ESRI shape files to R
Using the sp package to plot geographic data
Getting maps from the maps package
Creating spatial data frames from regular data frames containing spatial and other data
Creating spatial data frames by combining regular data frames with spatial objects
Adding variables to an existing spatial data frame
Spatial data analysis with R and QGIS
Chapter 13: Playing Nice - Connecting to Other Systems
Introduction
Using Java objects in R
Using JRI to call R functions from Java
Using Rserve to call R functions from Java
Executing R scripts from Java
Using the xlsx package to connect to Excel
Reading data from relational databases - MySQL
Reading data from NoSQL databases - MongoDB
Working with in-memory data processing with Apache Spark

What You Will Learn

  • Acquire, format, and visualize your data using R
  • Use R to perform exploratory data analysis
  • Get introduced to machine learning algorithms such as classification and regression
  • Get started with social network analysis
  • Generate dynamic reports with Shiny
  • Get started with geospatial analysis
  • Handle large data with R using Spark and MongoDB
  • Build recommendation systems: collaborative filtering, content-based, and hybrid
  • Work through real-world dataset examples: fraud detection and image recognition
