Statistical Analysis with R


  • An easy introduction for people who are new to R, with plenty of strong examples for you to work through
  • This book will take you on a journey to learn R as the strategist for an ancient Chinese kingdom!
  • A step-by-step guide to understanding R, its benefits, and how to use it to maximize the impact of your data analysis
  • A practical guide to conducting and communicating your data analysis with R in the most effective manner

 

Book Details

Language : English
Paperback : 300 pages [ 235mm x 191mm ]
Release Date : October 2010
ISBN : 1849512086
ISBN 13 : 9781849512084
Author(s) : John M. Quick
Topics and Technologies : All Books, Big Data and Business Intelligence, Data, Architecture & Analysis, Beginner's Guides, Open Source


Table of Contents

Preface
Chapter 1: Uncovering the Strategist's Data Analysis Tool
Chapter 2: Preparing R for Battle
Chapter 3: Exploring the Mysterious Data Analysis Tool
Chapter 4: Collecting and Organizing Information
Chapter 5: Assessing the Situation
Chapter 6: Planning the Attack
Chapter 7: Organizing the Battle Plans
Chapter 8: Briefing the Emperor
Chapter 9: Briefing the Generals
Chapter 10: Becoming a Master Strategist
Appendix: Pop Quiz Answer Key
Index
  • Chapter 1: Uncovering the Strategist's Data Analysis Tool
    • What is R?
    • What are the benefits of using R?
    • Why should I use R?
    • Why should I read this book?
    • What topics are covered in this book?
      • Chapter 2—Preparing R for Battle
      • Chapter 3—Exploring the Mysterious Data Analysis Tool
      • Chapter 4—Collecting and Organizing Information
      • Chapter 5—Assessing the Situation
      • Chapter 6—Planning the Attack
      • Chapter 7—Organizing the Battle Plans
      • Chapter 8—Briefing the Emperor
      • Chapter 9—Briefing the Generals
      • Chapter 10—Becoming a Master Strategist
    • Summary
  • Chapter 2: Preparing R for Battle
    • Time for action – downloading and installing R
      • Example: R 2.11.1 Mac OS X 10.5+ installation wizard demonstration
    • Time for action – issuing your first R command
    • Time for action – setting your R working directory
    • Summary
  • Chapter 4: Collecting and Organizing Information
    • Time for action – importing external data
      • read.csv(file)
      • comma-separated values (csv) files
    • Time for action – creating and calling variables
    • Time for action – accessing data within variables
      • variable$column notation
      • attach(variable) function
      • variable[row, column] notation
    • Time for action – manipulating variable data
      • Performing a calculation on an entire dataset
      • Performing a calculation on a row, column, or cell
      • Using variable data in function arguments
      • Saving a variable calculation into a new variable
    • Time for action – managing the R workspace
      • Listing the contents of the R workspace
      • Saving the contents of the R workspace
      • Loading the contents of the R workspace
      • Quitting R
      • Distinguishing between the R console and workspace
      • Saving the R console
    • Summary
  • Chapter 5: Assessing the Situation
    • Time for action – making an initial inference from our data
    • Examining our data
    • Time for action – creating a subset from a large dataset
      • Multi-argument functions
      • Variable-argument functions
      • Equivalency operators
      • subset(data, ...)
    • Time for action – deriving summary statistics
      • Means
      • Standard deviations
      • Ranges
      • summary(object)
      • Why use summary statistics?
    • Time for action – quantifying categorical variables
      • as.numeric(data)
      • Overwriting variables
    • Time for action – correlating variables
      • Interpreting correlations
      • cor(x, y)
      • cor(data)
      • NA values
    • Regression
    • Time for action – modelling with simple linear regression
      • lm(formula, data)
      • Linear model output
      • Linear model summary
      • Interpreting a linear regression model
    • Time for action – modelling with multiple linear regression
      • Interpreting the summary output
      • Explaining model differences
    • Time for action – modelling interactions
      • Interpreting interaction variables
    • Time for action – comparing and choosing models
      • Interpreting the model summaries
      • Interpreting the ANOVA results
      • anova(object, ...)
    • Summary
  • Chapter 6: Planning the Attack
    • Review of models
      • Head to head
      • Surround
      • Ambush
      • Fire
    • Predicting outcomes using regression models
      • Rating
      • Successfully executed
      • Number of Wei soldiers
      • Duration of battle
      • A word about assumptions
    • Time for action – calculating outcomes from regression models
    • Time for action – creating custom functions
      • function()
      • Extended lines
    • Time for action – creating resource-focused custom functions
    • Logistical considerations
      • Gold
      • Provisions
      • Equipment
      • Soldiers
      • Resource and cost summary
      • Resource map
    • Time for action – incorporating resource constraints into predictions
      • Gold cost function explanation
    • Assessing viability
    • Time for action – assessing the viability of potential strategies
      • Remember your assumptions
    • Summary
  • Chapter 7: Organizing the Battle Plans
    • Retracing and refining a complete analysis
    • Time for action – first steps
    • Time for action – data setup
      • read.table(...)
    • Time for action – data exploration
    • Time for action – model development
      • glm(...)
      • AIC(object, ...)
    • Time for action – model deployment
      • coef(object)
    • Time for action – last steps
    • The common steps to all R analyses
      • Step 1: Set your working directory
        • Comment your work
      • Step 2: Import your data (or load an existing workspace)
      • Step 3: Explore your data
      • Step 4: Conduct your analysis
      • Step 5: Save your workspace and console files
    • Summary
  • Chapter 8: Briefing the Emperor
    • Charts, graphs, and plots in R
    • Time for action – creating a bar chart
      • barplot(...)
      • Vectors
      • Graphic window
    • Time for action – customizing graphics
      • Graphic customization arguments
        • main, xlab, and ylab
        • xlim and ylim
        • col
      • legend(...)
    • Time for action – creating a scatterplot
      • Single scatterplot
      • Multiple scatterplots
    • Time for action – creating a line chart
      • type
      • Number-colon-number notation
    • Time for action – creating a box plot
      • boxplot(...)
    • Time for action – creating a histogram
      • hist(...)
    • Time for action – creating a pie chart
      • pie(...)
    • Time for action – exporting graphics
    • Summary
  • Chapter 9: Briefing the Generals
    • More charts, graphs, and plots in R
    • Time for action – customizing a bar chart
      • names
      • width and space
      • horiz
      • beside
      • density and angle
      • legend(...) with density, angle, and cex
    • Time for action – customizing a scatterplot
      • pch and cex
      • points(...)
      • legend(...)
      • abline(...)
    • Time for action – customizing a line chart
      • lwd
      • lines(...)
      • legend(...)
    • Time for action – customizing a box plot
      • range
      • axis(...)
    • Time for action – customizing a histogram
      • breaks
      • freq
    • Time for action – customizing a pie chart
      • Custom labels
      • legend(...)
    • Time for action – building a graphic
    • Time for action – building a graphic with multiple visuals
      • par(mfcol)
      • Graphics
        • Horizontal and vertical lines
        • Nested functions
    • Summary
  • Chapter 10: Becoming a Master Strategist
    • R's built-in resources
    • Time for action – using R's help function
      • help(...)
    • Time for action – expanding R with packages
      • Choose a CRAN mirror
      • Install a package
      • Load the package
      • Use the package
    • R's online resources
      • Websites
        • The R Project for Statistical Computing
        • Quick-R
        • R Programming wikibook
        • R Graph Gallery
        • Crantastic!
      • Blogs
        • R bloggers
        • R Tutorial Series
      • Online communities
        • R-help mailing list
        • Other mailing lists
      • Search engines
        • R Seek
        • Google
    • Summary

John M. Quick

John M. Quick is an Educational Technology doctoral student at Arizona State University who is interested in the design, research, and use of educational innovations. Currently, his work focuses on mobile, game-based, and global learning, interactive mixed-reality systems, and innovation adoption. John's blog, which provides articles, tutorials, reviews, perspectives, and news relevant to technology and education, is available at http://www.johnmquick.com. In his spare time, John enjoys photography, nature, and travel.


Errata

- 9 submitted: last submission 08 Jan 2014

Errata type: Code | Page number: 70 | Errata date: 01 Nov 10

The code for standard deviation of duration in past head to head conflicts reads as:
sdDurationHeadToHead <- mean(subsetHeadToHead$DurationInDays)

The correct code should be:
sdDurationHeadToHead <- sd(subsetHeadToHead$DurationInDays)
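
The difference between the two calls can be seen in a minimal sketch. The vector below is made up for illustration and is not the book's dataset; mean() reports the centre of the values, while sd() reports their spread.

```r
# Hypothetical battle durations in days (illustrative values, not the book's data)
durationInDays <- c(2, 4, 4, 4, 5, 5, 7, 9)

# mean() returns the arithmetic mean of the values
meanDuration <- mean(durationInDays)

# sd() returns the sample standard deviation (n - 1 denominator)
sdDuration <- sd(durationInDays)

print(meanDuration)
print(sdDuration)
```

Storing the result of mean() in a variable named sd... runs without any warning, which is why this kind of slip is easy to miss.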

 

Errata type: Grammar | Page number: 151 | Errata date: 09 Nov 10

The last line of the Introduction reads as:
"It is important to convey your plans with clarity, because the emperor has the power accept or reject your strategy."

Actually it should read as:
"It is important to convey your plans with clarity, because the emperor has the power to accept or reject your strategy."

 

Errata type: Others | Page number: 50 | Errata date: 16 Nov 10

The line that reads:
to access the city column in soldiersByCity, we could use soldiersByCity[,1], this tells R to retrieve every row within the City column

Should actually say to use soldiersByCity[,2] in order to retrieve the City column
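
As a rough illustration of variable[row, column] notation, the sketch below builds a small stand-in for soldiersByCity. The column layout and all values are assumptions made for this example (City is placed second to match the erratum); the book's actual data file is not reproduced here.

```r
# Hypothetical stand-in for soldiersByCity; column order and values are assumed
soldiersByCity <- data.frame(
  Kingdom = c("Shu", "Shu", "Wei"),
  City = c("Chengdu", "Hanzhong", "Luoyang"),
  Soldiers = c(1000, 2000, 3000)
)

# Leaving the row position empty retrieves every row of the second column
cityByPosition <- soldiersByCity[, 2]

# The same column selected by name with variable$column notation
cityByName <- soldiersByCity$City

print(identical(cityByPosition, cityByName))
```

Selecting by name is less fragile than selecting by position, since it keeps working if the column order of the imported file changes.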

 

Errata type: Others | Page number: 76 | Errata date: 17 Nov 10

Incorrect variable name in:
So, after step 3, our numericSuccessfullyExecutedHeadToHead variable

Should be:
So, after step 3, our numericExecutionHeadToHead variable

 

Errata type: Others | Page number: 131 | Errata date: 21 Nov 10

Incorrect mean value in:
"The rating of the Shu army's performance in attacks has ranged from 10 to 100, with a mean of 45"

Should be:
"The rating of the Shu army's performance in attacks has ranged from 10 to 100, with a mean of 52"

 

Errata type: code | Page number: 54

mean(soldiersByCity$Soldiers) returns the mean of the entire Soldiers column not just the Shu soldiers.

As for finding the mean for just the Shu soldiers, the subset() function, which is discussed in Chapter 5, can be used in tandem with the mean() function. For example, first subset the data to include only the Shu cities, then calculate the mean of the Soldiers column:
> shuData <- subset(soldiersByCity, Kingdom == "Shu")
> mean(shuData$Soldiers)
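
A self-contained sketch of the contrast described above follows. The data frame is hypothetical: the Kingdom column name is taken from the erratum's own example, and the values are invented so that the two means visibly differ.

```r
# Hypothetical stand-in for soldiersByCity (illustrative values only)
soldiersByCity <- data.frame(
  Kingdom = c("Shu", "Shu", "Wei"),
  City = c("Chengdu", "Hanzhong", "Luoyang"),
  Soldiers = c(1000, 2000, 6000)
)

# Mean of the entire Soldiers column, across all kingdoms
meanAll <- mean(soldiersByCity$Soldiers)

# Keep only the Shu rows, then take the mean of their Soldiers values
shuData <- subset(soldiersByCity, Kingdom == "Shu")
meanShu <- mean(shuData$Soldiers)

print(meanAll)
print(meanShu)
```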

Errata type: code | Page number: 70

> sdDurationHeadToHead
[1] 77.93333

should be

> sdDurationHeadToHead
[1] 29.86398

R Beginner's Guide, by John M. Quick
Answers to Pop Quiz sections by chapter
Chapter 2
Setting your R working directory
1. d
2. b
3. d
Chapter 3
Solving the First 4x4 Magic Square
1. b
2. d
3. c
Chapter 4
Importing external data
1. b
Creating and calling variables
1. d
2. a
Accessing data within variables
1. d
2. b
3. d
Manipulating variable data
1. Table values (left to right, top to bottom): 10, 20, 30, 40, 50, 60
2. Table values (left to right, top to bottom): 1, 12, 3, 4, 5, 6
3. c
Managing the R workspace
1. c
2. d
3. a
Chapter 5
Creating a subset from a large dataset
1. d
2. a
Deriving summary statistics
1. c
2. b
Quantifying categorical variables
1. d
2. a
Correlating variables
1. b
2. d
Modeling with simple linear regression
1. b
2. a
3. c
Modeling with multiple linear regression
1. a
2. c
Modeling interactions
1. b
2. d
3. a
Comparing and choosing models
1. d
Chapter 6
Creating custom functions
1. c
2. b
3. c
Incorporating resource constraints into predictions
1. d
Assessing the viability of potential strategies
1. d
Chapter 7
Data setup
1. c
Data exploration
1. c
Model development
1. d
2. d
Model deployment
1. c
The common steps to all R analyses
1. d
Chapter 8
Creating a bar chart
1. a
2. d
Customizing graphics
1. d
2. d
Creating a scatterplot
1. a
2. c
Creating a line chart
1. d
2. a
Creating a box plot
1. a
2. c
Creating a histogram
1. d
Creating a pie chart
1. b
Exporting graphics
1. d
Chapter 9
Customizing a bar chart
1. b
2. d
3. b
Customizing a scatterplot
1. a
2. c
Customizing a line chart
1. d
2. c
Customizing a box plot
1. b
2. c
Customizing a histogram
1. b
2. c
Customizing a pie chart
1. a
2. c
Building a graphic
1. a
2. d
Building a graphic with multiple visuals
1. c
2. c
3. c
Chapter 10
Using R's help function
1. c
2. b
Expanding R with packages
1. b
2. a
3. b

Errata type: Technical | Page number: 78

In bullet number 5:

corHeadToHead <- cor(subsetHeadToHead) no longer works, because recent versions of R do not allow cor(data) where data contains non-numeric elements.

The solution is to pass only the numeric columns:

cor(subsetHeadToHead[sapply(subsetHeadToHead, is.numeric)])
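
The workaround can be tried on a small made-up data frame. In the sketch below, the Method and Rating columns and all values are assumptions for illustration; only the DurationInDays name comes from the errata above.

```r
# Hypothetical stand-in for subsetHeadToHead with one non-numeric column
subsetHeadToHead <- data.frame(
  Method = c("headToHead", "headToHead", "headToHead", "headToHead"),
  DurationInDays = c(10, 20, 30, 40),
  Rating = c(20, 40, 60, 80)
)

# sapply() applies is.numeric to each column and returns a named logical vector
numericColumns <- sapply(subsetHeadToHead, is.numeric)

# Indexing with that vector keeps only the numeric columns before correlating
corHeadToHead <- cor(subsetHeadToHead[numericColumns])

print(corHeadToHead)
```

Filtering the columns first, rather than converting text to numbers, leaves the original data frame untouched and keeps categorical columns out of the correlation matrix.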

What you will learn from this book

  • Conduct superior data analysis in R
  • Organize and communicate data analysis
  • Generate, customize, and export detailed charts, plots, and graphs
  • Build your own custom data visualizations
  • Program in the R language
  • Create your own custom functions
  • Extend the functionality of R via external packages
  • Manage the R workspace and console
  • Import external data into R
  • Manipulate data using variables
  • Execute a wide array of multi-argument and variable-argument functions
  • Develop and employ predictive regression models
  • Assess the practical and statistical significance of predictions
  • Understand R, its benefits, and how to use it to maximize the impact of your data analyses

In Detail

R is a data analysis tool, graphical environment, and programming language. Even if you have no prior experience with programming or statistical software, this book will help you quickly become a knowledgeable user of R. Now is the time to take control of your data and start producing superior statistical analysis with R.

This book will take you on a journey as the strategist for an ancient Chinese kingdom. Along the way, you will learn how to use R to arrive at practical solutions and how to effectively communicate your results. Ultimately, the fate of the kingdom depends on your ability to make informed, data-driven decisions with R.

You have unexpectedly been thrust into the role of lead strategist for the kingdom. After you install your predecessor's mysterious data analysis tool, you will begin to explore its fundamental elements. Next, you will use R to import and organize your data. Then, you will use functions and statistical analysis to arrive at potential courses of action. Subsequently, you will design your own functions to assess the practical impacts of your predictions. Lastly, you will focus on communicating your results through the use of charts, plots, graphs, and custom built visualizations. The fate of the kingdom is in your hands. Your rapid development as a master R strategist is the key to future success.

A step-by-step guide to organizing, analyzing, and visualizing your data in R.

Approach

This is a practical, step-by-step guide that will help you quickly become proficient in data analysis using R. The book is packed with clear examples, screenshots, and code to help you carry out your data analysis without hurdles.

Who this book is for

If you are a data analyst, business or information technology professional, student, educator, researcher, or anyone else who wants to learn to analyze data effectively, then this book is for you.

No prior experience with R is necessary. Knowledge of other programming languages, software packages, or statistics may be helpful, but is not required.
