# Statistical Analysis with R

- An easy introduction for people who are new to R, with plenty of strong examples for you to work through
- This book will take you on a journey to learn R as the strategist for an ancient Chinese kingdom!
- A step by step guide to understand R, its benefits, and how to use it to maximize the impact of your data analysis
- A practical guide to conduct and communicate your data analysis with R in the most effective manner

### Book Details

**Language:** English

**Paperback:** 300 pages [235mm x 191mm]

**Release Date:** October 2010

**ISBN:** 1849512086

**ISBN 13:** 9781849512084

**Author(s):** John M. Quick

**Topics and Technologies:** All Books, Big Data and Business Intelligence, Data, Architecture & Analysis, Beginner's Guides, Open Source

## Table of Contents

Preface

Chapter 1: Uncovering the Strategist's Data Analysis Tool

Chapter 2: Preparing R for Battle

Chapter 3: Exploring the Mysterious Data Analysis Tool

Chapter 4: Collecting and Organizing Information

Chapter 5: Assessing the Situation

Chapter 6: Planning the Attack

Chapter 7: Organizing the Battle Plans

Chapter 8: Briefing the Emperor

Chapter 9: Briefing the Generals

Chapter 10: Becoming a Master Strategist

Appendix: Pop Quiz Answer Key

Index

- Chapter 1: Uncovering the Strategist's Data Analysis Tool
- What is R?
- What are the benefits of using R?
- Why should I use R?
- Why should I read this book?
- What topics are covered in this book?
- Chapter 2—Preparing R for Battle
- Chapter 3—Exploring the Mysterious Data Analysis Tool
- Chapter 4—Collecting and Organizing Information
- Chapter 5—Assessing the Situation
- Chapter 6—Planning the Attack
- Chapter 7—Organizing the Battle Plans
- Chapter 8—Briefing the Emperor
- Chapter 9—Briefing the Generals
- Chapter 10—Becoming a Master Strategist

- Summary

- Chapter 2: Preparing R for Battle
- Time for action – downloading and installing R
- Example: R 2.11.1 Mac OS X 10.5+ installation wizard demonstration

- Time for action – issuing your first R command
- Time for action – setting your R working directory
- Summary

- Chapter 3: Exploring the Mysterious Data Analysis Tool
- Deciphering Zhuge Liang's magic square
- Time for action – solving the first 4x4 magic square
- Lines
- Comments
- Calculations
- Output
- Visualizing the R console

- Summary

- Chapter 4: Collecting and Organizing Information
- Time for action – importing external data
- read.csv(file)
- comma-separated values (csv) files

- Time for action – creating and calling variables
- Time for action – accessing data within variables
- variable$column notation
- attach(variable) function
- variable[row, column] notation

- Time for action – manipulating variable data
- Performing a calculation on an entire dataset
- Performing a calculation on a row, column, or cell
- Using variable data in function arguments
- Saving a variable calculation into a new variable

- Time for action – managing the R workspace
- Listing the contents of the R workspace
- Saving the contents of the R workspace
- Loading the contents of the R workspace
- Quitting R
- Distinguishing between the R console and workspace
- Saving the R console

- Summary

- Chapter 5: Assessing the Situation
- Time for action – making an initial inference from our data
- Examining our data
- Time for action – creating a subset from a large dataset
- Multi-argument functions
- Variable-argument functions
- Equivalency operators
- subset(data, ...)

- Time for action – deriving summary statistics
- Means
- Standard deviations
- Ranges
- summary(object)
- Why use summary statistics?

- Time for action – quantifying categorical variables
- as.numeric(data)
- Overwriting variables

- Time for action – correlating variables
- Interpreting correlations
- cor(x, y)
- cor(data)
- NA values

- Regression
- Time for action – modelling with simple linear regression
- lm(formula, data)
- Linear model output
- Linear model summary
- Interpreting a linear regression model

- Time for action – modelling with multiple linear regression
- Interpreting the summary output
- Explaining model differences

- Time for action – modelling interactions
- Interpreting interaction variables

- Time for action – comparing and choosing models
- Interpreting the model summaries
- Interpreting the ANOVA results

- anova(object, ...)

- Summary

- Chapter 6: Planning the Attack
- Review of models
- Head to head
- Surround
- Ambush
- Fire

- Predicting outcomes using regression models
- Rating
- Successfully executed
- Number of Wei soldiers
- Duration of battle
- A word about assumptions

- Time for action – calculating outcomes from regression models
- Time for action – creating custom functions
- function()
- Extended lines

- Time for action – creating resource-focused custom functions
- Logistical considerations
- Gold
- Provisions
- Equipment
- Soldiers
- Resource and cost summary
- Resource map

- Time for action – incorporating resource constraints into predictions
- Gold cost function explanation

- Assessing viability
- Time for action – assessing the viability of potential strategies
- Remember your assumptions

- Summary

- Chapter 7: Organizing the Battle Plans
- Retracing and refining a complete analysis
- Time for action – first steps
- Time for action – data setup
- read.table(...)

- Time for action – data exploration
- Time for action – model development
- glm(...)
- AIC(object, ...)

- Time for action – model deployment
- coef(object)

- Time for action – last steps
- The common steps to all R analyses
- Step 1: Set your working directory
- Comment your work

- Step 2: Import your data (or load an existing workspace)
- Step 3: Explore your data
- Step 4: Conduct your analysis
- Step 5: Save your workspace and console files

- Summary

- Chapter 8: Briefing the Emperor
- Charts, graphs, and plots in R
- Time for action – creating a bar chart
- barplot(...)
- Vectors
- Graphic window

- Time for action – customizing graphics
- Graphic customization arguments
- main, xlab, and ylab
- xlim and ylim
- Col

- legend(...)

- Time for action – creating a scatterplot
- Single scatterplot
- Multiple scatterplots

- Time for action – creating a line chart
- type
- Number-colon-number notation

- Time for action – creating a box plot
- boxplot(...)

- Time for action – creating a histogram
- hist(...)

- Time for action – creating a pie chart
- pie(...)

- Time for action – exporting graphics
- Summary

- Chapter 9: Briefing the Generals
- More charts, graphs, and plots in R
- Time for action – customizing a bar chart
- names
- width and space
- horiz
- beside
- density and angle
- legend(...) with density, angle, and cex

- Time for action – customizing a scatterplot
- pch and cex
- points(...)
- legend(...)
- abline(...)

- Time for action – customizing a line chart
- lwd
- lines(...)
- legend(...)

- Time for action – customizing a box plot
- range
- axis(...)

- Time for action – customizing a histogram
- breaks
- freq

- Time for action – customizing a pie chart
- Custom labels
- legend(...)

- Time for action – building a graphic
- Time for action – building a graphic with multiple visuals
- par(mfcol)
- Graphics
- Horizontal and vertical lines
- Nested functions

- Summary

- Chapter 10: Becoming a Master Strategist
- R's built-in resources
- Time for action – using R's help function
- help(...)

- Time for action – expanding R with packages
- Choose a CRAN mirror
- Install a package
- Load the package
- Use the package

- R's online resources
- Websites
- The R Project for Statistical Computing
- Quick-R
- R Programming wikibook
- R Graph Gallery
- Crantastic!

- Blogs
- R bloggers
- R Tutorial Series

- Online communities
- R-help mailing list
- Other mailing lists

- Search engines
- R Seek

- Summary

- Appendix: Pop Quiz Answer Key
- Chapter 2
- Chapter 3
- Chapter 4
- Chapter 5
- Chapter 6
- Chapter 7
- Chapter 8
- Chapter 9
- Chapter 10

### John M. Quick

### Errata

- 9 submitted: last submission 08 Jan 2014

**Errata type: Code | Page number: 70 | Errata date: 01 Nov 10**

The code for standard deviation of duration in past head to head conflicts reads as:

sdDurationHeadToHead <- mean(subsetHeadToHead$DurationInDays)

The correct code should be:

sdDurationHeadToHead <- sd(subsetHeadToHead$DurationInDays)
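The difference between the two calls can be seen on a small made-up vector. This is only a sketch: the `durationInDays` values are hypothetical and are not taken from the book's battle data.

```r
# Hypothetical battle durations in days (NOT the book's dataset)
durationInDays <- c(60, 90, 120)

# mean() returns the arithmetic average of the values
meanDuration <- mean(durationInDays)

# sd() returns the sample standard deviation (n - 1 denominator)
sdDuration <- sd(durationInDays)

print(meanDuration)  # 90
print(sdDuration)    # 30
```

Assigning the result of `mean()` to a variable named `sd...` runs without error, which is why this kind of slip is easy to miss.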

**Errata type: Grammar | Page number: 151 | Errata date: 09 Nov 10**

The last line of the Introduction reads as:

"It is important to convey your plans with clarity, because the emperor has the power accept or reject your strategy."

Actually it should read as:

"It is important to convey your plans with clarity, because the emperor has the power to accept or reject your strategy."

**Errata type: Others | Page number: 50 | Errata date: 16 Nov 10**

The line that reads:

to access the city column in soldiersByCity, we could use soldiersByCity[,1], this tells R to retrieve every row within the City column

It should actually say to use soldiersByCity[,2] in order to retrieve the City column.
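A short sketch of the bracket notation may help here. The data frame below is a hypothetical stand-in for `soldiersByCity`: its column order (with City in second position) and its values are assumptions made only for illustration.

```r
# Hypothetical stand-in for soldiersByCity; column order and values are assumed
soldiersByCity <- data.frame(
  Kingdom = c("Shu", "Shu", "Wei"),
  City = c("Chengdu", "Hanzhong", "Luoyang"),
  Soldiers = c(1000, 2000, 3000),
  stringsAsFactors = FALSE
)

# An empty row index selects every row; 2 selects the second column
cityByPosition <- soldiersByCity[, 2]

# The same column retrieved by name with the $ notation
cityByName <- soldiersByCity$City

print(identical(cityByPosition, cityByName))  # TRUE
```

Selecting by name is less fragile than selecting by position, since a positional index silently returns a different column if the column order changes.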

**Errata type: Others | Page number: 76 | Errata date: 17 Nov 10**

Incorrect variable name in :

So, after step 3, our numericSuccessfullyExecutedHeadToHead variable

Should be :

So, after step 3, our numericExecutionHeadToHead variable

**Errata type: Others | Page number: 131 | Errata date: 21 Nov 10**

Incorrect mean value in:

"The rating of the Shu army's performance in attacks has ranged from 10 to 100, with a mean of 45"

Should be:

"The rating of the Shu army's performance in attacks has ranged from 10 to 100, with a mean of 52"

**Errata type: code | Page number: 54**

mean(soldiersByCity$Soldiers) returns the mean of the entire Soldiers column not just the Shu soldiers.

As for finding the mean for just the Shu soldiers, the subset() function, which is discussed in chapter 5, can be used in tandem with the mean() function. For example, first subset the data to include only the Shu cities, then calculate the mean of the Soldiers column:

> shuData <- subset(soldiersByCity, Kingdom == "Shu")

> mean(shuData$Soldiers)
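The following sketch contrasts the two means on a toy data frame. The `Kingdom` and `Soldiers` column names come from the erratum above; the rows themselves are invented and do not reflect the book's data file.

```r
# Hypothetical stand-in for soldiersByCity (invented rows)
soldiersByCity <- data.frame(
  Kingdom = c("Shu", "Shu", "Wei"),
  Soldiers = c(1000, 2000, 6000),
  stringsAsFactors = FALSE
)

# Mean over the entire Soldiers column, all kingdoms included
meanAll <- mean(soldiersByCity$Soldiers)

# Keep only the Shu rows, then take the mean of that subset
shuData <- subset(soldiersByCity, Kingdom == "Shu")
meanShu <- mean(shuData$Soldiers)

print(meanAll)  # 3000
print(meanShu)  # 1500
```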

**Errata type: code | Page number: 70**

sdDurationHeadToHead

[1] 77.93333

should be

> sdDurationHeadToHead

[1] **29.86398**

R Beginner's Guide

John M. Quick

Answers to Pop Quiz sections by chapter

Chapter 2

Setting your R working directory

1. d

2. b

3. d

Chapter 3

Solving the First 4x4 Magic Square

1. b

2. d

3. c

Chapter 4

Importing external data

1. b

Creating and calling variables

1. d

2. a

Accessing data within variables

1. d

2. b

3. d

Manipulating variable data

1. Table values (left to right, top to bottom): 10, 20, 30, 40, 50, 60

2. Table values (left to right, top to bottom): 1, 12, 3, 4, 5, 6

3. c

Managing the R workspace

1. c

2. d

3. a

Chapter 5

Creating a subset from a large dataset

1. d

2. a

Deriving summary statistics

1. c

2. b

Quantifying categorical variables

1. d

2. a

Correlating variables

1. b

2. d

Modeling with simple linear regression

1. b

2. a

3. c

Modeling with multiple linear regression

1. a

2. c

Modeling interactions

1. b

2. d

3. a

Comparing and choosing models

1. d

Chapter 6

Creating custom functions

1. c

2. b

3. c

Incorporating resource constraints into predictions

1. d

Assessing the viability of potential strategies

1. d

Chapter 7

Data setup

1. c

Data exploration

1. c

Model development

1. d

2. d

Model deployment

1. c

The common steps to all R analyses

1. d

Chapter 8

Creating a bar chart

1. a

2. d

Customizing graphics

1. d

2. d

Creating a scatterplot

1. a

2. c

Creating a line chart

1. d

2. a

Creating a box plot

1. a

2. c

Creating a histogram

1. d

Creating a pie chart

1. b

Exporting graphics

1. d

Chapter 9

Customizing a bar chart

1. b

2. d

3. b

Customizing a scatterplot

1. a

2. c

Customizing a line chart

1. d

2. c

Customizing a box plot

1. b

2. c

Customizing a histogram

1. b

2. c

Customizing a pie chart

1. a

2. c

Building a graphic

1. a

2. d

Building a graphic with multiple visuals

1. c

2. c

3. c

Chapter 10

Using R's help function

1. c

2. b

Expanding R with packages

1. b

2. a

3. b

**Errata type: Technical | Page number: 78**

**In bullet number 5:**

corHeadToHead <- cor(subsetHeadToHead) doesn't work, because newer versions of R don't allow cor(data) where data has non-numeric elements.

Solution is as follows:

cor(subsetHeadToHead[sapply(subsetHeadToHead,is.numeric)])
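As a rough illustration of why the filter is needed, the sketch below builds a small data frame with one text column and two numeric columns. `DurationInDays` appears in the errata above; the `Method` and `ShuSoldiersEngaged` columns and all of the values are assumptions made for this example.

```r
# Hypothetical stand-in for subsetHeadToHead (invented columns and values)
subsetHeadToHead <- data.frame(
  Method = c("Head to head", "Head to head", "Head to head", "Head to head"),
  DurationInDays = c(10, 20, 30, 40),
  ShuSoldiersEngaged = c(100, 250, 300, 420),
  stringsAsFactors = FALSE
)

# In current R, cor() stops with an error when a column is not numeric
corFailed <- inherits(try(cor(subsetHeadToHead), silent = TRUE), "try-error")

# sapply(..., is.numeric) yields one TRUE/FALSE per column;
# indexing with it keeps only the numeric columns
numericColumns <- sapply(subsetHeadToHead, is.numeric)
corHeadToHead <- cor(subsetHeadToHead[numericColumns])

print(corFailed)
print(dim(corHeadToHead))
```

The result is a square correlation matrix with one row and one column per numeric variable, so text columns such as the hypothetical `Method` no longer get in the way.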

- Conduct superior data analysis in R
- Organize and communicate data analysis
- Generate, customize, and export detailed charts, plots, and graphs
- Build your own custom data visualizations
- Program in the R language
- Create your own custom functions
- Extend the functionality of R via external packages
- Manage the R workspace and console
- Import external data into R
- Manipulate data using variables
- Execute a wide array of multi-argument and variable-argument functions
- Develop and employ predictive regression models
- Assess the practical and statistical significance of predictions
- Understand R, its benefits, and how to use it to maximize the impact of your data analyses

R is a data analysis tool, graphical environment, and programming language. Even if you have no prior experience with programming or statistical software, this book will help you quickly become a knowledgeable user of R. Now is the time to take control of your data and start producing superior statistical analysis with R.

This book will take you on a journey as the strategist for an ancient Chinese kingdom. Along the way, you will learn how to use R to arrive at practical solutions and how to effectively communicate your results. Ultimately, the fate of the kingdom depends on your ability to make informed, data-driven decisions with R.

You have unexpectedly been thrust into the role of lead strategist for the kingdom. After you install your predecessor's mysterious data analysis tool, you will begin to explore its fundamental elements. Next, you will use R to import and organize your data. Then, you will use functions and statistical analysis to arrive at potential courses of action. Subsequently, you will design your own functions to assess the practical impacts of your predictions. Lastly, you will focus on communicating your results through the use of charts, plots, graphs, and custom built visualizations. The fate of the kingdom is in your hands. Your rapid development as a master R strategist is the key to future success.

A step by step guide to organize, analyze, and visualize your data in R.

This is a practical, step by step guide that will help you quickly become proficient in data analysis using R. The book is packed with clear examples, screenshots, and code to help you carry out your data analysis without any hurdles.

If you are a data analyst, business or information technology professional, student, educator, researcher, or anyone else who wants to learn to analyze data effectively, then this book is for you.

No prior experience with R is necessary. Knowledge of other programming languages, software packages, or statistics may be helpful, but is not required.