Data Manipulation with R


eBook: $15.29 (list price $17.99, save 15%)
Formats: PDF, PacktLib, ePub, and Mobi
Print + free eBook + free PacktLib access to the book: $29.99 (list price $47.98, save 37%)
Print cover: $29.99
Free shipping to the UK, US, Europe, and selected countries in Asia.
  • Perform factor manipulation and string processing
  • Learn group-wise data manipulation using plyr
  • Handle large datasets, interact with database software, and manipulate data using sqldf

Book Details

Language : English
Paperback : 102 pages [ 235mm x 191mm ]
Release Date : January 2014
ISBN : 178328109X
ISBN 13 : 9781783281091
Author(s) : Jaynal Abedin
Topics and Technologies : All Books, Big Data and Business Intelligence, Open Source


Table of Contents

Preface
Chapter 1: R Data Types and Basic Operations
Chapter 2: Basic Data Manipulation
Chapter 3: Data Manipulation Using plyr
Chapter 4: Reshaping Datasets
Chapter 5: R and Databases
Bibliography
Index
  • Chapter 2: Basic Data Manipulation
    • Acquiring data
    • Factor manipulation
      • Factors from numeric variables
    • Date processing
    • Character manipulation
    • Subscripting and subsetting
    • Summary
  • Chapter 3: Data Manipulation Using plyr
    • The split-apply-combine strategy
      • Split-apply-combine without a loop
      • Split-apply-combine with a loop
    • Utilities of plyr
      • Intuitive function names
      • Input and arguments
    • Comparing default R and plyr
      • Multiargument functions
    • Summary
  • Chapter 4: Reshaping Datasets
    • The typical layout of a dataset
      • Long layout
      • Wide layout
    • The new layout of a dataset
    • Reshaping the dataset from the typical layout
    • Reshaping the dataset with the reshape package
      • Melting data
        • Missing values in molten data
      • Casting molten data
    • The reshape2 package
    • Summary
  • Chapter 5: R and Databases
    • R and different databases
      • R and Excel
      • R and MS Access
    • Relational databases in R
      • The filehash package
      • The ff package
    • R and sqldf
    • Data manipulation using sqldf
    • Summary

Jaynal Abedin

Jaynal Abedin currently holds the position of Statistician at the Centre for Communicable Diseases (CCD) at icddr,b (www.icddrb.org). He earned his Bachelor's and Master's degrees in Statistics from the University of Rajshahi, Rajshahi, Bangladesh. He has extensive experience in R programming and Stata, has effective leadership skills, and currently leads a team of statisticians. He has hands-on experience in developing training material and facilitating training in R programming and Stata, along with the statistical aspects of public health research. His primary research interests include causal inference and machine learning. He is currently involved in several ongoing public health research projects and is a co-author of several work-in-progress manuscripts. At the useR! 2013 conference, he presented a poster, "edeR: Email Data Extraction using R" (available at http://www.edii.uclm.es/~useR-2013/abstracts/files/34_edeR_Email_Data_Extraction_using_R.pdf), which received the best application poster award. He also reviews scientific manuscripts for the Journal of Applied Statistics (JAS) and the Journal of Health Population and Nutrition (JHPN). He is also a successful freelance statistician on online platforms and has an excellent reputation for high-quality work, especially in R programming. He can be contacted at joystatru@gmail.com or http://bd.linkedin.com/in/jaynal; his Twitter handle is @jaynal83.




What you will learn from this book

  • Learn R data types and their basic operations
  • Deal efficiently with strings, factors, and dates
  • Understand group-wise data manipulation
  • Work with different layouts of an R dataset and convert between layouts for different purposes
  • Connect R with database software to manage relational databases
  • Manage bigger datasets using R
  • Manipulate datasets using SQL statements through the sqldf package

In Detail

One of the most important aspects of computing with data is the ability to manipulate it to enable subsequent analysis and visualization. R offers a wide range of tools for this purpose. Data from any source, be it flat files or databases, can be loaded into R and then reshaped into structures that support reproducible and convenient data analysis.

This practical, example-oriented guide discusses the split-apply-combine strategy, in which a dataset is split into groups, a function is applied to each group, and the results are combined. The book presents this as a faster approach to data manipulation. After reading this book, you will not only be able to efficiently manage and check the validity of your datasets with the split-apply-combine strategy, but you will also learn to handle larger datasets.
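As a rough illustration of the split-apply-combine strategy described above, here is a minimal sketch in base R. The toy data frame and the names scores, pieces, means, and result are made up for this example and do not come from the book.

```r
# Hypothetical toy data (not from the book): values recorded for two groups.
scores <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 3, 2, 4, 6)
)

# Split: break the value column into one vector per group.
pieces <- split(scores$value, scores$group)

# Apply: compute a summary (here, the mean) on each piece.
means <- sapply(pieces, mean)

# Combine: assemble the per-group results into a single data frame.
result <- data.frame(group = names(means), mean_value = unname(means))
print(result)
```

With the plyr package installed, a comparable result could likely be obtained in one call such as ddply(scores, "group", summarise, mean_value = mean(value)); the base R version is shown here only because it needs no extra packages.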

This book starts by describing the mode and class of R objects, and then highlights different R data types, explaining their basic operations. You will focus on group-wise data manipulation with the split-apply-combine strategy, supported by specific examples. You will also learn to efficiently handle date, string, and factor variables along with different layouts of datasets using the reshape2 package. You will learn to use plyr effectively for data manipulation, truncating and rounding data, simulating datasets, as well as character manipulation. Finally, you will get acquainted with using R with SQL databases.

Approach

This book is a step-by-step, example-oriented tutorial that shows both intermediate and advanced users how to manipulate data smoothly using R.

Who this book is for

This book is aimed at intermediate to advanced users of R who want to perform data manipulation with R, and those who want to clean and aggregate data effectively. Readers are expected to have at least an introductory knowledge of R and to be comfortable with basic administrative tasks in R, such as installing packages and loading them when required.
