Bioinformatics with R Cookbook

  • Use existing R packages to handle biological data
  • Represent biological data with attractive visualizations
  • An easy-to-follow guide to handling real-life bioinformatics problems such as next-generation sequencing and microarray analysis

Book Details

Language : English
Paperback : 340 pages [ 235mm x 191mm ]
Release Date : June 2014
ISBN : 1783283130
ISBN 13 : 9781783283132
Author(s) : Paurush Praveen Sinha
Topics and Technologies : All Books, Cookbooks, Open Source

Table of Contents

Chapter 1: Starting Bioinformatics with R
Chapter 2: Introduction to Bioconductor
Chapter 3: Sequence Analysis with R
Chapter 4: Protein Structure Analysis with R
Chapter 5: Analyzing Microarray Data with R
Chapter 6: Analyzing GWAS Data
Chapter 7: Analyzing Mass Spectrometry Data
Chapter 8: Analyzing NGS Data
Chapter 9: Machine Learning in Bioinformatics
Appendix A: Useful Operators and Functions in R
Appendix B: Useful R Packages
  • Chapter 1: Starting Bioinformatics with R
    • Introduction
    • Getting started and installing libraries
    • Reading and writing data
    • Filtering and subsetting data
    • Basic statistical operations on data
    • Generating probability distributions
    • Performing statistical tests on data
    • Visualizing data
    • Working with PubMed in R
    • Retrieving data from BioMart
  • Chapter 2: Introduction to Bioconductor
    • Introduction
    • Installing packages from Bioconductor
    • Handling annotation databases in R
    • Performing ID conversions
    • The KEGG annotation of genes
    • The GO annotation of genes
    • The GO enrichment of genes
    • The KEGG enrichment of genes
    • Bioconductor in the cloud
  • Chapter 3: Sequence Analysis with R
    • Introduction
    • Retrieving a sequence
    • Reading and writing the FASTA file
    • Getting the detail of a sequence composition
    • Pairwise sequence alignment
    • Multiple sequence alignment
    • Phylogenetic analysis and tree plotting
    • Handling BLAST results
    • Pattern finding in a sequence
  • Chapter 4: Protein Structure Analysis with R
    • Introduction
    • Retrieving a sequence from UniProt
    • Protein sequence analysis
    • Computing the features of a protein sequence
    • Handling the PDB file
    • Working with the InterPro domain annotation
    • Understanding the Ramachandran plot
    • Searching for similar proteins
    • Working with the secondary structure features of proteins
    • Visualizing the protein structures
  • Chapter 5: Analyzing Microarray Data with R
    • Introduction
    • Reading CEL files
    • Building the ExpressionSet object
    • Handling the AffyBatch object
    • Checking the quality of data
    • Generating artificial expression data
    • Data normalization
    • Overcoming batch effects in expression data
    • An exploratory analysis of data with PCA
    • Finding the differentially expressed genes
    • Working with the data of multiple classes
    • Handling time series data
    • Fold changes in microarray data
    • The functional enrichment of data
    • Clustering microarray data
    • Getting a co-expression network from microarray data
    • More visualizations for gene expression data
  • Chapter 6: Analyzing GWAS Data
    • Introduction
    • The SNP association analysis
    • Running association scans for SNPs
    • The whole genome SNP association analysis
    • Importing PLINK GWAS data
    • Data handling with the GWASTools package
    • Manipulating other GWAS data formats
    • The SNP annotation and enrichment
    • Testing data for the Hardy-Weinberg equilibrium
    • Association tests with CNV data
    • Visualizations in GWAS studies
  • Chapter 7: Analyzing Mass Spectrometry Data
    • Introduction
    • Reading the MS data of the mzXML/mzML format
    • Reading the MS data of the Bruker format
    • Converting the MS data in the mzXML format to MALDIquant
    • Extracting data elements from the MS data object
    • Preprocessing MS data
    • Peak detection in MS data
    • Peak alignment with MS data
    • Peptide identification in MS data
    • Performing protein quantification analysis
    • Performing multiple groups' analysis in MS data
    • Useful visualizations for MS data analysis
  • Chapter 8: Analyzing NGS Data
    • Introduction
    • Querying the SRA database
    • Downloading data from the SRA database
    • Reading FASTQ files in R
    • Reading alignment data
    • Preprocessing the raw NGS data
    • Analyzing RNAseq data with the edgeR package
    • The differential analysis of NGS data using limma
    • Enriching RNAseq data with GO terms
    • The KEGG enrichment of sequence data
    • Analyzing methylation data
    • Analyzing ChipSeq data
    • Visualizations for NGS data
  • Chapter 9: Machine Learning in Bioinformatics
    • Introduction
    • Data clustering in R using k-means and hierarchical clustering
    • Visualizing clusters
    • Supervised learning for classification
    • Probabilistic learning in R with Naïve Bayes
    • Bootstrapping in machine learning
    • Cross-validation for classifiers
    • Measuring the performance of classifiers
    • Visualizing an ROC curve in R
    • Biomarker identification using array data

Paurush Praveen Sinha

Paurush Praveen Sinha has been working with R for the past seven years. An engineer by training, he got into the world of bioinformatics and R when he started working as a research assistant at the Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Germany. Later, during his doctorate, he developed and applied various machine learning approaches with the extensive use of R to analyze and infer from biological data. Besides R, he has experience in various other programming languages, which include Java, C, and MATLAB. During his experience with R, he contributed to several existing R packages and is working on the release of some new packages that focus on machine learning and bioinformatics. In late 2013, he joined the Microsoft Research-University of Trento COSBI in Italy as a researcher. He uses R as the backend engine for developing various utilities and machine learning methods to address problems in bioinformatics.

Errata

- 4 submitted: last submission 25 Jul 2014

Errata type: Code related | Page no: 31

It is:
> cancer <- EUtilsSummary("cancer[ti]", type="research", 

It should be:
> cancer <- EUtilsSummary("cancer[ti]", type="esearch", 

Errata type: Code related | Page no: 12

It is:
> load(package_name)

It should be:
> library(package_name)
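
The distinction behind this erratum can be sketched in base R: library() attaches an installed package, whereas load() restores objects that were saved to a file with save(). The snippet below is only an illustration and is not taken from the book; the object name x and the temporary file are assumptions made for the example.

```r
# library() attaches an installed package to the search path.
# 'stats' ships with R, so no installation is needed here.
library(stats)

# load() does something different: it restores objects that were
# previously written to disk with save().
tmp <- tempfile(fileext = ".RData")  # hypothetical scratch file
x <- c(1.5, 2.5, 4.0)                # illustrative data
save(x, file = tmp)
rm(x)                                # remove x from the workspace

restored <- load(tmp)                # returns the names of the restored objects
print(restored)
print(median(x))                     # x is available again
```

Passing a package name to load() fails because load() expects the path of a saved data file, not a package.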

Errata type: Code related | Page no: 14

It is:
> install.packages("xlsx", dependencies=TRUE)
> library(gdata)
> mydata <- read.xls("mydata.xls")

It should be:
> library(xlsx)
> mydata <- read.xlsx("mydata.xlsx")
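
As a rough sketch of the corrected call in practice: the xlsx package's read.xlsx() appears to require either a sheetIndex or a sheetName argument, so a round trip might look like the following. The data frame, its column names, and the temporary file are made up for illustration, and the example is skipped when xlsx (which depends on Java) is not available.

```r
# Illustrative data frame; the gene names and values are made up.
expr <- data.frame(gene = c("geneA", "geneB"),
                   value = c(1.2, 3.4),
                   stringsAsFactors = FALSE)

roundtrip <- NULL
if (requireNamespace("xlsx", quietly = TRUE)) {
  path <- tempfile(fileext = ".xlsx")   # hypothetical scratch file
  xlsx::write.xlsx(expr, path, row.names = FALSE)
  # Tell read.xlsx() which sheet to read (here: the first one).
  roundtrip <- xlsx::read.xlsx(path, sheetIndex = 1)
  print(roundtrip)
} else {
  message("Package 'xlsx' is not installed; skipping the example.")
}
```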


Errata type: Code related | Page no: 16

It is:
> install.packages(WriteXLS)
> library(WriteXLS)
> WriteXLS(x, ExcelFileName = "R.xls")

It should be:
> install.packages("WriteXLS")
> library(WriteXLS)
> WriteXLS(x, ExcelFileName = "R.xls")
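
The fix here is the quoting: install.packages() takes package names as character strings, whereas an unquoted WriteXLS is looked up as an R object. One way to make this explicit is a small guard that checks its argument before installing. The helper name needs_install is hypothetical and not part of the book or of any package.

```r
# Hypothetical helper: TRUE when a package still has to be installed.
needs_install <- function(pkg) {
  # Package names must be passed as a single character string.
  stopifnot(is.character(pkg), length(pkg) == 1)
  !requireNamespace(pkg, quietly = TRUE)
}

print(needs_install("stats"))  # 'stats' ships with R

# Typical use (left commented out so nothing is downloaded here):
# if (needs_install("WriteXLS")) install.packages("WriteXLS")
```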


What you will learn from this book

  • Retrieve biological data from within an R environment without the hassle of navigating web pages
  • Annotate and enrich your data and convert the identifiers
  • Find relevant text from PubMed on which to perform text mining
  • Find phylogenetic relations between species
  • Infer relations between genomic content and diseases via GWAS
  • Classify patients based on biological or clinical features
  • Represent biological data with attractive visualizations, useful for publications and presentations

In Detail

Bioinformatics is an interdisciplinary field that develops and improves methods for storing, retrieving, organizing, and analyzing biological data. R is one of the most widely used languages for data analysis in bioinformatics.

Bioinformatics with R Cookbook is a hands-on guide whose recipes solve common computational tasks in bioinformatics using existing packages and tested code.

With the help of this book, you will learn how to analyze biological data using R and to infer new knowledge from data produced by different types of experiments, ranging from microarrays to NGS and mass spectrometry.

This book is an easy-to-follow, stepwise guide to handling real-life bioinformatics problems. Each recipe comes with a detailed explanation of the solution steps. A systematic approach, coupled with plenty of illustrations, tips, and tricks, will help you grasp even the trickiest concepts.

Who this book is for

This book is ideal for computational biologists and bioinformaticians with a basic knowledge of R programming, bioinformatics, and statistics. If you want to understand the concepts needed to develop your own computational models in bioinformatics, then this book is for you.
