Getting ready
In this recipe, we’ll use a set of synthetic reads covering roughly the first 83 KB of human chromosome 17. The reads were generated with wgsim, an external command-line read simulator that is distributed with samtools. wgsim introduced 64 single nucleotide polymorphisms (SNPs) into the reads, and their positions are listed in the snp_positions DataFrame that comes with rbioinfcookbook. We’ll also use the BAM and reference genome files stored in that package, so we’ll need to install it along with the GenomicRanges, gmapR, rtracklayer, VariantAnnotation, and VariantTools Bioconductor packages, as well as the fs CRAN package.
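If some of these packages are not installed yet, the setup can be sketched roughly as follows. This is a minimal sketch, not part of the recipe itself: `missing_pkgs()` and `install_missing()` are hypothetical helper names introduced here for illustration, and the sketch assumes the Bioconductor packages are installed through BiocManager. Installing rbioinfcookbook is left out because its source location is not given in this section.

```r
# Packages named in this recipe, grouped by where they come from
bioc_pkgs <- c("GenomicRanges", "gmapR", "rtracklayer",
               "VariantAnnotation", "VariantTools")
cran_pkgs <- c("fs", "BiocManager")  # BiocManager is an assumption: it is the usual Bioconductor installer

# Hypothetical helper: return the subset of `pkgs` that cannot be loaded
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: install only what is missing (needs network access)
install_missing <- function() {
  cran_needed <- missing_pkgs(cran_pkgs)
  if (length(cran_needed) > 0) install.packages(cran_needed)

  bioc_needed <- missing_pkgs(bioc_pkgs)
  if (length(bioc_needed) > 0) BiocManager::install(bioc_needed)

  invisible(c(cran_needed, bioc_needed))
}

# Usage (uncomment to run the installation):
# install_missing()
```

Checking with `requireNamespace()` first means that re-running the setup does not reinstall packages that are already available.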
How to do it…
Finding SNPs and insertions/deletions (INDELs) from sequence data using VariantTools can be done by performing the following steps:
- Import the required libraries:
library(GenomicRanges)
library(gmapR)
library(rtracklayer)
library(VariantAnnotation)
library(VariantTools)
- Then, load the datasets:
bam_file <- fs::path_package...