Tools for sequence manipulation
In this recipe, we will learn about fundamental sequence manipulation tasks that are part of the core toolkit of any bioinformatician. We'll cover basic operations such as reverse complementation and translation, as well as more advanced topics such as read trimming.
We will work with Biopython, a widely used Python library for biological computation. It includes tools for storing and manipulating sequence information, performing alignments, running the Basic Local Alignment Search Tool (BLAST), manipulating protein structures, and much more.
Before we go any further, let's get familiar with some of the core bioinformatics file formats:
Format | Description | Reference | Comments |
FAST-All (FASTA) | Plain text DNA or protein sequences | https://www.ncbi.nlm.nih.gov/genbank/fastaformat/ | Great for storing human-readable sequences |
FASTA with Quality score (FASTQ) | Like FASTA but includes quality scores as an additional line | https://learn.gencore.bio.nyu.edu/ngs-file... |
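To make the difference between the two formats above concrete, here is a minimal sketch that parses the same read written as a FASTA record and as a FASTQ record, using only the Python standard library. The read name, bases, and quality string are made-up illustration data, the helper names `parse_fasta_record` and `parse_fastq_record` are hypothetical, and the quality decoding assumes the common Phred+33 encoding. In practice, we would normally let Biopython's `Bio.SeqIO.parse` handle both formats rather than parse them by hand.

```python
# Illustration data (not from a real dataset): one read in both formats.
fasta_text = ">read1 example\nGATTACA\n"
fastq_text = "@read1 example\nGATTACA\n+\nIIIIII#\n"


def parse_fasta_record(text):
    """Return (header, sequence) from a single-record FASTA string."""
    lines = text.strip().splitlines()
    header = lines[0][1:]          # drop the leading '>'
    sequence = "".join(lines[1:])  # the sequence may span several lines
    return header, sequence


def parse_fastq_record(text):
    """Return (header, sequence, qualities) from one four-line FASTQ record."""
    header_line, sequence, _plus, quality_line = text.strip().splitlines()
    # Assumption: Phred+33 encoding, so quality = ASCII code - 33.
    qualities = [ord(ch) - 33 for ch in quality_line]
    return header_line[1:], sequence, qualities  # drop the leading '@'


fasta_header, fasta_seq = parse_fasta_record(fasta_text)
fastq_header, fastq_seq, fastq_quals = parse_fastq_record(fastq_text)

print(fasta_header, fasta_seq)
print(fastq_header, fastq_seq, fastq_quals)
```

Note that the FASTQ record carries one quality character per base, so the sequence and the quality list always have the same length.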