Aligning genetic and genomic data
Alignment is a key step in any phylogenetic analysis, so before we can build any trees we need to align our genetic and genomic data. Here, we will use MAFFT (http://mafft.cbrc.jp/alignment/software/) for the genome alignment and MUSCLE (http://www.drive5.com/muscle/) for the gene alignment. When we align whole genomes, we align the entire nucleotide sequences of the genomes against each other. When we align genes, we align one or more genes from a gene family against each other. This can be done at either the nucleotide level or the amino acid (protein) level. There are even structure-guided alignments, which take place in 3-D protein space (see: Ghaly et al., “EcoFoldDB: Protein-structure guided functional profiling of ecologically relevant microbial traits at the genome scale”, bioRxiv, Apr 2025 - https://www.biorxiv.org/content/10.1101/2025.04.02...
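As a rough illustration of the two alignment steps described above, the following Python sketch builds and runs the command lines for MAFFT and MUSCLE. It is a minimal sketch rather than a definitive pipeline: the helper function names and any file names passed to them are hypothetical, it assumes both programs are installed and on your PATH, and it assumes MAFFT's `--auto` mode is acceptable for your data. MUSCLE changed its command-line options between version 3 (`-in`/`-out`) and version 5 (`-align`/`-output`), so the sketch takes the version as a parameter.

```python
import shutil
import subprocess


def mafft_command(input_fasta):
    """Build a MAFFT command line; --auto lets MAFFT choose a strategy."""
    return ["mafft", "--auto", str(input_fasta)]


def muscle_command(input_fasta, output_fasta, version=5):
    """Build a MUSCLE command line for MUSCLE 5 (default) or MUSCLE 3."""
    if version >= 5:
        return ["muscle", "-align", str(input_fasta), "-output", str(output_fasta)]
    return ["muscle", "-in", str(input_fasta), "-out", str(output_fasta)]


def require_tool(name):
    """Fail early with a clear message if the program is not on PATH."""
    if shutil.which(name) is None:
        raise FileNotFoundError(f"{name} was not found on PATH")


def run_mafft(input_fasta, output_fasta):
    """Run MAFFT; it writes the alignment to standard output,
    so we redirect stdout into the output file."""
    require_tool("mafft")
    with open(output_fasta, "w") as handle:
        subprocess.run(mafft_command(input_fasta), stdout=handle, check=True)


def run_muscle(input_fasta, output_fasta, version=5):
    """Run MUSCLE; it writes the alignment to the named output file."""
    require_tool("muscle")
    subprocess.run(muscle_command(input_fasta, output_fasta, version), check=True)
```

With hypothetical input files, the genome alignment would be `run_mafft("genomes.fasta", "genomes.aln.fasta")` and the gene alignment `run_muscle("gene_family.fasta", "gene_family.aln.fasta")`. Keeping the command construction separate from execution makes it easy to print and inspect the exact command before running it.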