Clarifying label placement with ggrepel
Bioinformatics datasets often contain many thousands of data points, such as genomic positions or the genes within a genome. As part of our data analysis, we will frequently want to label some of those positions or genes so that the reader can identify them. The problem is that labels on a crowded plot easily overlap each other and the points they describe. The ggrepel package provides geoms for ggplot2 that position labels much more clearly, using label layout algorithms that make labels repel each other and the data points, with connecting lines drawn back to the points they belong to. In this recipe, we’ll look at the most important options for applying ggrepel to a genomics dataset.
Getting ready
We’ll need the ggplot2 and ggrepel packages and the fission yeast gene expression dataset in the rbioinfcookbook data package. This data frame contains the yeast gene IDs in one column, along with the log2 fold change in expression for each gene and the p-value from a statistical test.
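To make the shape of that data concrete, here is a minimal sketch that builds a toy stand-in for the data frame and draws a first repelled-label plot. The column names (id, log2fc, pvalue), the simulated values, and the thresholds are all assumptions made for illustration and are not taken from the rbioinfcookbook package; the plotting step runs only if ggplot2 and ggrepel are installed.

```r
# Toy stand-in for the fission yeast expression data.
# ASSUMPTION: the column names (id, log2fc, pvalue) and all values are
# hypothetical; check the real data frame in rbioinfcookbook for its names.
set.seed(42)
n_genes <- 200
toy_ge <- data.frame(
  id = sprintf("gene%03d", seq_len(n_genes)),
  log2fc = rnorm(n_genes, mean = 0, sd = 2),
  pvalue = runif(n_genes),
  stringsAsFactors = FALSE
)

# Label only genes passing illustrative thresholds. Using "" rather than NA
# keeps the unlabeled points in the layout, so labels still avoid them.
is_notable <- toy_ge$pvalue < 0.05 & abs(toy_ge$log2fc) > 2
toy_ge$label <- ifelse(is_notable, toy_ge$id, "")

# Plot only when both packages are available.
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("ggrepel", quietly = TRUE)) {
  library(ggplot2)
  library(ggrepel)
  p <- ggplot(toy_ge, aes(x = log2fc, y = -log10(pvalue), label = label)) +
    geom_point() +
    # max.overlaps = Inf asks ggrepel to try to place every non-empty label
    geom_text_repel(max.overlaps = Inf)
  print(p)
}
```

Swapping geom_text_repel() for geom_label_repel() draws each label in a box instead; with the real dataset, only the data frame and the column names in aes() would need to change.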