You're reading from R Bioinformatics Cookbook - Second Edition

Product type: Book
Published in: Oct 2023
Publisher: Packt
ISBN-13: 9781837634279
Edition: 2nd Edition
Author (1)
Dan MacLean

Professor Dan MacLean has a PhD in molecular biology from the University of Cambridge and gained postdoctoral experience in genomics and bioinformatics at Stanford University in California. Dan is now an honorary professor at the School of Computing Sciences at the University of East Anglia. He has worked in bioinformatics and plant pathogenomics, specializing in R and Bioconductor, and has developed analytical workflows in bioinformatics, genomics, genetics, image analysis, and proteomics at the Sainsbury Laboratory since 2006. Dan has developed and published software packages in R, Ruby, and Python, with over 100,000 downloads combined.

Finding DNA motifs with universalmotif

A very common task when working with DNA sequences is finding instances of a motif, a short, defined sequence pattern, within a longer sequence. Motifs can represent protein-DNA binding sites, such as transcription factor binding sites in a gene promoter or enhancer region. There are two starting points for this analysis: either you have a database of motifs that you want to use to scan target DNA sequences and extract every position where a motif occurs, or you have only the sequences of interest and want to find out whether any motifs recur within them. We'll look at ways of doing both of these things in this recipe.
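To make the first starting point concrete, here is a minimal base R sketch of what scanning a sequence with a motif matrix involves. It is not universalmotif's implementation or API. The count matrix, the `scan_pswm()` helper, the example sequence, and the 80% score cut-off are all hypothetical and chosen purely for illustration.

```r
# Hypothetical counts for a 4-bp motif with consensus CACG.
# Rows are bases, columns are motif positions; the numbers are invented.
counts <- matrix(
  c(0, 8,  0, 1,   # A
    9, 1, 10, 0,   # C
    0, 1,  0, 9,   # G
    1, 0,  0, 0),  # T
  nrow = 4, byrow = TRUE,
  dimnames = list(c("A", "C", "G", "T"), NULL)
)

# Add a pseudocount of 1, convert each column to probabilities, then take
# log-odds against a uniform background (0.25 per base, an assumption).
probs <- prop.table(counts + 1, margin = 2)
pswm <- log2(probs / 0.25)

# Slide the matrix along the forward strand and score every window.
# Windows containing characters other than A, C, G, T score NA and are dropped.
scan_pswm <- function(pswm, sequence, min_score) {
  bases <- strsplit(sequence, "")[[1]]
  width <- ncol(pswm)
  n_windows <- length(bases) - width + 1
  if (n_windows < 1) {
    return(data.frame(start = integer(0), stop = integer(0),
                      match = character(0), score = numeric(0)))
  }
  starts <- seq_len(n_windows)
  rows <- match(bases, rownames(pswm))  # base -> row index in the matrix
  scores <- vapply(starts, function(i) {
    sum(pswm[cbind(rows[i:(i + width - 1)], seq_len(width))])
  }, numeric(1))
  hits <- data.frame(
    start = starts,
    stop  = starts + width - 1L,
    match = substring(sequence, starts, starts + width - 1L),
    score = scores
  )
  hits[which(hits$score >= min_score), ]
}

# Made-up upstream sequence containing the consensus twice.
promoter <- "TTCACGAAGTCACGTT"

# Keep windows scoring at least 80% of the best possible score (arbitrary).
max_score <- sum(apply(pswm, 2, max))
hits <- scan_pswm(pswm, promoter, min_score = 0.8 * max_score)
print(hits)
```

With this toy input, the two exact `CACG` windows (starting at positions 3 and 11) are the only ones that pass the cut-off. The sketch scans the forward strand only and uses a uniform background; a real analysis would normally also consider the reverse complement and a background estimated from the data.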

Getting ready

For this recipe, we need a matrix describing the motif (a position-specific weight matrix, or PSWM) and a set of sequences from upstream of transcriptional start sites. These are provided in the rbioinfcookbook package. We'll use the universalmotif package to work with motifs and...
