You're reading from R Bioinformatics Cookbook - Second Edition

Product typeBook

Published inOct 2023

PublisherPackt

ISBN-139781837634279

Edition2nd Edition

Concepts

Bioinformatics

Author (1)

Dan MacLean

Using datapasta to create R objects from cut-and-paste data

Being able to paste data into source code documents is useful for all sorts of reasons, not least because it allows for a reproducible example, also known as a reprex—a minimal, self-containedexample that demonstrates a problem or a concept. By including data in the source code, others can run the code and see the results for themselves, making it easier to understand and replicate the results.

The R datapasta package makes it easy to paste data into R source code documents. It provides a set of functions for converting data to and from R definitions and is extremely useful when creating static data objects in code examples, tests, or when sharing. In this recipe, you will learn how to use datapasta to bring external data into your source code documents by typing them in longhand.

Getting ready

We will use the datapasta package, though installing it is non-standard. Use this command:

renv::install("datapasta", repos = c(mm = "https://milesmcbain.r-universe.dev", getOption("repos")))

This should install the package using renv and make us ready to go. Remember to install renv the usual way if you don’t already have it.

How to do it…

The datapasta tool is implemented as an add-in for RStudio, so we begin by setting that up:

Use the RStudio Tools | Addins | Browse Addins menu and then Keyboard Shortcuts. You get to choose which key combination you want to use for pasting.
Click the middle column next to the operations, as shown in the following screenshot, and press the keys you want to use. The combination in Figure 2.1 is a good choice:

Figure 2.1 – Selecting key shortcuts

Get a web table—for example, go to this page on Wikipedia: https://en.wikipedia.org/wiki/Tab-separated_values—and copy the whole text-based example table using the browser’s right-click Copy feature. This should put the text table in your copy/paste buffer. It should look like what’s shown in Figure 2.2:

Figure 2.2 – Web data on Wikipedia

Paste the table now in your copy/paste buffer into an R source document. Place the typing cursor at a suitable place in the source R document you’re working in and use the key combo to paste in the table.

So, with the preceding setup, we should be able to quickly take data from varied sources and coerce them into R objects for analysis.

How it works…

The datapasta package is really useful. The first step sets up our preferred paste keys for later use; the second step is simple, and we just go somewhere and find some data in the world we would like in our R source; and by the third step, we’re selecting the place to put the data definition and pasting it in. Our example goes from a table in a web page to this definition for an R object:

data.frame(  stringsAsFactors = FALSE,
      Sepal.length = c(5.1, 4.9, 4.7, 4.6, 5),
       Sepal.width = c(3.5, 3, 3.2, 3.1, 3.6),
      Petal.length = c(1.4, 1.4, 1.3, 1.5, 1.4),
       Petal.width = c(0.2, 0.2, 0.2, 0.2, 0.2),
           Species = c("I. setosa","I. setosa",
                       "I. setosa","I. setosa","I. setosa")
)

And that powerful little operation is how we can convert web data to source code very easily.

There’s more…

It’s possible to go the other way around, from some R object to a definition of that object using the dpasta() function, which can coerce data frames, tibbles, and vectors into definitions. This is really useful for reproducible examples. This example shows how to actually build the object with code so that we don’t have to share the file and everything is in one source document:

library(datapasta)mtcars |> 
   dpasta()

This shows how the datapasta package is a super-useful tool for turning objects from the web and within R into code.

You have been reading a chapter from

R Bioinformatics Cookbook - Second Edition

Published in: Oct 2023Publisher: PacktISBN-13: 9781837634279

A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.

undefined

Unlock this book and the full library FREE for 7 days

Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of

Start free trial

Renews at $15.99/month. Cancel anytime

Author (1)

Dan MacLean

Professor Dan MacLean has a PhD in molecular biology from the University of Cambridge and gained postdoctoral experience in genomics and bioinformatics at Stanford University in California. Dan is now an honorary professor at the School of Computing Sciences at the University of East Anglia. He has worked in bioinformatics and plant pathogenomics, specializing in R and Bioconductor, and has developed analytical workflows in bioinformatics, genomics, genetics, image analysis, and proteomics at the Sainsbury Laboratory since 2006. Dan has developed and published software packages in R, Ruby, and Python, with over 100,000 downloads combined.
Read more about Dan MacLean

Personalised recommendations for you

Based on your interests and search pattern

Engineering Manager's Handbook

Engineering Manager's Handbook is a comprehensive guide for managers to excel in their role, foster customer-centric digital products, learn leadership, team building, and balancing technical work with management. You’ll also explore how to develop trust, authority, and collaboration to drive success and make a lasting impact.

BookSep 2023278 pages

C++ Game Animation Programming

Video game characters have a fascinating history, evolving from simple 2D sprites to high-polygon 3D models. Take a look behind the curtain and learn how to build a 3D renderer, load character models, play animations and blend between them, and create large crowds of animated people with this comprehensive C++ game animation programming guide.

BookDec 2023480 pages

Gamification for Product Excellence

This book helps you to take your product management strategy to the next level by standing out in crowded markets. Along with boosting user adoption rates by creating engaging products that incorporate playful elements, learn gamification theory and how to integrate it into your design, product development, and product management processes.

BookSep 2023350 pages

Supercharging Productivity with Trello

Supercharging Productivity with Trello is the ultimate guide for anyone looking to boost their productivity with digital tools. Whether you're new to Trello or a seasoned professional, this book covers everything from core features to advanced automation, and Power-Ups.

BookAug 2023342 pages

Automate It with Zapier and Generative AI

This comprehensive guide takes you through the concepts of business process automation, showing you how Zapier can facilitate it without having to write code and helping you to boost productivity. You’ll learn how to save time, reduce costs, and make your business recession-proof by using Zapier to automate tasks in your cloud-based business apps.

BookAug 2023706 pages

Scoring to Picture in Logic Pro

In this book, you’ll explore a variety of techniques to synchronize music to picture using Logic Pro. Though this is not a technical manual, it will teach you how to make the best use of Logic Pro and how to wield this technology to maximize your potential when scoring to picture.

BookSep 2023412 pages

Mastering Information Security Compliance Management

This concise book equips you with the knowledge and practices needed to establish and maintain an effective information security management system. The chapters provide insights into ISO/IEC 27001/27002:2022, risk management, ISMS development, incident management, audit processes, and strategies for continuous improvement.

BookAug 2023236 pages1

Implementing Atlassian Confluence

Implementing Atlassian Confluence provides both a high-level overview and an insightful path for remote collaboration with Atlassian Confluence. With this multi-layered yet practical guide, you’ll be able to set up Confluence-based collaboration with minimum external consultancy services to ensure smooth and close coordination between teams.

BookSep 2023406 pages

R Bioinformatics Cookbook

This book takes a unique problem–solution approach to handling complex tasks in the bioinformatics domain using different datasets present in the book. With the help of real-world examples, you’ll learn to put each independent recipe to use to tackle problems in the field of bioinformatics.

BookOct 2023396 pages

Build Your Own Metaverse with Unity

Build Your own Metaverse with Unity is a practical guide for developers to create their own metaverse - a virtual world with infinite possibilities. It empowers you to identify gaps in existing metaverses and improve upon them, enabling you to shape your virtual world.

BookSep 2023586 pages5