You're reading from Extending Excel with Python and R

Product type: Book
Published in: Apr 2024
Publisher: Packt
ISBN-13: 9781804610695
Edition: 1st Edition
Authors (2):
Steven Sanderson

Steven Sanderson, MPH, is an applications manager for the patient accounts department at Stony Brook Medicine. He received his bachelor's degree in economics and his master's in public health from Stony Brook University. He has worked in healthcare in some capacity for just shy of 20 years. He is the author and maintainer of the healthyverse set of R packages. He likes to read material related to social and labor economics and has recently turned his efforts back to his guitar with the hope that his kids will follow suit as a hobby they can enjoy together.

David Kun

David Kun is a mathematician and actuary who has always worked in the gray zone between quantitative teams and ICT, aiming to build a bridge. He is a co-founder and director of Functional Analytics and the creator of the ownR Infinity platform. As a data scientist, he also uses ownR for his daily work. His projects include time series analysis for demand forecasting, computer vision for design automation, and visualization.

Working with R packages for Excel manipulation

Several packages, available on CRAN and on GitHub, can read and manipulate Excel files. In this section, we focus on three of them: readxl, openxlsx, and xlsx. Each package has its own function for reading Excel files:

  • readxl::read_excel()
  • openxlsx::read.xlsx()
  • xlsx::read.xlsx()

Each function has its own set of parameters and conventions. Since readxl is part of the tidyverse collection of packages, it follows tidyverse conventions and returns a tibble when it reads a file. If you do not know what a tibble is, it is a modern version of R’s data.frame, a sort of spreadsheet in the R environment, and it is the building block of most analyses. Both openxlsx and xlsx return a base R data.frame object, and xlsx can also return a list object.

If you are wondering how this relates to manipulating an actual Excel file, I can explain. To manipulate anything in R, the data must first be in the R environment, so you cannot work with the contents of a file until it has been read in. Beyond reading, these packages provide different functions for manipulating Excel files or for reading data in particular ways that support further analysis and/or manipulation. It is important to note that xlsx requires Java to be installed.
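To make the difference in return types concrete, here is a minimal sketch that reads the same workbook with each package. The temporary file and its contents (the first rows of the built-in iris dataset) are assumptions made for illustration and do not come from the book. The xlsx call is guarded because that package needs a working Java installation.

```r
# Illustrative sketch: the workbook below is a made-up sample, not a file from the book.
library(readxl)
library(openxlsx)

# Write a small sample workbook to a temporary file so there is something to read.
path <- tempfile(fileext = ".xlsx")
openxlsx::write.xlsx(head(iris), file = path)

# readxl::read_excel() returns a tibble.
df_readxl <- readxl::read_excel(path, sheet = 1)
print(class(df_readxl))

# openxlsx::read.xlsx() returns a base R data.frame.
df_openxlsx <- openxlsx::read.xlsx(path, sheet = 1)
print(class(df_openxlsx))

# xlsx::read.xlsx() depends on Java, so only call it when the package can be loaded.
if (requireNamespace("xlsx", quietly = TRUE)) {
  df_xlsx <- xlsx::read.xlsx(path, sheetIndex = 1)
  print(class(df_xlsx))
}
```

Whichever function you use, the result is an ordinary R object, so the rest of an analysis can proceed the same way regardless of which package read the file.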

As we transition from our exploration of R packages for Excel manipulation, we’ll turn our attention to the crucial task of effectively reading Excel files into R, thereby unlocking even more possibilities for data analysis and manipulation.
