The number of users adopting the R programming language has been increasing faster and faster in the last few years. The functions of the R console are limited when it comes to managing a lot of files, or when we want to work with version control systems. This is the reason, in combination with the increasing adoption rate, why a need for a better development environment arose. To serve this need, a team of R fans began to develop an integrated development environment (IDE) to make it easier to work on bigger projects and to collaborate with others. This IDE has the name, RStudio.
In this article, we will see how to work with RStudio and projects
(For more resources related to this topic, see here.)
Working with RStudio and projects
In the times before RStudio, it was very hard to manage bigger projects with R in the R console, as you had to create all the folder structures on your own.
When you work with projects or open a project, RStudio will instantly take several actions. For example, it will start a new and clean R session, it will source the .Rprofile file in the project's main directory, and it will set the current working directory to the project directory. So, you have a complete working environment individually for every project. RStudio will even adjust its own settings, such as active tabs, splitter positions, and so on, to where they were when the project was closed.
But just because you can create projects with RStudio easily, it does not mean that you should create a project for every single time that you write R code. For example, if you just want to do a small analysis, we would recommend that you create a project where you save all your smaller scripts.
Creating a project with RStudio
RStudio offers you an easy way to create projects. Just navigate to File | New Project and you will see a popup window as with the options shown in the following screenshot:
These options let you decide from where you want to create your project. So, if you want to start it from scratch and create a new directory, associate your new project to an existing one, or if you want to create a project from a version control repository, you can avail of the respective options. For now, we will focus on creating a new directory.
The following screenshot shows you the next options available:
Locating your project
A very important question you have to ask yourself when creating a new project is where you want to save it? There are several options and details you have to pay attention to especially when it comes to collaboration and different people working on the same project.
You can save your project locally, on a cloud storage or with the help of a revision control system such as Git.
Creating your first project
To begin your first project, choose the New Directory option we described before and create an empty project. Then, choose a name for the directory and the location that you want to save it in. You should create a projects folder on your Dropbox.
The first project will be a small data analysis based on a dataset that was extracted from the 1974 issue of the Motor Trend US magazine. It comprises fuel consumption and ten aspects of automobile design and performance, such as the weight or number of cylinders for 32 automobiles, and is included in the base R package. So, we do not have to install a separate package to work with this dataset, as it is automatically loaded when you start R:
As you can see, we left the Use packrat with this project option unchecked. Packrat is a dependency management tool that makes your R code more isolated, portable, and reproducible by giving your project its own privately managed package library. This is especially important when you want to create projects in an organizational context where the code has to run on various computer systems, and has to be usable for a lot of different users. This first project will just run locally and will not focus on a specific combination of package versions.
Organizing your folders
RStudio creates an empty directory for you that includes just the file, Motor-Car-Trend-Analysis.Rproj. This file will store all the information on your project that RStudio will need for loading. But to stay organized, we have to create some folders in the directory. Create the following folders:
- data: This includes all the data that we need for our analysis
- code: This includes all the code files for cleaning up data, generating plots, and so on
- plots: This includes all graphical outputs
- reports: This comprises all the reports that we create from our dataset
Saving the data
The Motor Trend Car Road Tests dataset is part of the dataset package, which is one of the preinstalled packages in R. But, we will save the data in a CSV file in our data folder, after extracting the data from the mtcars variable, to make sure our analysis is reproducible.
Put the following line of code in a new R script and save it as data.R in the code folder:
#write data into csv file write.csv(mtcars, file = "data/cars.csv", row.names=FALSE)
Analyzing the data
The analysis script will first have to load the data from the CSV file with the following line:
cars_data <- read.csv(file = "data/cars.csv", header = TRUE, sep = ",")
To learn more about RStudio, the following books published by Packt Publishing (https://www.packtpub.com/) are recommended:
- Mastering Machine Learning with R (https://www.packtpub.com/big-data-and-business-intelligence/mastering-machine-learning-r)
- R Data Analysis Cookbook (https://www.packtpub.com/big-data-and-business-intelligence/r-data-analysis-cookbook)
- Mastering Data Analysis with R (https://www.packtpub.com/big-data-and-business-intelligence/mastering-data-analysis-r)