You're reading from  Learning R Programming

Product type: Book
Published in: Oct 2016
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781785889776
Edition: 1st Edition
Author: Kun Ren

Kun Ren has used R for nearly 4 years in quantitative trading, along with C++ and C#, and he has worked very intensively (more than 8-10 hours every day) on useful R packages that the community does not offer yet. He contributes to packages developed by other authors and reports issues to make things work better. He is also a frequent speaker at R conferences in China and has given multiple talks. Kun also has a great social media presence. Additionally, he has substantially contributed to various projects, which is evident from his GitHub account: https://github.com/renkun-ken https://cn.linkedin.com/in/kun-ren-76027530 http://renkun.me/ http://renkun.me/formattable/ http://renkun.me/pipeR/ http://renkun.me/rlist/

Chapter 11. Working with Databases

In the previous chapter, you learned the basic concepts of object-oriented programming. These include classes and methods, and how they are connected by generic functions in R through method dispatch. You learned about the basic usage of S3, S4, RC, and R6, including defining classes and generic functions as well as implementing methods for certain classes.

Now that we have covered most of the important features of R, it is time we go ahead and discuss more practical topics. In this chapter, we will begin the discussion with how R can be used to work with databases, which is perhaps the first step of many data-analysis projects: extracting data from a database. More specifically, we will cover the following topics:

  • Understanding relational databases

  • Using SQL to query relational databases such as SQLite and MySQL

  • Working with NoSQL databases such as MongoDB and Redis

Working with relational databases


In the previous chapters, we used a family of built-in functions such as read.csv and read.table to import data from separator-delimited files, such as those in the CSV format. Using text formats to store data is handy and portable. When the data file is large, however, such a storage method may not be the best choice.

There are three main reasons why text formats become hard to use with large datasets. They are as follows:

  1. Functions such as read.csv() are mostly used to load the whole file into memory, that is, a data frame in R. If the data is too large to fit into the computer memory, we simply cannot do it.

  2. Even if the dataset is large, we usually don't have to load the whole dataset into memory when we work on a task. Instead, we often need to extract a subset of the dataset that meets a certain condition. The built-in data-importer functions simply do not support querying a csv file.

  3. The dataset is still updating, that is, we need to insert records into the dataset...
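The second and third reasons above can be sketched with a small SQLite example. This is only an illustration under stated assumptions: it assumes the DBI and RSQLite packages are installed, and the products table and its columns are hypothetical names invented for the sketch.

```r
# Assumes the DBI and RSQLite packages are installed.
library(DBI)

# An in-memory SQLite database; a file path would make it persistent.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical example table (not from the book).
products <- data.frame(
  id = 1:5,
  name = c("A", "B", "C", "D", "E"),
  price = c(10.5, 20.0, 7.25, 32.0, 15.0),
  stringsAsFactors = FALSE
)
dbWriteTable(con, "products", products)

# Reason 2: fetch only the rows that meet a condition,
# instead of loading the whole table into memory.
cheap <- dbGetQuery(
  con,
  "SELECT id, name, price FROM products WHERE price < 16 ORDER BY id"
)

# Reason 3: insert a new record without rewriting the dataset.
dbExecute(
  con,
  "INSERT INTO products (id, name, price) VALUES (?, ?, ?)",
  params = list(6L, "F", 12.0)
)

# Count the rows after the insert.
n_rows <- dbGetQuery(con, "SELECT COUNT(*) AS n FROM products")$n

dbDisconnect(con)
```

Here only the matching rows come back to R as a data frame, and the insert touches a single record; neither operation is something read.csv() can do on a text file.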

Working with NoSQL databases


In the previous section of this chapter, you learned the basics of relational databases and how to use SQL to query data. Relational data is mostly organized in a tabular form, that is, as a collection of tables with relations.
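To make "a collection of tables with relations" concrete, here is a minimal sketch in base R that uses two data frames as stand-ins for database tables. The customers and orders tables and their columns are hypothetical; merge() plays the role that a SQL join would play in a real relational database.

```r
# Two hypothetical tables related through customer_id.
customers <- data.frame(
  customer_id = c(1L, 2L, 3L),
  name = c("Ann", "Bob", "Cat"),
  stringsAsFactors = FALSE
)
orders <- data.frame(
  order_id = c(101L, 102L, 103L),
  customer_id = c(1L, 1L, 3L),
  amount = c(25, 40, 15)
)

# Follow the relation: attach customer details to each order.
# Only customers that have at least one order appear in the result.
order_details <- merge(orders, customers, by = "customer_id")

# Summarize the joined data: total order amount per customer name.
total_by_customer <- aggregate(amount ~ name, data = order_details, FUN = sum)
```

Each table stores one kind of entity, and the shared customer_id column is what relates them.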

However, when the volume of data exceeds the capacity of a server, problems occur because the traditional model of relational databases does not easily support horizontal scalability, that is, storing data in a cluster of servers instead of a single one. This adds a new layer of complexity to database management, as the data is stored in a distributed form while still being accessible as one logical database.

In recent years, NoSQL, or non-relational databases, have become much more popular than before due to the introduction of new database models and the remarkable performance they exhibit in big data analytics and real-time applications. Some non-relational databases are designed for high availability, scalability, and flexibility, and...
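As a rough illustration of the flexibility mentioned above, the following base R sketch models schema-free records as nested lists. It is only an in-memory analogy, not the API of MongoDB, Redis, or any R client package, and every field name in it is hypothetical.

```r
# Each "document" is a list; documents need not share the same fields.
documents <- list(
  list(name = "Ann", age = 30, tags = c("r", "sql")),
  list(name = "Bob", email = "bob@example.com"),
  list(name = "Cat", age = 25, address = list(city = "Paris"))
)

# Keep only the documents that actually have an age field.
has_age <- Filter(function(doc) !is.null(doc$age), documents)

# Extract one field from each remaining document.
names_with_age <- vapply(has_age, function(doc) doc$name, character(1))
```

A table would force every record into the same columns, whereas here each record carries only the fields it needs, including nested ones such as address.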

Summary


In this chapter, you learned how to access different types of databases from R. We introduced the basic usage of relational databases such as SQLite and non-relational databases such as MongoDB and Redis. With the understanding of major differences in their functionality and feature sets, we need to choose an appropriate database to work with in our projects according to our purpose and needs.

In many data-related projects, data storage and data importing are the initial steps, but data cleaning and data manipulation take most of the time. In the next chapter, we will move on to data-manipulation techniques. You will learn about a number of packages that are specially tailored for handy but powerful data manipulation. To work with these packages effectively, we'll need a better understanding of how they work, which requires the sound knowledge introduced in the previous chapters.
