You're reading from  Practical Predictive Analytics

Product type: Book
Published in: Jun 2017
Reading Level: Intermediate
Publisher: Packt
ISBN-13: 9781785886188
Edition: 1st Edition
Author (1)
Ralph Winters

Ralph Winters started his career as a database researcher for a music performing rights organization (he composed as well!), and then branched out into healthcare survey research, finally landing in the analytics and information technology world. He has provided his statistical and analytics expertise to many large Fortune 500 companies in the financial, direct marketing, insurance, healthcare, and pharmaceutical industries. He has worked on many diverse types of predictive analytics projects involving customer retention, anti-money laundering, voice of the customer text mining analytics, and healthcare risk and customer choice models. He is currently a data architect for a healthcare services company, working in the data and advanced analytics group. He enjoys working collaboratively with a smart team of business analysts, technologists, and actuaries, as well as with other data scientists. Ralph considers himself a practical person. In addition to authoring Practical Predictive Analytics for Packt Publishing, he has also contributed two tutorials illustrating the use of predictive analytics in medicine and healthcare to Practical Predictive Analytics and Decisioning Systems for Medicine: Miner et al., Elsevier September, 2014, and presented Practical Text Mining with SQL using Relational Databases at the 2013 11th Annual Text and Social Analytics Summit in Cambridge, MA. Ralph resides in New Jersey with his loving wife Katherine, amazing daughters Claire and Anna, and his four-legged friends, Bubba and Phoebe, who can be unpredictable. Ralph's website can be found at ralphwinters.com

R


Most of the code examples in this book are written in R. As a prerequisite to this book, it is presumed that you will have some basic R knowledge, as well as some exposure to statistics. If you already know about R, you may skip this section, but I wanted to discuss it a little bit for completeness.

The R language is derived from the S language, which was developed in the 1970s. R has since grown beyond its original core packages to become an extremely viable environment for predictive analytics.

Although R was developed by statisticians for statisticians, it has come a long way since its early days. The strength of R comes from its package system, which allows specialized or enhanced functionality to be developed and linked to the core system.

Although the original R system was sufficient for statistics and data mining, an important goal of R was to have its system enhanced via user-written contributed packages. At the time of writing, more than 10,000 contributed packages are available for R. Some are of excellent quality, and some are of dubious quality. Therefore, the goal is to find the truly useful packages that add the most value.

Between them, the R packages in common use address most of the predictive analytics tasks that you will encounter. If you come across a task that does not fit into any category, the chances are good that someone in the R community has done something similar. And of course, there is always a chance that someone is developing a package to do exactly what you want it to do. That person could eventually be you!
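To make the package system a little more concrete, here is a minimal sketch of how packages are typically inspected and loaded. The helper name use_package and the choice of CRAN mirror are assumptions made for this illustration; they are not taken from the book.

```r
# List the packages that ship with R itself: installed.packages() marks
# them with a Priority of "base" or "recommended".
pkgs <- installed.packages()
core <- pkgs[!is.na(pkgs[, "Priority"]), "Package"]
cat("Bundled packages:", length(core), "\n")

# Hypothetical helper: load a package, installing it from CRAN first
# if it is not already available. The mirror URL is an assumption.
use_package <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

# 'stats' is bundled with R, so nothing is downloaded here.
use_package("stats")
```

Calling use_package() with the name of a contributed package would download it on first use, so it is worth checking the quality of a package before adding it to a project.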

CRAN

The Comprehensive R Archive Network (CRAN) is a go-to site which aggregates R distributions, binaries, documentation, and packages. To get a sense of the kind of packages that could be valuable to you, check out the Task Views section maintained by CRAN here:

https://cran.r-project.org/web/views/
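Task views can also be browsed from within R. The following is a rough sketch that assumes the contributed ctv package and network access; the function name list_task_views and the mirror URL are placeholders chosen for this example.

```r
# Return the names of the task views currently published on CRAN.
# Assumes network access and the contributed 'ctv' package, which is
# installed on first use if it is missing.
list_task_views <- function(repos = "https://cloud.r-project.org") {
  if (!requireNamespace("ctv", quietly = TRUE)) {
    install.packages("ctv", repos = repos)
  }
  views <- ctv::available.views(repos = repos)
  vapply(views, function(v) v$name, character(1))
}

# Example usage (commented out because it downloads from CRAN):
# list_task_views()
# ctv::install.views("MachineLearning", coreOnly = TRUE)
```

Installing a whole task view can pull in a large number of packages, which is why the coreOnly argument is used in the commented example.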

R installation

R installation is typically done by downloading the software directly from the Comprehensive R Archive Network (CRAN) site:

  1. Navigate to https://cran.r-project.org/.
  2. Install the version of R appropriate for your operating system. Please read any notes regarding downloading specific versions. For example, if you are a Mac user, you may need to have XQuartz installed in addition to R, so that some graphics can render correctly.
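Once R is installed, a quick check from the R console can confirm which version is running and which graphics capabilities are available. This is only a sketch of one possible check; the particular capabilities inspected here are an arbitrary selection.

```r
# Report the installed version and the platform it was built for.
cat(R.version.string, "\n")
cat("Platform:", R.version$platform, "\n")

# capabilities() returns a named logical vector. On a Mac, X11 support
# generally depends on XQuartz being installed.
caps <- capabilities()
print(caps[c("X11", "png", "jpeg")])

# Show where contributed packages will be installed.
print(.libPaths())
```

If X11 is reported as FALSE on a Mac, installing XQuartz and restarting R is a reasonable first thing to try.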

Alternate ways of exploring R

Although most people will install R directly from the CRAN site, there are some alternative ways to run R that are worth mentioning. These methods are often useful when you are not always at your own computer:

  • Virtual environment: Here are a few ways to install R in a virtual environment:
    • VirtualBox or VMware: Virtual machines are good for setting up protected environments and for loading preinstalled operating systems and packages. They are useful for isolating testing areas, and when you do not wish to take up additional space on your own machine.
    • Docker: Docker resembles a virtual machine, but is more lightweight: rather than emulating an entire operating system, a container shares the host's kernel and runs only the processes it needs.
  • Cloud-based: Here are a few ways to run R in a cloud-based environment. Cloud-based environments are well suited to situations in which you are not working directly on your own computer:
    • AWS/Azure/Databricks: These are three environments which are very popular. The reasons for using cloud-based environments are similar to those given for virtual environments, with some additional advantages, such as the capability to work with very large datasets and with more memory. All three require a subscription to use; however, free tiers are offered to get started. We will explore Databricks in depth in later chapters, when we learn about predictive analytics using R and SparkR.
  • Web-based: Web-based platforms are good for learning R and for trying out quick programs and analyses. R-Fiddle is a good choice; however, there are others, including R-Web, Jupyter, Tutorialspoint, and Anaconda Cloud.
  • Command line: R can be run purely from a command line. When R is run this way, it is usually coupled with other Linux tools such as curl, grep, and awk, and with customized text editors, such as Emacs with Emacs Speaks Statistics (ESS). R is often run this way in production mode, when processes need to be automated and scheduled directly via the operating system.
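As an illustration of the command-line approach, the following sketch shows a small script meant to be run with Rscript, either on a file or at the end of a shell pipeline. The script name mean_column.R, the function column_mean, and the file and column names in the usage comments are all hypothetical.

```r
#!/usr/bin/env Rscript
# mean_column.R: hypothetical script name, used only for illustration.
# Example shell usage (file and column names are placeholders):
#   Rscript mean_column.R sales.csv amount
#   grep -v '^#' sales.csv | Rscript mean_column.R - amount

# Read CSV data from a path or connection and return the mean of one column.
column_mean <- function(con, column) {
  data <- read.csv(con, stringsAsFactors = FALSE)
  if (!column %in% names(data)) {
    stop("Column not found: ", column)
  }
  mean(data[[column]], na.rm = TRUE)
}

# Run only when exactly two arguments are supplied on the command line;
# a first argument of "-" means "read from standard input".
args <- commandArgs(trailingOnly = TRUE)
if (length(args) == 2) {
  input <- if (args[1] == "-") file("stdin") else args[1]
  cat(column_mean(input, args[2]), "\n")
}
```

Because the script takes its input from arguments or standard input and writes a single value to standard output, it can be chained with tools such as curl, grep, and awk, or called from an operating system scheduler.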