
You're reading from  R for Data Science Cookbook (n)

Product type: Book
Published in: Jul 2016
Reading level: Intermediate
ISBN-13: 9781784390815
Edition: 1st Edition
Author (1)
Yu-Wei, Chiu (David Chiu)

Yu-Wei, Chiu (David Chiu) is the founder of LargitData (www.LargitData.com), a startup company that mainly focuses on providing big data and machine learning products. He has previously worked for Trend Micro as a software engineer, where he was responsible for building big data platforms for business intelligence and customer relationship management systems. In addition to being a start-up entrepreneur and data scientist, he specializes in using Spark and Hadoop to process big data and apply data mining techniques for data analysis. Yu-Wei is also a professional lecturer and has delivered lectures on big data and machine learning in R and Python, and given tech talks at a variety of conferences. In 2015, Yu-Wei wrote Machine Learning with R Cookbook, Packt Publishing. In 2013, Yu-Wei reviewed Bioinformatics with R Cookbook, Packt Publishing. For more information, please visit his personal website at www.ywchiu.com.

Acknowledgement

I have immense gratitude for my family and friends for supporting and encouraging me to complete this book. I would like to sincerely thank my mother, Ming-Yang Huang (Miranda Huang); my mentor, Man-Kwan Shan; the proofreader of this book, Brendan Fisher; Members of LargitData; Data Science Program (DSP); and other friends who have offered their support.

Scraping web data


Most of the data you need will not exist in your own database; it is published in various forms on the Internet. To extract valuable information from these sources, we need to know how to access and scrape data from the Web. Here, we illustrate how to use the rvest package to harvest financial data from http://www.bloomberg.com/.

Getting ready

For this recipe, you need a computer with R installed and access to the Internet.

How to do it…

Perform the following steps to scrape data from http://www.bloomberg.com/:

  1. First, open the following link to view the S&P 500 index on the Bloomberg Business website: http://www.bloomberg.com/quote/SPX:IND

    Figure 9: S&P 500 index

  2. Once the page appears, as shown in the preceding screenshot, we can begin installing and loading the rvest package:

    >  install.packages("rvest")
    >  library(rvest)
    
  3. Next, you can use the HTML function from the rvest package to scrape and...
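To make the parsing step more concrete, here is a minimal sketch of how rvest reads HTML and pulls values out of it. It is not the recipe's own code: the markup, the class names name and price, and the variable names are all hypothetical, and the example parses an inline string so that it runs without network access. It assumes rvest 1.0.0 or later, where read_html() is the reader function and html_element() selects a single node by CSS selector.

```r
# Minimal sketch (assumptions: rvest >= 1.0.0; the HTML below is invented
# for illustration and is not the real Bloomberg page structure).
library(rvest)

# Hypothetical markup standing in for a quote page
page_source <- '
<html><body>
  <div class="quote">
    <h1 class="name">S&amp;P 500 Index</h1>
    <span class="price">2,000.50</span>
  </div>
</body></html>'

# read_html() accepts literal HTML, a file path, or a URL
page <- read_html(page_source)

# Select one node per CSS selector and extract its text
index_name <- html_text(html_element(page, "h1.name"), trim = TRUE)
price_text <- html_text(html_element(page, "span.price"), trim = TRUE)

# Remove thousands separators before converting to a number
price <- as.numeric(gsub(",", "", price_text))

print(index_name)
print(price)
```

To work against a live page, you would pass the URL to read_html() instead of the string, and replace the selectors with ones that match the page's actual markup, which you can find with your browser's element inspector.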
