The Applied Data Science Workshop - Second Edition

Product type Book
Published in Jul 2020
Publisher Packt
ISBN-13 9781800202504
Pages 352 pages
Edition 2nd Edition
Author (1):
Alex Galea

Summary

In this chapter, we worked through the process of pulling tables from Wikipedia using web scraping techniques, cleaning up the resulting data with pandas, and producing a final analysis.

We started by looking at how HTTP requests work, focusing on GET requests and their response status codes. We then moved into the Jupyter Notebook and made HTTP requests from Python using the requests library. We saw how Jupyter can render HTML in the notebook, including live web pages that can be interacted with. To learn about web scraping, we used BeautifulSoup to parse text from HTML, and then applied it to scrape tabular data from Wikipedia.
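
As a reminder of how these pieces fit together, here is a minimal sketch rather than the chapter's actual code. It assumes the page holds a plain HTML table; the function names fetch_html and parse_table, the 10-second timeout, and the sample rows are all made up for illustration. The sample HTML is inlined so that the parsing step runs without a network connection.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url: str) -> str:
    """Send a GET request and return the page HTML.

    Hypothetical helper: raises requests.HTTPError on a 4xx/5xx status code.
    """
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def parse_table(html: str) -> list[list[str]]:
    """Return the first <table> in the HTML as a list of rows of cell text."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    rows = []
    for tr in table.find_all("tr"):
        # Header cells (<th>) and data cells (<td>) are treated alike here.
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)
    return rows


# Made-up stand-in for a scraped page; the values are placeholders, not real data.
SAMPLE_HTML = """
<table>
  <tr><th>Country</th><th>Rate (%)</th></tr>
  <tr><td>Country A</td><td>1.50</td></tr>
  <tr><td>Country B</td><td>0.25</td></tr>
</table>
"""

rows = parse_table(SAMPLE_HTML)
for row in rows:
    print(row)
```

Against a live page, the string passed to parse_table would come from fetch_html(url) instead of SAMPLE_HTML. Real Wikipedia tables often contain footnote markers and nested tags, so the extracted cell text usually needs further cleaning.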

After pulling two tables of data, we processed them for analysis with pandas. The first table contained the central bank interest rates for each country, while the second table contained the populations. We combined these into a single table that was then used for the final analysis, which involved...
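
The combining step can be sketched with pandas as follows. This is an illustration under stated assumptions, not the chapter's code: the column names, the string formats being cleaned, the inner join on a shared country column, and every value in the two tables are invented for the example.

```python
import pandas as pd

# Made-up stand-ins for the two scraped tables; all values are placeholders.
rates = pd.DataFrame({
    "country": ["Country A", "Country B", "Country C"],
    "interest_rate": ["1.50%", "0.25%", "4.00%"],
})
populations = pd.DataFrame({
    "country": ["Country A", "Country B", "Country D"],
    "population": ["1,000,000", "25,500,000", "300,000"],
})

# Scraped values arrive as strings, so convert them to numeric types first.
rates["interest_rate"] = rates["interest_rate"].str.rstrip("%").astype(float)
populations["population"] = (
    populations["population"].str.replace(",", "", regex=False).astype(int)
)

# An inner join keeps only the countries present in both tables (an assumption;
# a left or outer join may be more appropriate depending on the analysis).
combined = pd.merge(rates, populations, on="country", how="inner")
print(combined)
```

With an inner join, countries that appear in only one table (Country C and Country D here) are dropped, so it is worth checking how many rows are lost before moving on to the analysis.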
