You're reading from Python Web Scraping - Second Edition

Product type: Book
Published in: May 2017
Reading level: Intermediate
ISBN-13: 9781786462589
Edition: 2nd Edition
Author: Katharine Jarmul

Katharine Jarmul is a data scientist and Pythonista based in Berlin, Germany. She runs a data science consulting company, Kjamistan, that provides services such as data extraction, acquisition, and modelling for small and large companies. She has been writing Python since 2008 and scraping the web with Python since 2010, and has worked at both small and large start-ups that use web scraping for data analysis and machine learning. When she's not scraping the web, you can follow her thoughts and activities via Twitter (@kjam).

Reverse engineering a dynamic web page

So far, we have tried to scrape data from the web page the same way as introduced in Chapter 2, Scraping the Data. This method did not work because the data is loaded dynamically using JavaScript. To scrape this data, we need to understand how the web page loads it, a process that can be described as reverse engineering. Continuing the example from the preceding section, if we click on the Network tab in our browser tools and then perform a search, we will see all of the requests made for the page. There are a lot! If we scroll up through the requests, we see mainly photos (from loading country flags), and then we notice one with an interesting name: search.json, with a path of /ajax.
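Once a request like this shows up in the Network tab, a natural next step is to call the same endpoint directly from Python instead of parsing the rendered HTML. The following is a minimal sketch of that idea, not the book's own code. The /ajax/search.json path comes from the request we just noticed; the base URL, the query parameter names (search_term, page, page_size), and the "records" key in the response are assumptions made for illustration, so copy the real values from the request details in your browser.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen


def build_search_url(base_url, term, page=0, page_size=10):
    """Build the URL for the AJAX search endpoint seen in the Network tab."""
    # ASSUMPTION: these parameter names are hypothetical; use the ones
    # shown in the request details of your own browser tools.
    query = urlencode({"search_term": term, "page": page, "page_size": page_size})
    return urljoin(base_url, "/ajax/search.json") + "?" + query


def fetch_json_text(url, timeout=10):
    """Download the raw response body of the AJAX request as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def extract_records(payload_text, key="records"):
    """Parse a JSON response body and return the list stored under `key`."""
    # ASSUMPTION: the response is a JSON object whose results live under
    # a "records" key; an absent key yields an empty list.
    data = json.loads(payload_text)
    return data.get(key, [])
```

With a real site, we would pass the output of build_search_url to fetch_json_text and then hand the result to extract_records. Keeping the download separate from the parsing makes it possible to test the parsing step against a response body saved from the browser, without hitting the server again.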

If we click on that URL using Chrome, we can see more details (there is similar functionality for this in all major browsers, so your view may vary; however the main features should function...
