Search icon
Arrow left icon
All Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletters
Free Learning
Arrow right icon
Hands-On Web Scraping with Python - Second Edition

You're reading from  Hands-On Web Scraping with Python - Second Edition

Product type Book
Published in Oct 2023
Publisher Packt
ISBN-13 9781837636211
Pages 324 pages
Edition 2nd Edition
Languages
Author (1):
Anish Chapagain Anish Chapagain
Profile icon Anish Chapagain

Table of Contents (20) Chapters

Preface 1. Part 1:Python and Web Scraping
2. Chapter 1: Web Scraping Fundamentals 3. Chapter 2: Python Programming for Data and Web 4. Part 2:Beginning Web Scraping
5. Chapter 3: Searching and Processing Web Documents 6. Chapter 4: Scraping Using PyQuery, a jQuery-Like Library for Python 7. Chapter 5: Scraping the Web with Scrapy and Beautiful Soup 8. Part 3:Advanced Scraping Concepts
9. Chapter 6: Working with the Secure Web 10. Chapter 7: Data Extraction Using Web APIs 11. Chapter 8: Using Selenium to Scrape the Web 12. Chapter 9: Using Regular Expressions and PDFs 13. Part 4:Advanced Data-Related Concepts
14. Chapter 10: Data Mining, Analysis, and Visualization 15. Chapter 11: Machine Learning and Web Scraping 16. Part 5:Conclusion
17. Chapter 12: After Scraping – Next Steps and Data Analysis 18. Index 19. Other Books You May Enjoy

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “Here, we have expressed our interest in only finding the information from the content attribute with the <meta> tag, which has the name attribute with the keywords and description values, respectively.”

A block of code is set as follows:

source.find(‘a:contains(“Web”)’) # [<a.menuitm>, <a>, <a>, <a>, <a>]
source.find(‘a:contains(“Web”):last’).text()
# ‘Web Scraper’
source.find(‘a:contains(“Web”):last’).attr(‘href’) # ‘#’

Any command-line input or output is written as follows:

(secondEd) C:\HOWScraping2E> pip install jupyterlab

Outputs and comments inside blocks of code look as follows:

# [<a.menuitm>, <a>, <a>, <a>, <a>]
# ‘Web Scraper’
# ‘#’

Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: “In addition, we will create separate files for author and quotes.”

Tips or important notes

Appear like this.

lock icon The rest of the chapter is locked
Next Chapter arrow right
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime}