Getting Started with Beautiful Soup

Learn how to extract information from websites using Beautiful Soup and the Python urllib2 module. This practical, hands-on guide covers everything you need to know to get a head start in website scraping.

Vineeth G. Nair

Book Details

ISBN 13: 9781783289554
Paperback: 130 pages

Book Description

Beautiful Soup is a Python library designed for quick-turnaround projects such as screen scraping. It provides a few simple methods and Pythonic idioms for navigating, searching, and modifying a parse tree, giving you a toolkit for dissecting a document and extracting what you need without writing excess code.
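
As a rough illustration of the three operations named above (navigating, searching, and modifying a parse tree), here is a minimal sketch. It assumes Python 3 with the bs4 package installed and uses the standard-library "html.parser" backend. The sample markup and variable names are invented for this example and are not taken from the book.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, invented for this illustration.
html = """
<html><body>
  <h1 id="title">Book list</h1>
  <ul>
    <li class="book">First title</li>
    <li class="book">Second title</li>
  </ul>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Navigating: attribute access walks to the first matching tag.
heading = str(soup.h1.string)

# Searching: find_all returns every tag that matches the filters.
titles = [li.get_text() for li in soup.find_all("li", class_="book")]

# Modifying: assigning to .string replaces the tag's text in the tree.
soup.h1.string = "Python books"
new_heading = soup.find(id="title").get_text()

print(heading, titles, new_heading)
```

Each step is a single call on the parsed tree, which is the sense in which little code is needed.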

Getting Started with Beautiful Soup is a practical guide to Beautiful Soup using Python. The book starts with installation and then walks you through each feature of Beautiful Soup using simple examples, with sample Python code as well as diagrams and screenshots where they aid understanding. It also addresses the problem of how to get data out of a website, working through a solution with a real website and sample code.

Getting Started with Beautiful Soup goes over the different methods of installing Beautiful Soup on both Linux and Windows systems. You will then learn about searching, navigating, content modification, encoding support, and output formatting, each with sample Python code that you can try out to get a better understanding. This book is a practical guide for scraping information from any website. If you want to learn how to efficiently scrape pages from websites, then this book is for you.
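
To give a feel for the encoding support and output formatting mentioned above, the following sketch parses Latin-1 bytes and renders the tree in several ways. It assumes Python 3 with the bs4 package installed. The input string is made up for illustration, and the book's own examples may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical input: bytes in Latin-1, as a server might send them.
raw = "<p>Café <b>menu</b></p>".encode("latin-1")

# from_encoding tells Beautiful Soup how to decode the incoming bytes.
soup = BeautifulSoup(raw, "html.parser", from_encoding="latin-1")

# get_text() returns only the text content, as a Unicode string.
text = soup.get_text()

# prettify() returns an indented, one-tag-per-line rendering.
pretty = soup.prettify()

# encode() serializes the tree to bytes in the requested output encoding.
utf8_bytes = soup.encode("utf-8")

# The "html" formatter writes named HTML entities where they exist.
with_entities = soup.decode(formatter="html")

print(text)
print(pretty)
print(utf8_bytes)
print(with_entities)
```

Internally the tree holds Unicode, so the input encoding and the output encoding can be chosen independently.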

Table of Contents

Chapter 1: Installing Beautiful Soup
Installing Beautiful Soup
Using Beautiful Soup without installation
Verifying the installation
Quick reference
Summary
Chapter 2: Creating a BeautifulSoup Object
Creating a BeautifulSoup object
Tag
The NavigableString object
Quick reference
Summary
Chapter 3: Search Using Beautiful Soup
Searching in Beautiful Soup
Using search methods to scrape information from a web page
Quick reference
Summary
Chapter 4: Navigation Using Beautiful Soup
Navigation using Beautiful Soup
Quick reference
Summary
Chapter 5: Modifying Content Using Beautiful Soup
Modifying Tag using Beautiful Soup
Modifying string contents
Deleting tags from the HTML document
Special functions to modify content
Quick reference
Summary
Chapter 6: Encoding Support in Beautiful Soup
Encoding in Beautiful Soup
Output encoding
Quick reference
Summary
Chapter 7: Output in Beautiful Soup
Formatted printing
Unformatted printing
Output formatters in Beautiful Soup
Using get_text()
Quick reference
Summary
Chapter 8: Creating a Web Scraper
Getting book details from PacktPub.com
Getting selling prices from Amazon
Getting the selling price from Barnes and Noble
Summary

What You Will Learn

  • Learn how to scrape HTML pages from websites
  • Implement a simple method to scrape any website with the help of developer tools, the Python urllib2 module, and Beautiful Soup
  • Learn how to search for information within an HTML/XML page
  • Modify the contents of an HTML tree
  • Understand encoding support in Beautiful Soup
  • Learn about the different types of output formatting
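
The fetch-then-parse workflow in the list above might look roughly like the sketch below. One assumption to note: urllib2 exists only in Python 2, so this sketch uses its Python 3 successor, urllib.request. The helper names fetch and extract_links and the sample markup are hypothetical, chosen for this illustration only.

```python
from urllib.request import urlopen  # Python 3 successor to urllib2.urlopen

from bs4 import BeautifulSoup


def fetch(url, timeout=10):
    """Download a page and return its raw bytes (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def extract_links(markup):
    """Return (text, href) pairs for every anchor that has an href."""
    soup = BeautifulSoup(markup, "html.parser")
    return [
        (a.get_text(strip=True), a["href"])
        for a in soup.find_all("a", href=True)
    ]


# Offline demonstration on invented markup; for a live page, pass
# fetch(url) to extract_links instead.
sample = b'<p><a href="/one">One</a> <a name="x">skip</a> <a href="/two"> Two </a></p>'
links = extract_links(sample)
print(links)
```

Keeping the download and the parsing in separate functions lets the parsing logic be tried on saved HTML before pointing it at a real site.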
