Extract and manipulate HTML data using jsoup with Packt’s new Instant eBook

July 2013 | Open Source

Packt is pleased to announce the publication of Instant jsoup How-to, a concise guide packed with practical, step-by-step instructions and clear explanations that help readers manipulate HTML quickly and effectively. The book is now available in all the popular formats, including Amazon Kindle, ePub, and PDF. The eBook comes in at over 38 pages and is competitively priced at $11.04.

About the author:

Pete Houston

Pete Houston is a software engineer from South Korea with 10 years of experience in software design and development. He has undertaken research on medical imaging that helps in diagnosing symptoms of cancer in patients. He has worked with C, C++, COM/DLL, ActiveX Control, and C# .NET 3.0. He also designed and architected the Android mobile platform.

jsoup is a Java library for working with real-world HTML. It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jQuery-like methods. It implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do.

Instant jsoup How-to gives the reader detailed instructions on using the jsoup library to manipulate HTML content for different needs. This concise guide teaches the basics of data crawling as well as the core concepts of jsoup, so that readers can make the best use of the library to achieve their goals.

The book covers the following areas:

• Parse HTML from a URL, a file, or a string

• Find data using DOM or CSS selectors

• Manipulate the HTML elements, attributes, and text

• Sanitize data to prevent XSS attacks

• Understand various methods to configure your library for better results
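As a rough illustration of the areas listed above, the following Java sketch parses an HTML string, finds data with both DOM methods and a CSS selector, changes an element, and sanitizes untrusted markup. The class name JsoupSketch, its method names, the sample HTML, and the example.com base URL are assumptions made for this illustration and are not taken from the book. The sketch also assumes jsoup 1.14.1 or later on the classpath, where the sanitizing allow-list class is named Safelist (earlier releases call it Whitelist).

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.safety.Safelist;

// Hypothetical demo class; every name here is illustrative, not from the book.
public class JsoupSketch {

    // Parse an HTML string and collect link texts with a CSS selector.
    // Jsoup.connect(url).get() and Jsoup.parse(file, charsetName) cover
    // the URL and file cases in the same way.
    public static List<String> linkTexts(String html) {
        Document doc = Jsoup.parse(html);
        List<String> texts = new ArrayList<>();
        for (Element link : doc.select("div#news a")) {
            texts.add(link.text());
        }
        return texts;
    }

    // Reach the same data through DOM-style methods and resolve a
    // relative href against the base URI supplied at parse time.
    public static String firstLinkAbsUrl(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        Element news = doc.getElementById("news"); // null if the id is absent
        Element link = news.getElementsByTag("a").first();
        return link.absUrl("href");
    }

    // Change the text and an attribute of the first link in the document.
    public static String retitleFirstLink(String html, String newText) {
        Document doc = Jsoup.parse(html);
        Element link = doc.selectFirst("a");
        link.text(newText);
        link.attr("rel", "nofollow");
        return link.outerHtml();
    }

    // Strip anything outside a basic allow-list, such as script tags.
    public static String sanitize(String untrustedHtml) {
        return Jsoup.clean(untrustedHtml, Safelist.basic());
    }

    public static void main(String[] args) {
        String html = "<div id=\"news\"><a href=\"/a\">First</a>"
                + "<a href=\"/b\">Second</a></div>";
        System.out.println(linkTexts(html));
        System.out.println(firstLinkAbsUrl(html, "http://example.com/"));
        System.out.println(retitleFirstLink(html, "Updated"));
        System.out.println(sanitize("<p>Hi<script>alert(1)</script></p>"));
    }
}
```

The sample input is kept inline so the sketch runs without network access; real pages would normally be fetched or read from disk before the same selector and DOM calls are applied.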

Instant jsoup How-to is full of illustrations and tips, with clear step-by-step instructions and practical examples. To find out more about the book and to check eBook purchasing options, please visit the Packt book page.


Instant jsoup How-to

Effectively extract and manipulate HTML content with the jsoup library

