Beautiful Soup

Beautiful Soup is a Python parser for HTML and XML documents. It is designed to work with poorly written web pages. It is used in this book to create datasets from web sites that do not have APIs, and to find all the text on pages for indexing. The home page for this library is http://www.crummy.com/software/BeautifulSoup.

Installation on All Platforms

Beautiful Soup is available as a single-file source download. Near the bottom of the home page, there is a link to download BeautifulSoup.py. Download this file and put it in either your working directory or your Python/Lib directory.
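
One way to confirm that the file landed somewhere Python can find it is to ask the import system whether the module is visible. The following is a minimal sketch, not part of the library: the helper name `is_importable` is hypothetical, and the module name `BeautifulSoup` assumes the single-file download described above (newer releases are packaged under a different name).

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate a top-level module with this name."""
    # find_spec returns None when no such top-level module is on sys.path.
    return importlib.util.find_spec(module_name) is not None


# Assumption: the downloaded file is named BeautifulSoup.py, so the
# module name to look for is "BeautifulSoup".
if is_importable("BeautifulSoup"):
    print("BeautifulSoup.py is on the import path")
else:
    print("BeautifulSoup.py was not found; check the directory it was copied to")
```

If the check fails, the file is most likely in a directory that is not on `sys.path`.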

Simple Usage Example

This example parses the HTML of the Google home page, and shows how to extract elements from the DOM and search for links.
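
The kind of usage described here can be sketched as follows. This is an illustration under stated assumptions, not the book's own listing: it assumes the newer `bs4` package rather than the single-file BeautifulSoup.py, and it parses a small inline HTML string (the hypothetical `SAMPLE_HTML`) instead of fetching the Google home page, so that it runs without network access.

```python
# Assumption: the bs4 package is installed; the single-file release
# described above uses a different import.
from bs4 import BeautifulSoup

# Stand-in for a downloaded page. The <p> tag is deliberately left
# unclosed to mimic a poorly written page.
SAMPLE_HTML = """
<html><head><title>Sample Page</title></head>
<body>
<p>Some text
<a href="http://example.com/one">One</a>
<a href="http://example.com/two">Two</a>
<a name="anchor-without-href">No link</a>
</body></html>
"""

# Parse with Python's built-in HTML parser.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Extract a single element from the parsed tree.
title = soup.title.string

# Search for every <a> tag that carries an href attribute.
links = [a["href"] for a in soup.find_all("a", href=True)]

# Read the text of the first link on the page.
first_link_text = soup.find("a").get_text()

print(title)
print(links)
print(first_link_text)
```

To work on a live page, the inline string would be replaced by the HTML returned from an HTTP request.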


  
