9.4. Example: BBC News

In Chapter 7, we considered the task of extracting the headline link URLs from the BBC News main page, and we implemented it in terms of HTML::TokeParser. Here, we'll consider the same problem from the perspective of HTML::TreeBuilder.

To review the problem: when you look at the source of http://news.bbc.co.uk, you discover that each headline link is wrapped in one of two kinds of code. There are a lot of headlines expressed with code like this:


  
