Safari Books Online is a digital library providing on-demand subscription access to thousands of learning resources.
In Chapter 7, we considered the task of extracting the headline link URLs from the BBC News main page, and we implemented it in terms of HTML::TokeParser. Here, we'll consider the same problem from the perspective of HTML::TreeBuilder.
To review the problem: when you look at the source of http://news.bbc.co.uk, you discover that each headline link is wrapped in one of two kinds of code. There are a lot of headlines expressed with code like this: