A Corpus of News

To begin, you’ll need a set of news articles to work with. These should come from a variety of sources so that themes discussed in several places are easier to discern. Fortunately, most of the major news services and web sites provide RSS or Atom feeds, either for all their articles or for individual categories. You’ve used the Universal Feed Parser in previous chapters to parse RSS and Atom feeds for blogs, and you can use the same parser to download news. If you don’t already have the parser, you can download it from http://feedparser.org.
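As a rough illustration of the idea above, the following sketch gathers article titles and summaries from a list of feeds with the Universal Feed Parser. The function names `strip_html` and `collect_articles`, the optional `parse` argument, and the example URL are assumptions made for this sketch and are not taken from the book; only `feedparser.parse` and its `entries` list come from the library itself.

```python
import re


def strip_html(text):
    """Replace HTML tags with spaces and collapse runs of whitespace."""
    return ' '.join(re.sub(r'<[^>]+>', ' ', text).split())


def collect_articles(feed_urls, parse=None):
    """Return a list of (feed_url, title, summary_text) tuples.

    `parse` is a hypothetical hook so the function can be exercised
    without network access; by default it is feedparser.parse.
    """
    if parse is None:
        import feedparser  # third-party Universal Feed Parser
        parse = feedparser.parse

    articles = []
    for url in feed_urls:
        feed = parse(url)
        for entry in feed.entries:
            # Not every feed supplies both fields, so fall back to ''.
            title = entry.get('title', '')
            summary = entry.get('summary', '')
            articles.append((url, title, strip_html(summary)))
    return articles


if __name__ == '__main__':
    # Placeholder URL; substitute the feeds you actually want to read.
    for source, title, text in collect_articles(['http://example.com/rss.xml']):
        print(source, title)
```

Keeping the source URL alongside each article makes it possible later to see which themes appear across several sources and which are confined to one.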

Selecting Sources

There are thousands of sources of what can be considered “news,” from major news wires and newspapers to political blogs. Some ideas include:


  

You are currently reading a PREVIEW of this book.

                                                                                                                    
