As a starting point, consider the task of harvesting a month's worth of listings and corresponding RealAudio URLs from the web site of the National Public Radio program Fresh Air, at http://freshair.npr.org. Fresh Air airs on NPR stations each weekday, and each show features interviews with different guests. The show's web site lists which guests appear each day and links to the RealAudio files for each segment of each show. If your weekday schedule doesn't let you listen to Fresh Air every night or afternoon, you would find it useful to have a program tell you who had been on in the past month, so you could make a point of listening to the RealAudio files for the guests you find interesting. Such a data-extraction program could be scheduled with crontab to run on the first or second day of every month, to harvest the past month's program data.
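To make the extraction step concrete, here is a minimal sketch in Python of pulling link text and RealAudio URLs out of a page of HTML. It is only an illustration under stated assumptions, not the program this text goes on to develop: the sample markup, the example.org URLs, the file extensions, and the names `RealAudioLinkParser` and `extract_realaudio_links` are all hypothetical, and the real site's markup may be organized quite differently.

```python
from html.parser import HTMLParser

# Assumption: RealAudio links can be recognized by these file extensions.
REALAUDIO_EXTENSIONS = (".ram", ".rm", ".ra")


class RealAudioLinkParser(HTMLParser):
    """Collect (link text, URL) pairs for anchors pointing at RealAudio files."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (text, url) pairs
        self._href = None    # href of the RealAudio anchor being read, if any
        self._text = []      # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if href.lower().endswith(REALAUDIO_EXTENSIONS):
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a RealAudio anchor.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace and line breaks in the link text.
            text = " ".join("".join(self._text).split())
            self.links.append((text, self._href))
            self._href = None


def extract_realaudio_links(html):
    """Return a list of (link text, URL) pairs found in the given HTML."""
    parser = RealAudioLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample markup, standing in for a downloaded listings page.
SAMPLE_HTML = """
<p>Monday: <a href="http://example.org/audio/segment1.ram">Interview with
   Guest One</a></p>
<p><a href="/about.html">About the show</a></p>
<p>Tuesday: <a href="http://example.org/audio/segment2.ram">Guest Two</a></p>
"""

if __name__ == "__main__":
    for text, url in extract_realaudio_links(SAMPLE_HTML):
        print(f"{text}\t{url}")
```

A script built along these lines could then be run monthly from cron. For example, a crontab entry such as `0 6 1 * * python3 /path/to/harvest.py` (the path is a placeholder) would run it at 06:00 on the first day of each month.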