Analyzing Tweets (One Entity at a Time)

CouchDB makes a great storage medium for collecting tweets because, just like the email messages we looked at in Chapter 3, they are conveniently represented as JSON-based documents and lend themselves to map/reduce analysis with very little effort. Our next example script harvests tweets from timelines, is relatively robust, and should be easy to understand because all of the modules and much of the code have already been introduced in earlier chapters. One subtle point to watch for when reviewing it: the script uses a simple map/reduce job to compute the maximum tweet ID already stored and passes that value as a query constraint, which avoids pulling duplicate data from Twitter’s API. See the documentation for the since_id parameter of the timeline APIs for more details.
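To make the since_id idea concrete, here is a minimal sketch of the max-ID computation written as a map step and a reduce step over in-memory documents. It is not the book's harvesting script and does not talk to CouchDB or Twitter. The function names (id_mapper, max_reducer, max_tweet_id, timeline_params), the sample documents, and the count value are all assumptions made for illustration; only since_id comes from the text above.

```python
def id_mapper(doc):
    """Map step: emit one (key, value) pair per tweet document that has an ID."""
    if "id" in doc:
        yield (None, doc["id"])


def max_reducer(values):
    """Reduce step: collapse all emitted IDs into the largest one."""
    return max(values)


def max_tweet_id(docs):
    """Run the map and reduce steps over already-stored tweet documents."""
    emitted = [value for doc in docs for _key, value in id_mapper(doc)]
    return max_reducer(emitted) if emitted else None


def timeline_params(docs, count=200):
    """Build query parameters for the next timeline request.

    since_id is added only when earlier tweets are stored, so the first
    harvest is unconstrained and later ones skip tweets already collected.
    The count default is an illustrative assumption.
    """
    params = {"count": count}
    since_id = max_tweet_id(docs)
    if since_id is not None:
        params["since_id"] = since_id
    return params


# Hypothetical documents standing in for tweets already saved in the database.
stored = [
    {"id": 101, "text": "first"},
    {"id": 250, "text": "third"},
    {"id": 199, "text": "second"},
]

print(timeline_params(stored))  # {'count': 200, 'since_id': 250}
print(timeline_params([]))      # {'count': 200}
```

In a real deployment the map and reduce functions would presumably live in a CouchDB view, so the maximum is computed by the database rather than by scanning documents in the client.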


  
