Counting with Counters

Pattern Description

This pattern utilizes the MapReduce framework's counters utility to calculate a global sum entirely on the map side without producing any output.

Intent

An efficient means to retrieve count summarizations of large data sets.

Motivation

A count or summation can tell you a lot about particular fields of data, or about your data as a whole. Hourly ingest record counts, for example, can be post-processed to generate helpful histograms. This can be done in a simple “word count” manner: for each input record, you output the same key (say, the hour of data being processed) and a count of 1. A single reducer then sums all the input values and outputs the final record count along with the hour. This works very well, but it can be done more efficiently using counters. Instead of writing any key/value pairs at all, simply use the framework’s counting mechanism to keep track of the number of input records. This requires no reduce phase and no summation! The framework handles monitoring the names of the counters and their associated ....
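The idea above can be illustrated with a minimal sketch in plain Python. It simulates the pattern and does not use the MapReduce framework's actual counter API: `map_task`, `run_job`, the `hour_NN` counter names, and the sample timestamps are all hypothetical, chosen only for illustration. Each simulated map task increments a named counter per input record and emits no key/value output, and a stand-in for the framework adds up the counters that the tasks report.

```python
from collections import Counter
from datetime import datetime, timezone


def map_task(records):
    """Simulate one map task: bump a named counter per record, emit no output.

    `records` is assumed to be an iterable of Unix timestamps in seconds.
    """
    counters = Counter()
    for timestamp in records:
        hour = datetime.fromtimestamp(timestamp, tz=timezone.utc).hour
        counters[f"hour_{hour:02d}"] += 1
    # Nothing is written as key/value pairs; only the counters leave the task.
    return counters


def run_job(input_splits):
    """Stand-in for the framework: aggregate counters reported by every task."""
    totals = Counter()
    for split in input_splits:
        totals.update(map_task(split))  # Counter.update adds counts together
    return dict(totals)


# Two hypothetical input splits, each handled by its own map task.
splits = [
    [0, 60, 3600],     # 00:00:00, 00:01:00, 01:00:00 UTC
    [3660, 7200, 30],  # 01:01:00, 02:00:00, 00:00:30 UTC
]

hourly_counts = run_job(splits)
print(hourly_counts)
```

No reducer appears anywhere in the sketch. The per-hour totals come entirely from aggregating the counters, which is the step the framework would perform in a real job.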


  
