Hadoop Example: Word Count

Now that you’re refreshed on the steps of the whole MapReduce process, let’s dive into a quick and simple example. The “Word Count” program is the canonical example in MapReduce, and for good reason: it is a straightforward application of the framework, and MapReduce handles it extremely efficiently. Many people complain that “Word Count” is overused as an example, but hopefully the rest of the book makes up for that!

In this particular example, we’ll run a word count over user-submitted comments on StackOverflow. The content of each comment’s Text field will be pulled out and lightly preprocessed, and then we’ll count how many times each word appears. An example record from this data set is:
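Separately from the example record, the processing described above (take the text, preprocess it, count each word) can be sketched in plain Java. This is only an illustrative sketch, not the book's Hadoop code and not the Hadoop API: the class name `WordCountSketch`, its methods, the sample strings, and the preprocessing rule (lowercase, keep only the letters a to z) are all assumptions made for illustration. It also assumes the Text field has already been extracted into a plain string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical plain-Java sketch of the word-count idea; no Hadoop classes are used.
final class WordCountSketch {

    // "Map" step: turn one comment's text into (word, 1) pairs.
    // Assumed preprocessing: lowercase, then treat any run of
    // characters other than a-z as a separator.
    static List<Map.Entry<String, Integer>> map(String text) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        String cleaned = text.toLowerCase(Locale.ROOT)
                             .replaceAll("[^a-z]+", " ")
                             .trim();
        if (cleaned.isEmpty()) {
            return pairs; // nothing countable in this comment
        }
        for (String word : cleaned.split(" ")) {
            pairs.add(Map.entry(word, 1));
        }
        return pairs;
    }

    // "Reduce" step: sum the 1s emitted for each distinct word.
    // A TreeMap keeps the words sorted, which makes the output stable.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    // Runs the map step over every comment, then reduces all pairs at once.
    static Map<String, Integer> countWords(List<String> texts) {
        List<Map.Entry<String, Integer>> allPairs = new ArrayList<>();
        for (String text : texts) {
            allPairs.addAll(map(text));
        }
        return reduce(allPairs);
    }

    public static void main(String[] args) {
        // Made-up sample comments, not taken from the real data set.
        List<String> comments = List.of(
            "Thanks, this works!",
            "This works for me too.");
        // prints {for=1, me=1, thanks=1, this=2, too=1, works=2}
        System.out.println(countWords(comments));
    }
}
```

In a real MapReduce job the framework, not a single in-memory list, would group the emitted pairs by word before the summing step; the sketch collapses that grouping into one `reduce` call to keep the idea visible.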


  
