
Chapter 12. Data processing with Clojure

12.1. The map/reduce paradigm

Google popularized the map/reduce approach to distributed computing, in which large volumes of data are processed using a large number of computers. The data processing problem is broken into pieces, and each piece runs on an individual machine. The software then combines the output from each computer to produce a final answer. Breaking the problem into smaller problems and assigning them to computers happens in the map stage; taking the output from the individual computers and combining it into a single result happens in the reduce stage.
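The two stages can be imitated on a single machine with ordinary sequence functions. The following is only a minimal sketch, not Google's implementation: the sample `lines`, the helper `count-words`, and the piece size of two are all assumptions made for illustration, and each piece merely stands in for the work one computer would receive.

```clojure
(require '[clojure.string :as str])

;; Hypothetical input: a few lines of text standing in for a large data set.
(def lines
  ["the quick brown fox"
   "the lazy dog"
   "the fox jumps"
   "a dog barks"])

;; Hypothetical helper: counts the words in one piece of the input.
(defn count-words [piece]
  (frequencies (mapcat #(str/split % #"\s+") piece)))

;; Break the problem into pieces of two lines each.
(def pieces (partition-all 2 lines))

;; Map stage: process each piece independently of the others.
(def partial-counts (map count-words pieces))

;; Reduce stage: merge the per-piece results into a single answer.
(def total-counts (reduce (partial merge-with +) {} partial-counts))
```

Because each piece is processed without reference to the others, the map stage is the part that can be spread across workers. On one machine, swapping `map` for `pmap` is one way to experiment with that idea.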

Google’s map/reduce is based on the functional concepts of map and reduce, functions that you’ve seen repeatedly in this book so far. In this section, we’ll explore this combination of map and reduce to see how it can be useful in processing data. We’ll start with the basic ideas of mapping and reducing and apply them to data that we read from files. We’ll then build abstractions on top of simple file input so that we eventually end up processing Ruby on Rails server log files.
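As a rough illustration of mapping and reducing over line-oriented input, the sketch below parses each line as a number and sums the results. It is an assumption-laden example rather than the code this chapter develops: `sample-text` and `sum-lines` are hypothetical names, and an in-memory `StringReader` stands in for a real file.

```clojure
(import '(java.io BufferedReader StringReader))

;; Hypothetical data: an in-memory string stands in for a file on disk.
(def sample-text "10\n20\n30\n")

(defn sum-lines
  "Parses each line as an integer (map) and adds the results (reduce)."
  [reader]
  (with-open [r (BufferedReader. reader)]
    ;; reduce is eager, so every line is consumed before the reader closes.
    (reduce + 0 (map #(Long/parseLong %) (line-seq r)))))

(def total (sum-lines (StringReader. sample-text)))
```

For a file on disk, a reader obtained from `clojure.java.io/reader` could be passed to `sum-lines` in place of the `StringReader`.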


  
