Chapter 3. Discovering Groups

Chapter 2 discussed ways to find things that are closely related, such as finding someone who shares your taste in movies. This chapter expands on those ideas and introduces data clustering, a method for discovering and visualizing groups of things, people, or ideas that are all closely related. In this chapter, you’ll learn how to prepare data from a variety of sources; two different clustering algorithms; more on distance metrics; simple graphical visualization code for viewing the generated groups; and finally, a method for projecting very complicated datasets into two dimensions.

Clustering is used frequently in data-intensive applications. Retailers who track customer purchases can use this information, alongside regular demographic information, to automatically detect groups of customers with similar buying patterns. People of similar age and income may have vastly different styles of dress, but with the use of clustering, “fashion islands” can be discovered and used to develop a retail or marketing strategy. Clustering is also heavily used in computational biology to find groups of genes that exhibit similar behavior, which might indicate that they respond to a treatment in the same way or are part of the same biological pathway.


  
