Free Trial

Safari Books Online is a digital library providing on-demand subscription access to thousands of learning resources.


  • Create BookmarkCreate Bookmark
  • Create Note or TagCreate Note or Tag
  • DownloadDownload
  • PrintPrint
Share this Page URL
Help

Clustering Contacts by Job Title > Standardizing and Counting Job Titles

Standardizing and Counting Job Titles

An obvious starting point when working with a new data set is to count things, and this situation is no different. This section introduces a pattern for transforming common job titles and then performs a basic frequency analysis.

Assuming you have a reasonable number of exported contacts, the minor nuances among job titles that you’ll encounter may actually be surprising, but before we get into that, let’s put together some code that establishes some patterns for normalizing record data and takes a basic inventory sorted by frequency. Example 6-2 inspects job titles and prints out frequency information for the titles themselves and for individual tokens that occur in them. If you haven’t already, you’ll need to easy_install pretty table , a package that you can use to produce nicely formatted tabular output.


  

You are currently reading a PREVIEW of this book.

                                                                                        

Get instant access to over
$1 million worth of books and videos.

  

Start a Free Trial


  
  • Safari Books Online
  • Create BookmarkCreate Bookmark
  • Create Note or TagCreate Note or Tag
  • DownloadDownload
  • PrintPrint