
18.4. Data Checking

It is generally worth spending a significant amount of time at every stage of an analysis making sure that the data is accurate, and geocoding was no exception. Geocoding errors came from several sources: typographical errors in the addresses, new buildings that are often not listed in public databases, and zip codes that may be reassigned over time. We also suspect that the USC software had a bug during the period we used it, because large numbers of addresses were falsely assigned to the Los Angeles area and elsewhere around the state; we remapped these addresses using another free online service, http://gpsvisualizer.com. As part of our debugging, we used R to draw simple maps of latitude versus longitude for each county and most towns, to identify addresses that had been located far outside the Bay Area.
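
The check described above was done visually, with maps drawn in R. As a complementary illustration, the same idea can be sketched programmatically: flag any geocoded point that falls outside a rough bounding box around the Bay Area. The sketch below is in Python rather than R, and it is not the procedure used in the analysis. The bounding box values, the `GeocodedSale` record, and the sample rows are all assumptions made up for this example.

```python
from typing import NamedTuple


class GeocodedSale(NamedTuple):
    """Hypothetical record: one sale with its geocoded coordinates."""
    address: str
    lat: float
    lon: float


# Assumed, deliberately generous bounding box around the Bay Area.
# These limits are illustrative guesses, not values from the analysis.
BAY_AREA_BOX = {
    "lat_min": 36.8,
    "lat_max": 38.9,
    "lon_min": -123.6,
    "lon_max": -121.2,
}


def outside_box(sale, box=BAY_AREA_BOX):
    """Return True if the sale's coordinates fall outside the box."""
    inside = (
        box["lat_min"] <= sale.lat <= box["lat_max"]
        and box["lon_min"] <= sale.lon <= box["lon_max"]
    )
    return not inside


def flag_outliers(sales, box=BAY_AREA_BOX):
    """Collect the sales whose geocoded location should be rechecked."""
    return [s for s in sales if outside_box(s, box)]


# Made-up sample rows; the coordinates are approximate city centers.
sales = [
    GeocodedSale("hypothetical address A", 37.77, -122.42),  # San Francisco
    GeocodedSale("hypothetical address B", 34.05, -118.24),  # Los Angeles
    GeocodedSale("hypothetical address C", 37.34, -121.89),  # San Jose
]

suspects = flag_outliers(sales)
for s in suspects:
    print(f"recheck: {s.address} ({s.lat}, {s.lon})")
```

A box test like this only catches gross errors, such as an address placed in Los Angeles. A point assigned to the wrong town within the region would still pass, which is why per-county and per-town plots remain useful.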

The addresses in San Jose posed an interesting geocoding challenge. Sales are listed for several "towns" that are not recognized by any mapping site we could find, so we assume they are informal names for neighborhoods: North, South, East, and West San Jose, Berryessa, Cambrian, and a few others.


  
