Finding Similar Documents

Once you’ve queried a corpus and discovered documents of interest, a natural next step is to find other documents like them. Whereas TF-IDF provides a means of narrowing down a corpus based on search terms, cosine similarity is one of the most common techniques for comparing documents to one another, which is the essence of finding similar documents. Understanding cosine similarity requires a brief introduction to vector space models, the topic of the next section.
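As a rough preview of the idea, the sketch below compares two short documents by treating each as a vector of term counts and computing the cosine of the angle between them. It is only an illustration under simplifying assumptions: the function name cosine_similarity and the sample sentences are hypothetical, tokenization is a plain whitespace split, and the vectors hold raw term frequencies rather than TF-IDF weights.

```python
import math
from collections import Counter


def cosine_similarity(doc_a, doc_b):
    """Return the cosine similarity of two documents' term-count vectors.

    Assumption for illustration: tokens are lowercased, whitespace-separated
    words, and each vector component is a raw term frequency (not TF-IDF).
    """
    tf_a = Counter(doc_a.lower().split())
    tf_b = Counter(doc_b.lower().split())

    # Dot product: only terms present in both documents contribute.
    shared_terms = tf_a.keys() & tf_b.keys()
    dot = sum(tf_a[term] * tf_b[term] for term in shared_terms)

    # Euclidean length of each term-count vector.
    norm_a = math.sqrt(sum(count * count for count in tf_a.values()))
    norm_b = math.sqrt(sum(count * count for count in tf_b.values()))

    # An empty document has no direction, so treat it as dissimilar.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical sample documents.
    print(cosine_similarity("mr green killed colonel mustard",
                            "mr green is dead"))
    print(cosine_similarity("mr green is dead",
                            "the weather is nice today"))
```

With non-negative term counts the score falls between 0 and 1: documents with no terms in common score 0, and documents with identical term proportions score 1. Ranking a corpus by this score against a document of interest is one simple way to surface similar documents.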


  
