Safari Books Online is a digital library providing on-demand subscription access to thousands of learning resources.
The easiest approach to our wine pricing problem is the same one you would use if you were trying to price something manually—that is, to find a few of the most similar items and assume the prices will be roughly the same. By finding a set of items similar to the item that interests you, the algorithm can average their prices and make a guess at what the price should be for this item. This approach is called k-nearest neighbors (kNN).
The k in kNN refers to the number of items that will be averaged to get the final result. If the data were perfect, you could use k=1, meaning you would just pick the nearest neighbor and use its price as the answer. But in real-world situations, there are always aberrations. In the example, you deliberately add “noise” to simulate this (the random addition or subtraction of 20 percent). Someone might get a great deal, or an uninformed customer might drastically overpay for the nearest neighbor. For this reason, it’s better to take a few neighbors and average them to reduce any noise.