Cross-Validation

Cross-validation is the name given to a set of techniques that divide up data into training sets and test sets. The training set is given to the algorithm, along with the correct answers (in this case, prices), and becomes the set used to make predictions. The algorithm is then asked to make predictions for each item in the test set. The answers it gives are compared to the correct answers, and an overall score for how well the algorithm did is calculated.

Usually this procedure is performed several times, dividing the data up differently each time. Typically, the test set will be a small portion, perhaps 5 percent of all the data, with the remaining 95 percent making up the training set. To start, create a function called dividedata in numpredict.py that divides the dataset into two smaller sets according to a ratio that you specify.
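A minimal sketch of what such a function might look like is shown here. It assumes the dataset is a list of rows and that each row is sent to the test set independently with probability equal to the ratio. The parameter name test and its default of 0.05 are assumptions chosen to match the 5 percent figure in the text.

```python
import random


def dividedata(data, test=0.05):
    """Split data into (trainset, testset).

    Assumption: each row lands in the test set with probability `test`,
    so the test set is about that fraction of the data on average,
    not an exact count.
    """
    trainset = []
    testset = []
    for row in data:
        if random.random() < test:
            testset.append(row)
        else:
            trainset.append(row)
    return trainset, testset


# Usage with made-up rows; the 'input'/'result' layout is an assumption.
data = [{"input": (i,), "result": 10.0 * i} for i in range(200)]
trainset, testset = dividedata(data)
print(len(trainset), len(testset))
```

Because the split is random, calling dividedata again produces a different division of the same data, which is what allows the procedure to be repeated several times. On small datasets the test set can occasionally come out empty, so a caller may want to check for that before scoring.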


  
