246 CHAPTER 6 Implementations: Real Machine Learning Schemes

been used for prediction if its performance record had been good enough), its success statistics are updated as though it had been used to classify that new instance.

To accomplish this, we use the confidence limits on the success probability of a Bernoulli process that we derived in Section 5.2. Recall that we took a certain number of successes S out of a total number of trials N as evidence on which to base confidence limits on the true underlying success rate p. Given a certain confidence level of, say, 5%, we can calculate upper and lower bounds and be 95% sure that p lies between them.

To apply this to the problem of deciding when to accept a particular exemplar, suppose that it has been used n times to classify other instances and that s of these have been successes. That allows us to estimate bounds, at a particular confidence level, on the true success rate of this exemplar. Now suppose that the exemplar's class has occurred c times out of a total number N of training instances. This allows us to estimate bounds on the default success rate--that is, the probability of successfully classifying an instance of this class without any information about other instances. We insist that the lower confidence bound on an exemplar's success rate exceeds the upper confidence bound on the default success rate. We use the same method to devise a criterion for rejecting a poorly performing exemplar, requiring that the upper confidence bound on its success rate lies below the lower confidence bound on the default success rate.

With suitable choices of thresholds, this scheme works well. In a particular implementation, called IB3 for Instance-Based Learner version 3, a confidence level of 5% is used to determine acceptance whereas a level of 12.5% is used for rejection.
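The acceptance and rejection tests can be sketched in a few lines of Python. This is an illustrative sketch, not the IB3 code itself. It assumes that the Section 5.2 bounds take the Wilson score form, and that each percentage is the probability mass left in one tail of the normal distribution (if the level is meant to be two-sided, pass half of it). The names `z_for_level`, `success_bounds`, and `judge_exemplar` are hypothetical, introduced only for this example.

```python
from math import sqrt
from statistics import NormalDist


def z_for_level(level):
    """Normal deviate z with Pr[X >= z] = level (assumption: one-tailed level)."""
    return NormalDist().inv_cdf(1.0 - level)


def success_bounds(successes, trials, level):
    """Lower and upper confidence bounds on a Bernoulli success rate.

    Uses the Wilson score form (assumed to match the Section 5.2 limits).
    With no trials at all, nothing is known, so the bounds are (0, 1).
    """
    if trials == 0:
        return 0.0, 1.0
    z = z_for_level(level)
    f = successes / trials                      # observed success rate
    z2 = z * z
    centre = f + z2 / (2 * trials)
    spread = z * sqrt(f * (1 - f) / trials + z2 / (4 * trials * trials))
    denom = 1 + z2 / trials
    return (centre - spread) / denom, (centre + spread) / denom


def judge_exemplar(s, n, c, N, accept_level=0.05, reject_level=0.125):
    """Decide the fate of an exemplar.

    s, n: successes and uses of the exemplar as a classifier.
    c, N: occurrences of the exemplar's class and total training instances.
    Returns "accept", "reject", or "undecided".
    """
    # Acceptance: exemplar's lower bound must exceed the default's upper bound.
    exemplar_low, _ = success_bounds(s, n, accept_level)
    _, default_high = success_bounds(c, N, accept_level)
    if exemplar_low > default_high:
        return "accept"

    # Rejection: exemplar's upper bound must lie below the default's lower bound.
    _, exemplar_high = success_bounds(s, n, reject_level)
    default_low, _ = success_bounds(c, N, reject_level)
    if exemplar_high < default_low:
        return "reject"

    return "undecided"
```

For instance, `judge_exemplar(95, 100, 300, 1000)` describes an exemplar that has classified 95 of 100 instances correctly for a class covering 30% of the training data, and `judge_exemplar(1, 2, 500, 1000)` describes one that has been used too rarely for either test to be decisive.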
The lower percentage figure (5%) produces a wider confidence interval, which makes for a more stringent criterion because it is harder for the lower bound of one interval to lie above the upper bound of the other. The criterion for acceptance is more stringent than for rejection, making it more difficult for an instance to be accepted. The reason for a less stringent rejection criterion is that there is little to be lost by dropping instances with only moderately poor classification accuracies: They will probably be replaced by similar instances later. Using these thresholds has been found to improve the performance of instance-based learning and, at the same time, dramatically reduce the number of exemplars--particularly noisy exemplars--that are stored.

Weighting Attributes

The Euclidean distance function, modified to scale all attribute values to between 0 and 1, works well in domains in which the attributes are equally relevant to the outcome. Such domains, however, are the exception rather than the rule. In most domains some attributes are irrelevant and some relevant ones are less important than others. The next improvement in instance-based learning is to learn the relevance of each attribute incrementally by dynamically updating feature weights.

In some schemes, the weights are class specific in that an attribute may be more important to one class than to another. To cater for this, a description is produced