Free Trial

Safari Books Online is a digital library providing on-demand subscription access to thousands of learning resources.

Share this Page URL

CHAPTER 12 Packet Classification > 12.13 CONCLUSIONS - Pg. 299

12.13 Conclusions 299 Finally, the process stops when all decision tree leaves have no more than binth (bin threshold) rules. binth controls the amount of linear searching at the end of tree search. The HiCuts paper [GM99a] mentions the use of the DAG optimization. A more novel optimization, described in Woo [Woo00] and Gupta and McKeown [GM99a], is to eliminate a rule, R, that completely overlaps another rule, R , at a node but has higher cost. There are also several further degrees of freedom (P13) left unexplored in Gupta and McKeown [GM99a] and Woo [Woo00]: unequal-size cuts at each node, more sophisticated strategies that pick more than field at a time, and linear searching at nodes other than the leaves. A more recent paper [BSV03] takes the decision tree approach a step further by allowing the use of several cuts in a single step via multidimensional array indexing. Because each cut is now a general hypercube, the scheme is called HyperCuts. HyperCuts appears to work significantly faster than HiCuts on many real databases [BSV03]. In conclusion, the decision tree approach described by Woo [Woo00], Gupta and McKe- own [GM99a], and Singh et al. [BSV03] is best viewed as a framework which encompasses a number of potential algorithms. However, experimental evidence [BSV03] shows that this approach works well in practice except on databases that contain a large number of wildcards in one or more fields. The performance of this scheme can be summarized as follows. Assumption: The scheme assumes there is a sufficient number of distinct fields to make reasonable cuts without much storage replication. This rather general observation needs to be sharpened. Performance: The memory required can be kept to roughly linear in the number of rules using the HiCuts heuristics. The tree can be of relatively small height if it is reasonably balanced. Search can easily be pipelined to allow O(1) lookup times. Finally, update can be slow if sophisticated heuristics are used to build the decision tree.