Safari Books Online is a digital library providing on-demand subscription access to thousands of learning resources.
13.3 Router History: From Buses to Crossbars 305 for packet switches to support multicast connections. Multicast complicates the scheduling problem even further because some inputs require sending to multiple outputs. To simplify the problem, most routers internally segment variable-size packets into fixed- size cells before sending to the switch fabric. Mathematically, the switching component of a router reduces to solving a bipartite matching problem: The router must match as many input links as possible (to as many output links as possible) in a fixed cell arrival time. While optimal algorithms for bipartite matching are well known to run in milliseconds, solving the same problem every 8 nsec at 40 Gbps requires some systems thinking. For example, the solutions described in this chapter will trade accuracy for time (P3b), use hardware parallelism (P5) and randomization (P3a), and exploit the fact that typical switches have 3264 ports to build fast priority queue operations using bitmaps (P14). 13.2 SHARED-MEMORY SWITCHES Before describing bus- and crossbar-based switches, it is helpful to consider one of the simplest switch implementations, based on shared memory. Packets are read into a memory from the input links and read out of memory to the appropriate output links. Such designs have been used as part of time slot interchange switches in telephony for years. They also work well for networking for small switches. The main problem is memory bandwidth. If the chip takes in eight input links and has eight output links, the chip must read and write each packet or cell once. Thus the memory has to run at 16 times the speed of each link. Up to a point, this can be solved by using a wide memory access width. The idea is that the bits come in serially on an input link and are accumulated into an input shift register. When a whole cell has been accumulated, the cell can be loaded into the cell-wide memory. Later they can be read out into the output shift register of the corresponding link and be shifted out onto the output link. The Datapath switch design [Kan99, Kes97] uses a central memory of 4K cells, which