I. Introduction
A number of high-performance IP routers (for example, the Cisco 12000 [1], the Lucent Cajun [2] family, or the Nortel Versalar TSR45000 [3]) are built around fast cell-based switching fabrics. The design of these routers generally does not adopt the classical output queueing (OQ) architecture, in which cells are stored at the outputs of the switching fabric, preferring either input queueing (IQ) or combined input/output queueing (CIOQ) structures. The reason is that, in OQ, both the switching fabric and the output (and possibly input) queues in the line cards must operate at a speed equal to the sum of the rates of all input lines; since this speed grows linearly with the number of switch ports, the OQ approach is impractical for large switches. In IQ schemes, by contrast, all the components of the switch (input interfaces, switching fabric, output interfaces) can operate at a data rate that is compatible with the data rate of the input and output lines and does not grow with the switch size. The traditional performance penalty of IQ architectures, head-of-the-line blocking when a single queue is used per input interface [4], can be largely avoided by virtual output queueing (VOQ) schemes (also called destination queueing) [5], which organize the input buffers of each line card into a set of queues, one per destination output card, where cells wait for access to the switching fabric.
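The difference between a single input queue and VOQ can be illustrated with a minimal sketch. It is not taken from any of the routers cited above: the class names, the method names, and the arrival sequence are hypothetical and chosen only for illustration. Each input reports the set of outputs toward which it currently has a cell eligible for transfer; with a single FIFO only the head cell counts, whereas with VOQ every non-empty per-destination queue contributes its head cell.

```python
from collections import deque


class FifoInput:
    """Single FIFO per input: only the head-of-line cell is eligible."""

    def __init__(self):
        self.queue = deque()

    def enqueue(self, dest):
        # A cell is represented only by its destination output index.
        self.queue.append(dest)

    def eligible_outputs(self):
        # Cells behind the head are blocked, whatever their destination.
        return {self.queue[0]} if self.queue else set()


class VoqInput:
    """One queue per output: each non-empty queue offers its head cell."""

    def __init__(self, num_outputs):
        self.queues = [deque() for _ in range(num_outputs)]

    def enqueue(self, dest):
        # Cells are stored according to their destination output.
        self.queues[dest].append(dest)

    def eligible_outputs(self):
        return {d for d, q in enumerate(self.queues) if q}


# Hypothetical destinations of four cells arriving at one input of a 3-port switch.
arrivals = [0, 0, 2, 1]

fifo = FifoInput()
voq = VoqInput(num_outputs=3)
for dest in arrivals:
    fifo.enqueue(dest)
    voq.enqueue(dest)

print(sorted(fifo.eligible_outputs()))  # [0]
print(sorted(voq.eligible_outputs()))   # [0, 1, 2]
```

In this sketch, if output 0 is busy serving another input, the FIFO input cannot transfer anything, while the VOQ input can still send a cell to output 1 or 2. How a scheduler selects among the eligible cells is a separate question that the sketch does not address.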