1 Introduction
The literature on multicast in multiprocessor systems offers several techniques with varying complexity and performance tradeoffs. The simplest is software multicast, which transforms each multicast message into multiple unicast messages. The performance of software multicast is widely recognized to be poor. Hardware-based techniques greatly improve performance by supporting multicast with specialized logic built into the system's nodes. The most important classes of hardware-based multicast algorithms are path-based and tree-based [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12].
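To make the software multicast idea concrete, the following is a minimal sketch in which the source sends one unicast message per destination. It is an illustration under stated assumptions, not an implementation taken from the cited works: the function `software_multicast`, the callback `unicast_send`, and the integer node identifiers are all hypothetical names introduced here.

```python
from typing import Callable, Iterable, List, Tuple


def software_multicast(
    source: int,
    destinations: Iterable[int],
    payload: bytes,
    unicast_send: Callable[[int, int, bytes], None],
) -> int:
    """Deliver one multicast as separate unicasts; return how many were sent.

    `unicast_send` is a hypothetical stand-in for whatever unicast
    primitive the system provides.
    """
    sent = 0
    for dest in destinations:
        if dest == source:
            continue  # the source already holds the message
        unicast_send(source, dest, payload)
        sent += 1
    return sent


# Record each unicast instead of sending it over a real network.
log: List[Tuple[int, int, bytes]] = []


def record_send(src: int, dst: int, payload: bytes) -> None:
    log.append((src, dst, payload))


count = software_multicast(0, [1, 2, 3], b"hello", record_send)
```

In this sketch the number of unicast messages issued by the source equals the number of destinations, which suggests one reason the approach scales poorly as the destination set grows.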