Niagara Overview
The Niagara approach to increasing throughput on commercial server applications involves a dramatic increase in the number of threads supported on the processor and a memory subsystem scaled for higher bandwidths. Niagara supports 32 threads of execution in hardware. The architecture organizes four threads into a thread group; the group shares a processing pipeline, referred to as the SPARC pipe. Niagara uses eight such thread groups, resulting in 32 threads on the CPU. Each SPARC pipe contains level-1 caches for instructions and data. The hardware hides memory and pipeline stalls on a given thread by scheduling the other threads in the group onto the SPARC pipe with a zero-cycle switch penalty. Figure 1 schematically shows how reusing the shared processing pipeline results in higher throughput.
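The stall-hiding effect described above can be illustrated with a toy cycle-level model. This is a minimal sketch under stated assumptions, not a model of the actual Niagara hardware: the function name `pipe_utilization`, the round-robin thread selection, and the compute and stall lengths are all hypothetical choices made for illustration. Each thread issues a fixed number of instructions, then stalls for a fixed number of cycles (standing in for a memory access), and the shared pipe switches to any ready thread at no cost.

```python
def pipe_utilization(num_threads, compute_cycles, stall_cycles, total_cycles):
    """Return the fraction of cycles in which the shared pipe issues an instruction.

    Assumptions (illustrative only): every thread runs `compute_cycles`
    instructions, then stalls for `stall_cycles` cycles; ready threads are
    picked round-robin; switching threads costs zero cycles.
    """
    ready_at = [0] * num_threads               # first cycle each thread may issue again
    run_left = [compute_cycles] * num_threads  # instructions left before the next stall
    busy = 0                                   # cycles in which some thread issued
    last = -1                                  # index of the most recently issued thread

    for cycle in range(total_cycles):
        # Search for a ready thread, starting just after the last one issued.
        for offset in range(1, num_threads + 1):
            t = (last + offset) % num_threads
            if ready_at[t] <= cycle:
                busy += 1
                last = t
                run_left[t] -= 1
                if run_left[t] == 0:
                    # The thread stalls; it becomes ready again after the stall.
                    ready_at[t] = cycle + 1 + stall_cycles
                    run_left[t] = compute_cycles
                break
        # If no thread was ready, the pipe idles for this cycle.

    return busy / total_cycles
```

With the illustrative setting of one instruction followed by a three-cycle stall, a single thread keeps the pipe busy a quarter of the time, two threads half of the time, and a group of four threads fills every cycle. The real benefit depends on actual stall latencies and workload behavior, which this sketch does not attempt to capture.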