I. Introduction
Traditionally, computer researchers have compared the relative performance of two computers using the geometric mean (GM) of their performance ratios over a set of selected benchmarks. This approach, however, is limited by the variability of computer systems, which stems from non-deterministic hardware and software behaviors [1] [12] as well as from deterministic effects such as measurement bias [20]. The situation is exacerbated by increasingly complicated architectures and programs. If this variability is not handled correctly, wrong conclusions can be drawn. A single geometric mean is a point estimate and therefore cannot describe the performance variability of computers.
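To make the limitation concrete, the following minimal sketch computes the GM of per-benchmark performance ratios for two sets of measurements of the same pair of machines. The execution times, the variable names, and the helper functions `geometric_mean` and `gm_speedup` are hypothetical and chosen only for illustration; they are not data or code from this work.

```python
import math


def geometric_mean(ratios):
    """Geometric mean of positive ratios, computed in log space."""
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("ratios must be a non-empty sequence of positive numbers")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


def gm_speedup(times_a, times_b):
    """GM over benchmarks of time_A / time_B (values above 1 favor machine B)."""
    return geometric_mean([a / b for a, b in zip(times_a, times_b)])


# Hypothetical execution times in seconds for three benchmarks (assumed values).
machine_a = [10.0, 20.0, 40.0]
machine_b_run1 = [5.0, 20.0, 80.0]  # first measurement of machine B
machine_b_run2 = [5.0, 16.0, 50.0]  # repeated measurement of machine B

gm_run1 = gm_speedup(machine_a, machine_b_run1)
gm_run2 = gm_speedup(machine_a, machine_b_run2)

print(f"GM speedup, run 1: {gm_run1:.3f}")
print(f"GM speedup, run 2: {gm_run2:.3f}")
```

In this constructed example the first set of measurements suggests the two machines perform equally, while the second suggests that machine B is roughly a quarter faster. Each GM is a single number that carries no indication of how much it would change under repeated measurement.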