Comprehensive and Efficient Design Parameter Selection for Soft Error Resilient Processors via Universal Rules


Abstract:

Soft errors significantly degrade the reliability of current processors, whose feature sizes and supply voltages are scaling down rapidly. In this paper, we propose two effective approaches to characterize processor reliability against soft errors at the presilicon stage. Using a rule search strategy named the Patient Rule Induction Method (PRIM), we generate a set of selective rules on key design parameters. These rules quantify the design space subregion with the lowest effective soft error rate (SER), thus providing useful guidelines for designing reliable processors. Furthermore, we propose using Classification and Regression Trees (CART) to partition the design space into a number of small subregions, each associated with a representative SER value. This gives the processor designer a global view of the SER distribution, enabling a comprehensive analysis over the entire design space. More importantly, both approaches generate “universal” models whose effectiveness is validated with a set of test programs unseen during training. Compared to traditional application-specific design space studies, our models’ cross-program capability can save substantial training effort in the era of multithreading. Finally, a case study on multiprocessors is performed to simultaneously balance multiple design metrics, including reliability, performance, and power.
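To make the rule-search idea concrete, the following is a minimal sketch of the top-down peeling phase of PRIM, written in plain Python. It is not the authors' implementation: the function name `prim_peel`, the parameters `alpha` and `min_support`, and the toy two-parameter design space are all assumptions made for illustration. The sketch omits PRIM's pasting phase and stops as soon as no peel lowers the mean response, which is a simplification.

```python
def prim_peel(X, y, alpha=0.1, min_support=0.1):
    """Simplified PRIM peeling that seeks a box with a LOW mean of y.

    X: list of rows (design points), y: list of responses (e.g. SER).
    Returns (box, inside): box[j] = (low, high) bounds on parameter j,
    inside = indices of the design points that remain in the box.
    """
    n, d = len(X), len(X[0])
    inside = list(range(n))
    box = [(min(r[j] for r in X), max(r[j] for r in X)) for j in range(d)]
    while True:
        current = sum(y[i] for i in inside) / len(inside)
        k = max(1, int(alpha * len(inside)))  # target number of points to peel
        best = None  # (mean_y, parameter index, kept indices)
        for j in range(d):
            order = sorted(inside, key=lambda i: X[i][j])
            low_cut = X[order[k - 1]][j]   # peel everything <= low_cut
            high_cut = X[order[-k]][j]     # peel everything >= high_cut
            candidates = (
                [i for i in inside if X[i][j] > low_cut],
                [i for i in inside if X[i][j] < high_cut],
            )
            for kept in candidates:
                # Reject peels that leave too few points in the box.
                if not kept or len(kept) < min_support * n:
                    continue
                mean_y = sum(y[i] for i in kept) / len(kept)
                if best is None or mean_y < best[0]:
                    best = (mean_y, j, kept)
        # Simplification: stop when no admissible peel lowers the mean.
        if best is None or best[0] >= current:
            break
        _, j, inside = best
        box[j] = (min(X[i][j] for i in inside), max(X[i][j] for i in inside))
    return box, inside


# Hypothetical toy design space: two parameters on a 10 x 10 grid, with a
# made-up response that is low only when p0 >= 6 and p1 <= 3.
X = [[float(a), float(b)] for a in range(10) for b in range(10)]
y = [0.0 if (p0 >= 6 and p1 <= 3) else 1.0 for p0, p1 in X]
box, inside = prim_peel(X, y)
rule = " and ".join(f"{lo} <= p{j} <= {hi}" for j, (lo, hi) in enumerate(box))
print(rule)
```

The returned box reads directly as a rule on the design parameters, which is the form of guideline the abstract describes. Cutting on parameter values rather than on point counts lets each peel remove a whole level of a discrete parameter, which seems a reasonable choice for design spaces with few settings per parameter.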
Published in: IEEE Transactions on Computers (Volume: 63, Issue: 9, September 2014)
Page(s): 2201 - 2214
Date of Publication: 31 May 2013

I. Introduction

Soft errors have become an important factor in degrading the reliability of current high-performance processors. They are caused mainly by electronic noise induced by energetic nuclear particles from the environment, such as alpha particles, neutrons, and pions [46]. These particles may invert the state of a logic device (from ‘0’ to ‘1’, or from ‘1’ to ‘0’) once the resulting charge accumulates to a sufficient amount, introducing soft errors (or transient faults) into the system. With feature sizes and supply voltages scaling down to extremely small values, current processors have become highly vulnerable to soft errors [4], [20], [30], [32], [33], [39], [42], [44].
