A new approach to fault-tolerant routing algorithm on SLmesh


Abstract:

As integrated circuits grow in scale, faults appear in chips more often than ever, so applying fault-tolerant routing algorithms is important. In this paper, we propose an algorithm based on the spare-links mesh (SLmesh) that fully utilizes the idle ports while keeping the router size unchanged. When a faulty router is detected, its neighboring routers change the links of their idle ports to route packets around the faulty router. As a result, the XY routing algorithm in the mesh can still work even when a faulty router exists in the network. Furthermore, the algorithm can provide partially adaptive routing in some of the neighboring routers around the faulty router. These neighboring routers are therefore less easily congested, and latency may be reduced because some packets take fewer hops. The experimental results show that the algorithm is feasible and that the delay and throughput of the network are improved.
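
The rerouting idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a 2D mesh with (x, y) router coordinates, a single faulty router, and a spare link that directly joins the two neighbors of the faulty router that a packet would otherwise have entered from and left to. The names `xy_next_hop` and `route` are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

Node = Tuple[int, int]  # (x, y) coordinates of a router in the mesh


def xy_next_hop(cur: Node, dst: Node) -> Optional[Node]:
    """Plain XY routing: correct the X offset first, then the Y offset."""
    (x, y), (dx, dy) = cur, dst
    if x != dx:
        return (x + (1 if dx > x else -1), y)
    if y != dy:
        return (x, y + (1 if dy > y else -1))
    return None  # already at the destination


def route(src: Node, dst: Node, faulty: Optional[Node] = None) -> List[Node]:
    """Follow XY routing, skipping over `faulty` through an assumed spare link."""
    if faulty in (src, dst):
        raise ValueError("source or destination router is faulty")
    path = [src]
    cur = src
    while cur != dst:
        nxt = xy_next_hop(cur, dst)
        if nxt == faulty:
            # Assumption (not from the paper): a spare link joins the
            # current router to the neighbor the packet would have
            # reached after leaving the faulty router, so we take the
            # hop that XY routing would choose from the faulty router.
            nxt = xy_next_hop(faulty, dst)
        path.append(nxt)
        cur = nxt
    return path
```

Under these assumptions, `route((0, 0), (3, 0), faulty=(1, 0))` returns `[(0, 0), (2, 0), (3, 0)]`: the packet never enters the faulty router and takes one hop fewer than the fault-free XY path, which matches the abstract's remark that bypassing may reduce the hop count.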
Date of Conference: 09-11 November 2012
Date Added to IEEE Xplore: 02 May 2013
Conference Location: Chengdu

I. Introduction

Advances in VLSI technology enable the integration of an increasing number of IP cores on a single die. Tens of cores are possible on a single chip multiprocessor (CMP), such as Intel's terascale processor [1] and Tilera's TILE64 processor [2]. Intel has built its Single-chip Cloud Computer CMP with 48 cores [3] and research chips with 80 cores [4]. The number of routers is therefore increasing as well. However, according to the prediction of an Intel commentator in [5], in 100-billion-transistor chips “20 billion of those transistors will fail in manufacture and a further 10 billion will fail in the first year of operation”. Such a high device failure rate means that fault-tolerant routing algorithms must be considered in the network-on-chip (NoC).
