A 3.8 mW/Gbps Quad-Channel 8.5–13 Gbps Serial Link With a 5 Tap DFE and a 4 Tap Transmit FFE in 28 nm CMOS


Abstract:

This paper presents a quad-lane serial transceiver that supports virtually all data center communication standards around 8.5–13 Gbps, implemented in 28 nm CMOS technology. The transmitter consists of a 20:2 mux followed by a half-rate source-series terminated (SST) driver embedded with a 4 tap FFE and an analog equalizer. The receiver has an adaptive CTLE, a 5 tap DFE, and a fully digital CDR followed by a 2:20 demux. At 13 Gbps, the transceiver can equalize 35 dB of Nyquist loss at a BER of 10^-12. At a 1.0 V supply, the transceiver consumes 49 mW/lane at 13 Gbps with full equalization capability. An LC VCO-based fractional PLL provides clocking to the quad TX/RX lanes through a low-power inductively tuned clock routing channel. The transceiver architecture not only enables baud-rate operation from 8.5 to 13 Gbps but also supports a wide range of oversampled subrates. This work represents the lowest reported power in its class to date, and the transceiver is suitable for many applications due to its comprehensive flexibility and power efficiency.
Published in: IEEE Journal of Solid-State Circuits (Volume: 51, Issue: 4, April 2016)
Page(s): 881–892
Date of Publication: 18 February 2016
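
The headline figures quoted in the title and abstract are related by simple arithmetic, which the short Python sketch below cross-checks. The power and data-rate values come from the abstract. The 20-bit parallel word width is an assumption read off the 20:2 mux ratio, and the variable names are illustrative only.

```python
# Cross-check of the headline numbers in the title and abstract.
# Values marked "abstract" are quoted from the text; the rest are derived.

POWER_PER_LANE_MW = 49.0  # abstract: 49 mW/lane at a 1.0 V supply
DATA_RATE_GBPS = 13.0     # abstract: top data rate
PARALLEL_WIDTH = 20       # ASSUMPTION: word width implied by the 20:2 mux / 2:20 demux

# Energy efficiency. mW/Gbps is numerically equal to pJ/bit.
efficiency_mw_per_gbps = POWER_PER_LANE_MW / DATA_RATE_GBPS

# Nyquist frequency of an NRZ stream is half the bit rate. This is the
# frequency at which the 35 dB channel loss is specified.
nyquist_ghz = DATA_RATE_GBPS / 2.0

# A half-rate driver is clocked at half the bit rate and sends one bit on each clock edge.
half_rate_clock_ghz = DATA_RATE_GBPS / 2.0

# Rate of the parallel words entering the TX mux or leaving the RX demux.
parallel_word_rate_mhz = DATA_RATE_GBPS * 1000.0 / PARALLEL_WIDTH

print(f"efficiency    : {efficiency_mw_per_gbps:.2f} mW/Gbps")
print(f"Nyquist       : {nyquist_ghz} GHz")
print(f"half-rate clk : {half_rate_clock_ghz} GHz")
print(f"parallel rate : {parallel_word_rate_mhz} MHz")
```

Rounded to one decimal place, the computed efficiency matches the 3.8 mW/Gbps stated in the title.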

I. Introduction

In recent years, the data rates as well as the number of I/Os have increased drastically for servers, switches, line cards, and backplane connections inside the data center. Thanks to an increasing number of worldwide internet users demanding faster communications and richer media content, the data center aggregate bandwidth is doubling approximately every 2 years (Fig. 1). Data center switch chips have reached a throughput of 1.28 Tbps using 128 ports running at 10 Gbps. However, the total available rack power and data center cooling requirements pose a strict upper limit on power per lane. Although faster links beyond 25 Gbps are starting to replace the 10 Gbps ports, 10 Gbps is still the industry’s mainstream data rate, occupying 89% of the ports in existing data centers.
Over the last decade, there has been a proliferation of communication link standards running around 10 Gbps. Some of the widely adopted 10G standards are: 10G SFP and 10G CX1 at the chip-to-module interface, 40G XLAUI and 10G XFI at the chip-to-chip interface, 40G nPPI and 40G CR4 from front-panel I/O to line card, and 10G KR for backplanes (Fig. 2). Research on various 10G links has been reported, in which some designs targeted short to mid-range operation [1]–[9], whereas others emphasized equalization for high-insertion-loss channels [10]–[18].
It is desirable to have a unified transceiver that can support multiple standards. However, the tradeoff between equalization capability and power efficiency creates a formidable barrier to designing such a transceiver: it must equalize more than 30 dB of link loss for KR backplanes, yet also serve a short-reach link such as 10G XFI or any integer subrate (1.25/2.5/5 Gbps) without breaking the power budget [19]. In theory, CMOS logic is the most suitable for low-power operation because it draws almost zero current when no transitions occur, assuming that leakage is not significant.
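The relationship between the integer subrates (1.25/2.5/5 Gbps) and the 8.5–13 Gbps baud-rate range can be illustrated with a minimal Python sketch. This section does not describe how the transceiver realizes its oversampled subrates, so the scheme below (the transmitter repeats each bit N times at the line rate and the receiver keeps the middle sample of each group) is an assumption chosen for illustration. The function names are hypothetical.

```python
# Illustrative sketch only: integer oversampling as one possible way to carry
# a subrate stream on a link whose baud rate stays within 8.5-13 Gbps.

BAUD_MIN_GBPS = 8.5   # from the text: lower end of baud-rate operation
BAUD_MAX_GBPS = 13.0  # from the text: upper end of baud-rate operation


def oversampling_options(subrate_gbps):
    """Return (factor, line_rate_gbps) pairs whose line rate lies in the baud range."""
    options = []
    factor = 2
    while factor * subrate_gbps <= BAUD_MAX_GBPS:
        line_rate = factor * subrate_gbps
        if line_rate >= BAUD_MIN_GBPS:
            options.append((factor, line_rate))
        factor += 1
    return options


def tx_repeat(bits, factor):
    """Hypothetical TX side: send every data bit `factor` times at the line rate."""
    return [b for b in bits for _ in range(factor)]


def rx_decimate(samples, factor):
    """Hypothetical RX side: keep the middle sample of each group of `factor`."""
    mid = factor // 2
    return [samples[i + mid] for i in range(0, len(samples), factor)]


for subrate in (1.25, 2.5, 5.0):
    print(subrate, "Gbps ->", oversampling_options(subrate))

data = [1, 0, 0, 1, 1, 0]
recovered = rx_decimate(tx_repeat(data, 4), 4)  # 2.5 Gbps data on a 10 Gbps line
print(recovered == data)
```

With a 10 Gbps line rate, the three subrates named in the text correspond to oversampling factors of 8, 4, and 2.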
In this work, we present design choices and implementation details that utilize CMOS logic operation as much as possible. Special care must be taken with the circuits at CML-to-CMOS interfaces to prevent performance degradation. As a result, our transceiver achieves the highest performance with the lowest power compared with similar published links. This paper is organized as follows. Section II briefly describes the overall transceiver architecture. In Section III, the source-series terminated (SST) driver with an embedded 4 tap FFE as well as an analog equalizer is presented. Section IV provides the circuit details of the RX path. Section V outlines the key features of the PLL and clock distribution network. Section VI presents the measurement results and compares them with prior work. Section VII states our conclusion.

Fig. 1. Bandwidth growth in data center.

Fig. 2. Widely adopted communication standards with corresponding applications.
