I. Introduction
In recent years, both the data rates and the number of IOs have increased drastically for servers, switches, line cards, and backplane connections inside the data center. Driven by an increasing number of worldwide internet users demanding faster communications and richer media content, the data center aggregate bandwidth is doubling approximately every 2 years (Fig. 1). In data center switch chips, the throughput has reached 1.28 Tbps using 128 ports running at 10 Gbps. However, the total available rack power and data center cooling requirements pose a strict upper limit on power per lane. Although faster links beyond 25 Gbps are starting to replace the 10 Gbps ports, 10 Gbps is still the industry’s mainstream data rate, occupying 89% of the ports in existing data centers. Over the last decade, communication link standards running at around 10 Gbps have proliferated. Some of the widely adopted 10G standards are: 10G SFP and 10G CX1 at the chip-to-module interface, 40G XLAUI and 10G XFI at the chip-to-chip interface, 40G nPPI and 40G CR4 from front-panel IO to line card, and 10G KR for backplanes (Fig. 2). Research on various 10G links has been reported, in which some designs targeted short- to mid-range operation [1]–[9], whereas others emphasized equalization for channels with high insertion loss [10]–[18]. It is desirable to have a unified transceiver that can support multiple standards. However, the tradeoff between equalization capability and power efficiency creates a formidable barrier to designing such a transceiver, which must equalize more than 30 dB of link loss for KR backplanes while also serving short-reach links such as 10G XFI, or any integer subrate (1.25/2.5/5 Gbps), without breaking the power budget [19]. In theory, CMOS logic is the most suitable for low-power operation because it draws almost zero current when no transitions occur, assuming that the leakage is not significant.
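As a rough illustration of this last point, the sketch below uses the standard first-order model of CMOS power (switching power proportional to activity, load capacitance, supply voltage squared, and clock frequency, plus a leakage term) and contrasts it with a constant-current CML stage. The function names and all numeric values are illustrative assumptions and are not taken from this work.

```python
def cmos_power(activity, c_load, vdd, f_clk, i_leak=0.0):
    """First-order CMOS power model (assumed, textbook form).

    Returns (dynamic, static) power in watts:
      dynamic = activity * c_load * vdd^2 * f_clk
      static  = i_leak * vdd
    """
    dynamic = activity * c_load * vdd ** 2 * f_clk
    static = i_leak * vdd
    return dynamic, static


def cml_power(i_tail, vdd):
    """A CML stage draws its tail current regardless of data activity."""
    return i_tail * vdd


# Hypothetical operating point: 10 fF load, 1.0 V supply, 10 GHz clock.
idle_dyn, idle_static = cmos_power(0.0, 10e-15, 1.0, 10e9)
busy_dyn, busy_static = cmos_power(0.5, 10e-15, 1.0, 10e9)
cml = cml_power(1e-3, 1.0)  # hypothetical 1 mA tail current

print(f"CMOS idle: {idle_dyn + idle_static:.2e} W")
print(f"CMOS busy: {busy_dyn + busy_static:.2e} W")
print(f"CML (any activity): {cml:.2e} W")
```

With zero activity and negligible leakage, the CMOS node in this model dissipates no power, whereas the CML stage dissipates the same power whether or not data is toggling.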
In this work, we present design choices and implementation details that utilize CMOS logic operation as much as possible. Special care must be taken with the circuits at CML-to-CMOS interfaces to prevent performance degradation. As a result, our transceiver achieves the highest performance at the lowest power among comparable published links. This paper is organized as follows. Section II briefly describes the overall transceiver architecture. In Section III, the source-series terminated (SST) driver with an embedded 4-tap FFE as well as an analog equalizer is presented. Section IV provides the circuit details of the RX path. Section V outlines the key features of the PLL and clock distribution network. Section VI presents the measurement results and compares them with prior work. Section VII states our conclusion.
Fig. 1. Bandwidth growth in data center.
Fig. 2. Widely adopted communication standards with corresponding applications.