Sudhakar Pamarti - IEEE Xplore Author Profile

Showing 1-25 of 118 results

Low-temperature (LT) conditions can potentially lead to lower power consumption and enhanced performance in circuit operations by reducing the transistor leakage current, increasing carrier mobility, reducing wear-out, and reducing interconnect resistance. We develop PROCEED-LT, a pathfinding framework to co-optimize devices and circuits over a wide performance range. Our results demonstrate that ...
Next-generation data-center computing requires high-performance energy-efficient servers. One counterintuitive approach to reduce energy is to lower the temperature of the processing elements even down to cryogenic temperatures of liquid nitrogen. If the processing load is reduced dramatically, the net energy cost is lower even considering the corresponding cooling cost. Operating at such temperat...
Event-based cameras offer low-latency, high-dynamic-range imaging data in a sparse format that is well-suited for high-speed object tracking. Processing this sparse data in the same way as traditional camera data requires a great deal of unnecessary computation, making it difficult to take advantage of the high effective frame rate for real-time processing. In this work, we propose an accelerat...
In this article, we describe an advanced multi-output fractional frequency synthesizer (FFS) featuring an innovative digital spur cancellation technique. This technique not only effectively suppresses fractional-$N$ spurs but also eliminates externally coupled spurious tones. In addition, this article includes a comprehensive exploration of the proposed method, offering theoretical analysis and ...
Filtering-by-aliasing (FA) receivers have demonstrated sharp analog finite impulse response (FIR) filtering by combining periodically time-varying (PTV) circuit elements with uniform sampling. Seen at the sampled output, the impulse response of an FA receiver can be controlled by varying the PTV resistor’s resistance over time, which realizes very sharp filtering with 50–70-dB stopband rej...
The design cycle of analog and mixed signal (AMS) components requires the designer to iteratively perform analog simulations, layout, fabrication, and hardware testing. Unlike digital designs, system verification is a difficult task in analog designs, primarily due to a lack of emulation. Thus, a method to emulate AMS components on digital hardware would be highly beneficial. In this work, a high-...
Dynamic element matching (DEM) is a popular class of techniques that exploit redundancy to linearize digital-to-analog converters (DACs) in the presence of component mismatches. Most DEM techniques employ spectral shaping of the DAC mismatch error power, often in conjunction with randomization, to suppress in-band error and spurious tones. However, spurious tone elimination is mathematically guara...
Deep learning has grown in capability and size in recent years, prompting research on alternative computing methods to cope with the increased compute cost. Stochastic computing (SC) promises higher compute efficiency with its compact compute units, but accuracy issues have prevented wide adoption, and accuracy-improving techniques have sacrificed runtime or training performance. In this work, we ...
Compute-in-memory (CIM) accelerators have become a popular solution to achieve high energy efficiency for deep learning applications in edge devices. Recent works have demonstrated CIM macros using nonvolatile memories [spin transfer torque (STT)-MRAM and resistive random access memory (RRAM)] to take advantage of their nonvolatility and high density. However, effective computation dynamic range is...
A 100-GHz wideband static divider is implemented in a 0.1S-$\mu$m SiGe BiCMOS technology. The divider achieves a self-resonant frequency (SRF) of 92.5 GHz with a maximum dividing frequency of 100 GHz. The required input power is less than 0 dBm across the entire operating range. The divider consumes 66 mW. The circuit has a 100 x SO $\mu\mathrm{m}^{2}$ active area.
Voltage-controlled (VC) spin-orbit-torque (SOT) magnetic random access memory (MRAM) is being considered as the next-generation magnetic memory with potential to achieve superior speed, power, and write error rates over existing MRAM technologies. By placing multiple VC devices on a single SOT bus, VC-SOT MRAM can also enable compact structures, in which multiple devices can be addressed individua...
Most analog compute-in-memory (CIM) devices suffer from low on/off ratio, large IR drop, limited retention capability, the need for an additional access device, and additional process complexity for embedding in commercial CMOS logic. To overcome these disadvantages, the charge-trap transistor (CTT) has been proposed. It uses commercial off-the-shelf technology (22 FDX demonstrated in this work) w...
High compute density improves data reuse and is the key to reducing off-chip memory access and achieving high energy efficiency in ML accelerators. Compute-in-Memory (CIM) promises high compute density but requires ADCs and DACs that add to the macro’s energy and area [1], [2], limiting its compute density. Besides, CIM’s analog compute is sensitive to process variability and mismatches. The transi...
An 86.6-dB SFDR, 1-GS/s differential bootstrap sampler in a 0.18-µm SiGe BiCMOS technology is presented. The performance is achieved using an amplitude-modulated bootstrap circuit. The results show 14-bit linearity over nearly 500-MHz bandwidth, while consuming less power than a conventional MOSFET switched-capacitor bootstrap circuit due to less parasitic capacitance and the use of high ft...
There has been a growing trend of developing programmable transceivers, which are the key to realizing a true software-defined radio [1]. Some prior works, such as N-path filters and mixer-first receivers, utilized periodically time-varying (PTV) circuits and have shown some promise, achieving what conventional time-invariant circuits are incapable of. Among them, the filtering-by-aliasing (FA)...
We present the first programmable and precision-tunable stochastic computing (SC) neural network (NN) inference accelerator. The use of SC makes it possible to achieve multiply–accumulate (MAC) density of 38.4k MAC/mm², enabling a level of spatial data reuse unachievable with conventional, fixed-point architectures. This extensive reuse amortizes the cost of SC conversion and reduces the number of m...
A filtering-by-aliasing (FA) receiver front-end based on a slice-based time-varying architecture was described by Bu and Pamarti (2021). Unlike prior FA architectures, it demonstrated, using a 28-nm CMOS prototype IC, a time-invariant input impedance that enables dual-channel operation with high linearity. Up to 50-dB stopband rejection with a transition bandwidth (BW) of only 3.2 times the RF BW,...
In this paper, we propose an in-memory True Random Number Generator (TRNG) using Voltage-Controlled MRAM that doesn't require calibration of the writing pulse's width and amplitude. The previous solution using Spin Transfer Torque (STT) MRAM requires calibration for every MTJ, thus making multi-row random number generation inside the memory impossible. We also propose a 100% relative throughput di...
Waferscale processor systems can provide the large number of cores and the memory bandwidth required by today’s highly parallel workloads. One approach to building waferscale systems is to use a chiplet-based architecture where pre-tested chiplets are integrated on a passive silicon-interconnect wafer. This technology allows heterogeneous integration and can provide significant performance and cost b...
In this paper, we propose an in-memory True Random Number Generator (TRNG) using Voltage-Controlled MRAM that doesn't require calibration of the writing pulse's width and amplitude. The previous solution using Spin Transfer Torque (STT) MRAM requires calibration for every MTJ, thus making multi-row random number generation inside the memory impossible. We also propose a 100% relative throughput di...
This paper presents a 27.0-30.5 GHz sub-sampling PLL implemented in a 0.18-µm SiGe BiCMOS process. The PLL incorporates an improved Class-C VCO with a supply-side 2nd-harmonic common-mode resonance and a completely floating, transformer-based 3rd-harmonic differential-mode resonance. Despite the limitation of the 0.18-µm CMOS devices at high frequencies, the prototype achieves record low in-band noi...
Demand for large amounts of parallelism is growing rapidly for today's computing systems. This is due to the proliferation of applications such as graph processing, data analytics, and machine learning, which require a large number of processing cores and a large amount of memory bandwidth. Often systems comprising many individual packaged chips are employed to run these applications. However,...
Stochastic computing (SC) has seen a renaissance in recent years as a means for machine learning acceleration due to its compact arithmetic and approximation properties. Still, SC accuracy remains an issue, with prior works either not fully utilizing the computational density or suffering from significant accuracy losses. In this work, we propose GEO - Generation and Execution Optimized Stochastic...
Programmable receivers have drawn a lot of attention in recent years, especially those exploiting periodically time-varying (PTV) circuits. N-path filters and mixer-first receivers [1-3] achieve sharp filtering and good linearity but can suffer from high LO leakage (> -70 dBm), which is not compliant with the FCC requirements [4]. In addition, they do not support multi-carrier operation very well...
This article presents a periodically time-varying (PTV) noise cancellation technique for filtering-by-aliasing (FA) receivers. The key to the proposed technique is the use of a time-varying transconductance (Gm) cell to sense the noise generated by the PTV resistor in an FA receiver while maintaining the sharp filtering offered by FA. A prototype IC fabricated in a 28-nm CMOS process improves the ...