Mihir Choudhury - IEEE Xplore Author Profile

Showing 1-25 of 59 results

Filter Results

Show

Results

Diffusion models have emerged as a powerful tool for point cloud generation. A key component that drives the impressive performance for generating high-quality samples from noise is iteratively denoise for thousands of steps. While beneficial, the complexity of learning steps has limited its applications to many 3D real-world. To address this limitation, we propose Point Straight Flow (PSF), a mod...Show More
Vision Transformers (ViTs) have emerged with superior performance on computer vision tasks compared to the convolutional neural network (CNN)-based models. However, ViTs mainly designed for image classification will generate single-scale low-resolution representations, which makes dense prediction tasks such as semantic segmentation challenging for ViTs. Therefore, we propose HRViT, which enhances...Show More
From wearables to powerful smart devices, modern automatic speech recognition (ASR) models run on a variety of edge devices with different computational budgets. To navigate the Pareto front of model accuracy vs model size, researchers are trapped in a dilemma of optimizing model accuracy by training and fine-tuning models for each individual edge device while keeping the training GPU-hours tracta...Show More
We propose an efficient neural network for RAW image denoising. Although neural network-based denoising has been extensively studied for image restoration, little attention has been given to efficient denoising for compute limited and power sensitive devices, such as smartphones and wearables. In this paper, we present a novel architecture and a suite of training techniques for high quality denois...Show More
Data augmentation (DA) is an essential technique for training state-of-the-art deep learning systems. In this paper, we empirically show that the standard data augmentation methods may introduce distribution shift and consequently hurt the performance on unaugmented data during inference. To alleviate this issue, we propose a simple yet effective approach, dubbed KeepAugment, to increase the fidel...Show More
Neural architecture search (NAS) has shown great promise in designing state-of-the-art (SOTA) models that are both accurate and efficient. Recently, two-stage NAS, e.g. BigNAS, decouples the model training and searching process and achieves remarkable search efficiency and accuracy. Two-stage NAS requires sampling from the search space during training, which directly impacts the accuracy of the fi...Show More
We present DIAN, a Differentiable Accelerator-Network Co-Search framework for automatically searching for matched networks and accelerators to maximize both the accuracy and efficiency. Specifically, DIAN integrates two enablers: (1) a generic design space for DNN accelerators that is applicable to both FPGA- and ASIC-based DNN accelerators; and (2) a joint DNN network and accelerator co-search al...Show More
Recurrent transducer models have emerged as a promising solution for speech recognition on the current and next generation smart devices. The transducer models provide competitive accuracy within a reasonable memory footprint alleviating the memory capacity constraints in these devices. However, these models access parameters from off-chip memory for every input time step which adversely effects d...Show More
Emerging AI-enabled applications such as augmented and virtual reality (AR/VR) leverage multiple deep neural network (DNN) models for various sub-tasks such as object detection, image segmentation, eye-tracking, speech recognition, and so on. Because of the diversity of the sub-tasks, the layers within and across the DNN models are highly heterogeneous in operation and shape. Diverse layer operati...Show More
Neural network accelerator is a key enabler for the on-device AI inference, for which energy efficiency is an important metric. The datapath energy, including the computation energy and the data movement energy among the arithmetic units, claims a significant part of the total accelerator energy. By revisiting the basic physics of the arithmetic logic circuits, we show that the datapath energy is ...Show More
Neural Architecture Search (NAS) has demonstrated its power on various AI accelerating platforms such as Field Programmable Gate Arrays (FPGAs) and Graphic Processing Units (GPUs). However, it remains an open problem how to integrate NAS with Application-Specific Integrated Circuits (ASICs), despite them being the most powerful AI accelerating platforms. The major bottleneck comes from the large d...Show More
Hardware acceleration of Deep Neural Networks (DNNs) aims to tame their enormous compute intensity. Fully realizing the potential of acceleration in this domain requires understanding and leveraging algorithmic properties of DNNs. This paper builds upon the algorithmic insight that bitwidth of operations in DNNs can be reduced without compromising their classification accuracy. However, to prevent...Show More
Fault attack becomes a serious threat to system security and requires to be evaluated in the design stage. Existing methods usually ignore the intrinsic uncertainty in attack process and suffer from low scalability. In this paper, we develop a general framework to evaluate system vulnerability against fault attack. A holistic model for fault injection is incorporated to capture the probabilistic n...Show More
Modelling of the interaction between Hot Carrier Aging (HCA) and Positive Bias Temperature Instability (PBTI) has been considered as one of the main challenges in nanoscale CMOS circuit design. Previous works were mainly based on separate HCA and PBTI instead of Interacted HCA-PBTI Degradation (IHPD). The key advance of this work is to develop a methodology that enables accurate modelling of IHPD ...Show More
Soft error is one of the major threats for resilient computing. Unlike SRAM soft error, which can be effectively protected by ECC, Flip-Flop soft error protection can be costly. We observe that flip-flops/latches can have asymmetric soft error rates when storing different logic values. This asymmetry can be used in conjunction with the different signal probabilities of registers in a design. In th...Show More
Dynamic power management has become essential for low power designs and systems. Whether intentionally or unintentionally, these power reduction techniques and corresponding management schemes can impact the hardware reliability and system resiliency in one way or the other. In this paper, we first examine the resiliency issues related to dynamically power-managed designs. Then we demonstrate how ...Show More
We propose and demonstrate in silicon simple, new circuit solutions using a standard 1-1-2 fin 0.09 um2 6T SRAM commercial bitcell in a 16nm FinFET CMOS process to enable a 400mV active VMIN, 200mV retention VMIN SRAM in a 64Kb CMOS array with 128b/BL. Active VMIN is enabled with a self-triggered feedback on an under-driven BL with faster and more robust signal development on the BL at lower volta...Show More
The access transistor of SRAM can suffer both positive bias temperature instability (PBTI) and hot carrier aging (HCA) during operation. The understanding of electron traps (ETs) is still incomplete and there is little information on their similarity and differences under these two stress modes. The key objective of this paper is to investigate ETs in terms of energy distribution, charging and dis...Show More
As CMOS scales down, hot carrier aging (HCA) scales up and can be a limiting aging process again. This has motivated re-visiting HCA, but recent works have focused on accelerated HCA by raising stress biases and there is little information on HCA under use-biases. Early works proposed that HCA mechanism under high and low biases are different, questioning if the high-bias data can be used for pred...Show More
System-on-chip designs must take into account a large number of sources of variability in order to be manufacturable with suitable yield. Resilient design must begin with careful attention to these methods, but must also move beyond them. This paper looks at the need to consider dynamic aging variation for BTI effects as part of an overall resilient design methodology. Note that this aging toleran...Show More
Hardware reliability has been a major concern for nano-scale computing systems. Different hardware design choices, application workloads and software management schemes can jointly affect the system's resilience. In this paper, we first develop a hardware evaluation platform based on an embedded/mobile development board and standard Linux kernel. We demonstrate the use of our platform to evaluate ...Show More
This paper investigates the use of combined read and write assist techniques to reduce the minimum operating voltage (VMIN) of the 6T SRAM bit-cell. While write failures initially limit VMIN, applying write assist introduces row and column half-select failures. Thus, read and write assist must be combined to allow scaling VMIN down to near/sub threshold voltages. We find that combining negative bi...Show More
In this paper we study the circuit design implications of Ge vs. Si PMOS FinFETs at the 10 and 7nm nodes, using TCAD calibrated statistical compact models and the ARM predictive benchmarking flow. The ARM predictive flow incorporates advanced-node-relevant layouts, design rules, parasitic RC extraction and wire-loading. We present the first comprehensive simulation study evaluating Ge pFinFETs in ...Show More
Radiation-induced soft errors have become a key challenge in advanced commercial electronic components and systems. We present the results of a soft error rate (SER) analysis of an embedded processor. Our SER analysis platform accurately models generation, propagation, and masking effects starting from a technology response model derived using TCAD simulations at the device level all the way to ap...Show More