
Reg-PTQ: Regression-specialized Post-training Quantization for Fully Quantized Object Detector


Abstract:

Although deep learning based object detection is of great significance for various applications, it faces challenges when deployed on edge devices due to computation and energy limitations. Post-training quantization (PTQ) can improve inference efficiency through integer computing. However, existing PTQ methods suffer from severe performance degradation when performing full quantization because they overlook the unique characteristics of regression tasks in object detection. In this paper, we are the first to explore regression-friendly quantization and conduct full quantization on various detectors. We reveal the intrinsic reason behind the difficulty of quantizing regressors with empirical and theoretical justifications, and introduce a novel Regression-specialized Post-Training Quantization (Reg-PTQ) scheme. It includes Filtered Global Loss Integration Calibration, which combines the global loss with a two-step filtering mechanism to mitigate the adverse impact of false positive bounding boxes, and a Learnable Logarithmic-Affine Quantizer tailored to the non-uniformly distributed parameters in regression structures. Extensive experiments on prevalent detectors showcase the effectiveness of the well-designed Reg-PTQ. Notably, our Reg-PTQ achieves 7.6x and 5.4x reductions in computation and storage consumption under INT4 with little performance degradation, which indicates the immense potential of fully quantized detectors in real-world object detection applications.
Date of Conference: 16-22 June 2024
Date Added to IEEE Xplore: 16 September 2024
Conference Location: Seattle, WA, USA

1. Introduction

Object detection [10], [11], [23], [42], [45], [57] is one of the most fundamental and challenging problems in computer vision. The current popular architectures, including convolutional neural network (CNN) based [22], [46], [47], [50], [52], [53] and transformer-based [7], [14], [30], [33]–[35], [60] detection models, are designed as powerful yet complex structures to deal with the detection of visual objects [61]. However, existing detection models suffer from extremely high computational costs, making them infeasible to deploy on edge devices, which limits their broader application in practical scenarios. To bridge this gap, several compression techniques [1], [15]–[17], [58] have been proposed to improve the efficiency of networks, among which quantization reduces computational complexity and memory footprint by using lower bit-widths to represent network parameters. Post-training quantization (PTQ) is a widely used approach because of its wide versatility and low production cost: it directly applies quantization to a well-trained floating-point model without time-consuming retraining.
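For orientation, the minimal sketch below illustrates the uniform affine PTQ baseline that such methods start from, not the Reg-PTQ scheme itself; the function names, the min/max calibration rule, and the 4-bit setting are illustrative assumptions rather than the authors' implementation.

# Minimal uniform affine post-training quantization sketch (illustrative only):
# calibrate a scale and zero-point from a float tensor's observed range, then
# map values to b-bit integers and back. Not the paper's Reg-PTQ method.
import numpy as np

def calibrate_affine(x: np.ndarray, bits: int = 4):
    """Derive scale and zero-point from the observed min/max of the tensor."""
    qmin, qmax = 0, 2 ** bits - 1
    x_min, x_max = float(x.min()), float(x.max())
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)   # keep zero exactly representable
    scale = (x_max - x_min) / (qmax - qmin) or 1.0    # avoid a zero scale for constant tensors
    zero_point = int(round(qmin - x_min / scale))
    return scale, zero_point

def quantize(x: np.ndarray, scale: float, zero_point: int, bits: int = 4):
    """Round to the integer grid and clip to the b-bit range."""
    qmin, qmax = 0, 2 ** bits - 1
    return np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int32)

def dequantize(q: np.ndarray, scale: float, zero_point: int):
    """Map integers back to (approximate) floating-point values."""
    return (q.astype(np.float32) - zero_point) * scale

# Example: quantize a weight tensor to INT4 and measure the reconstruction error.
w = np.random.randn(64, 64).astype(np.float32)
s, z = calibrate_affine(w, bits=4)
w_hat = dequantize(quantize(w, s, z, bits=4), s, z)
print("mean abs quantization error:", np.abs(w - w_hat).mean())

A calibration set is normally used to choose such ranges per layer; the paper's contribution is precisely in how calibration and the quantizer form are adapted to regression outputs, which this generic sketch does not capture.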

Figure: Comparison of FLOPs and parameters in full-precision and W4A4 quantized detection models. Head structures account for a non-negligible percentage of computation and memory, and quantization significantly reduces the overall FLOPs and memory storage.
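As a rough point of reference for the savings quoted in the abstract, the arithmetic below compares the ideal FP32-to-INT4 storage reduction with the reported factors; the gap plausibly comes from layers and auxiliary tensors that remain in higher precision (an assumption, not a claim from the paper).

\frac{32\,\text{bits (FP32)}}{4\,\text{bits (INT4)}} = 8\times \ \text{(ideal storage reduction)}
\qquad \text{vs.} \qquad
5.4\times \ \text{(reported storage)}, \quad 7.6\times \ \text{(reported computation)}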
