Kernel Proposal Network for Arbitrary Shape Text Detection


Abstract:

Segmentation-based methods have achieved great success in arbitrary shape text detection. However, separating neighboring text instances remains one of the most challenging problems because of the complexity of text in scene images. In this article, we propose an innovative kernel proposal network (dubbed KPN) for arbitrary shape text detection. The proposed KPN separates neighboring text instances by classifying different texts into instance-independent feature maps, while avoiding the complex aggregation process found in segmentation-based arbitrary shape text detection methods. Concretely, our KPN predicts a Gaussian center map for each text image, which is used to extract a series of candidate kernel proposals (i.e., dynamic convolution kernels) from the embedding feature maps according to their corresponding keypoint positions. To enforce independence between kernel proposals, we propose a novel orthogonal learning loss (OLL) based on orthogonal constraints. Specifically, our kernel proposals contain important self-information learned by the network and location information from position embedding. Finally, each kernel proposal individually convolves all embedding feature maps to generate the individual embedded map of a text instance. In this way, our KPN can effectively separate neighboring text instances and improve robustness against unclear boundaries. To the best of our knowledge, our work is the first to introduce the dynamic convolution kernel strategy to efficiently and effectively tackle the adhesion problem of neighboring text instances in text detection. Experimental results on challenging datasets verify the impressive performance and efficiency of our method. The code and model are available at https://github.com/GXYM/KPN.
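
The pipeline described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (see the repository linked above for that); it is a minimal pure-Python illustration under several assumptions: keypoints are taken as already extracted from the center map, each kernel proposal is simply the embedding vector at its keypoint (position embedding is omitted), kernels act as 1x1 convolutions, and the orthogonality penalty is a mean squared cosine similarity, which may differ from the actual form of the OLL. All function names are hypothetical.

```python
import math

def extract_kernels(embedding, keypoints):
    """Take the embedding vector at each (row, col) keypoint as a kernel proposal."""
    return [list(embedding[r][c]) for r, c in keypoints]

def dynamic_conv_1x1(embedding, kernels):
    """Apply each kernel as a 1x1 convolution, giving one response map per kernel."""
    return [
        [[sum(k * e for k, e in zip(kernel, pixel)) for pixel in row]
         for row in embedding]
        for kernel in kernels
    ]

def orthogonality_penalty(kernels):
    """Mean squared cosine similarity over distinct kernel pairs (0 if orthogonal)."""
    def unit(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0  # guard against zero vectors
        return [x / norm for x in v]

    units = [unit(k) for k in kernels]
    n = len(units)
    if n < 2:
        return 0.0
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            cos = sum(a * b for a, b in zip(units[i], units[j]))
            total += cos * cos
    return total / (n * (n - 1) / 2)

# Toy 2x4 embedding map with 2 channels: left half belongs to one text
# instance, right half to a neighboring one (assumed, idealized embeddings).
A, B = [1.0, 0.0], [0.0, 1.0]
embedding = [[A, A, B, B], [A, A, B, B]]
keypoints = [(0, 0), (1, 3)]  # assumed peaks of the center map

kernels = extract_kernels(embedding, keypoints)
maps = dynamic_conv_1x1(embedding, kernels)
penalty = orthogonality_penalty(kernels)
```

In this idealized case each kernel responds only to the pixels of its own instance, so the two adjacent regions end up in separate maps, and the penalty is zero because the two kernels are orthogonal. Driving that penalty toward zero during training is the intuition behind enforcing independence between kernel proposals.
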
Published in: IEEE Transactions on Neural Networks and Learning Systems (Volume: 34, Issue: 11, November 2023)
Page(s): 8731 - 8742
Date of Publication: 10 March 2022

PubMed ID: 35271451

I. Introduction

Scene text detection is a fundamental and critical task in computer vision because it is a key step in various text-related applications, including translation, text-visual question answering, text recognition, and text mining. With the rapid development of deep learning-based object detection [1]–[3] and segmentation [4], [5], scene text detection has made great progress [6]–[8]. Arbitrary shape scene text detection, one of the most challenging tasks in text detection, has attracted ever-increasing interest in both the research and industrial communities. Beyond the challenges of general scene text detection, arbitrary shape text detection must address additional problems, such as varied scales and curved or arbitrary shapes.