I. Introduction
Scene text detection is a fundamental task in computer vision because it is a key step in many text-related applications, including translation, text-based visual question answering, text recognition, and text mining. With the rapid development of deep learning-based object detection [1]–[3] and segmentation [4], [5], scene text detection has made great progress [6]–[8]. Arbitrary-shape scene text detection, one of the most challenging tasks in text detection, has attracted ever-increasing interest in both the research and industrial communities. Beyond the challenges common to general scene text detection, arbitrary-shape text detection must also handle text with varied scales, curved layouts, and arbitrary shapes.