I. Introduction
Recently, scene text detection has drawn great attention from the computer vision and machine learning communities. Driven by many content-based image applications such as photo translation and receipt content recognition, it has become a promising and challenging research area in both academia and industry. Detecting text in natural images is difficult because both the text and the background may be complex in the wild, and the text is often affected by disturbances such as occlusion and uncontrolled lighting conditions [1].