End-to-end text recognition with convolutional neural networks

Abstract:

Full end-to-end text recognition in natural images is a challenging problem that has received much attention recently. Traditional systems in this area have relied on elaborate models incorporating carefully hand-engineered features or large amounts of prior knowledge. In this paper, we take a different route and combine the representational power of large, multilayer neural networks with recent developments in unsupervised feature learning, which allows us to use a common framework to train highly accurate text detector and character recognizer modules. Then, using only simple off-the-shelf methods, we integrate these two modules into a full end-to-end, lexicon-driven, scene text recognition system that achieves state-of-the-art performance on standard benchmarks, namely Street View Text and ICDAR 2003.

Published in: Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012)

Date of Conference: 11-15 November 2012

Date Added to IEEE Xplore: 14 February 2013

Conference Location: Tsukuba, Japan

Contents

1 Introduction

Extracting textual information from natural images is a challenging problem with many practical applications. Unlike character recognition for scanned documents, recognizing text in unconstrained images is complicated by a wide range of variations in backgrounds, textures, fonts, and lighting conditions. As a result, many text detection and recognition systems rely on cleverly hand-engineered features [5], [4], [14] to represent the underlying data. Sophisticated models such as conditional random fields [11], [19] or pictorial structures [18] are also often required to combine the raw detection/recognition outputs into a complete system.
