End-to-end text recognition with convolutional neural networks


Abstract:

Full end-to-end text recognition in natural images is a challenging problem that has received much attention recently. Traditional systems in this area have relied on elaborate models incorporating carefully hand-engineered features or large amounts of prior knowledge. In this paper, we take a different route and combine the representational power of large, multilayer neural networks together with recent developments in unsupervised feature learning, which allows us to use a common framework to train highly-accurate text detector and character recognizer modules. Then, using only simple off-the-shelf methods, we integrate these two modules into a full end-to-end, lexicon-driven, scene text recognition system that achieves state-of-the-art performance on standard benchmarks, namely Street View Text and ICDAR 2003.
Date of Conference: 11-15 November 2012
Date Added to IEEE Xplore: 14 February 2013
Conference Location: Tsukuba, Japan

1 Introduction

Extracting textual information from natural images is a challenging problem with many practical applications. Unlike character recognition for scanned documents, recognizing text in unconstrained images is complicated by a wide range of variations in backgrounds, textures, fonts, and lighting conditions. As a result, many text detection and recognition systems rely on cleverly hand-engineered features [5], [4], [14] to represent the underlying data. Sophisticated models such as conditional random fields [11], [19] or pictorial structures [18] are also often required to combine the raw detection/recognition outputs into a complete system.

Cites in Patents (9)

1. Chatterjee, Anirban; Majumder, Bodhisattwa Prasad; Pal, Gayatri; Bhat, Rajesh Shreedhar; Prabhu, Sumanth S.; Selvaraj, Vignesh, "Automated extraction of product attributes from images"
2. Na, Hwidong, "Neural network method and apparatus"
3. Wintz, Daniel Thomas; Li, Michael Lingzhi; Wolf, Elliott Gerard; Mawer, Chloe; Voegele, Caitlin; Fang, Zhou Daisy; Choudhury, Maya Ileana; Long, Julia; Thattai, Sudarsan; Rivera, Jeffrey Alvarez, "Optimizing pallet location in a warehouse"
4. Zhang, Chengquan; Hu, Han; Luo, Yuxuan; Han, Junyu; Ding, Errui, "Character detection method and apparatus"
5. Fergus, Robert D.; Bourdev, Lubomir; Paluri, Balamanohar; Sukhbaatar, Sainbayar, "Unsupervised training sets for content classification"
6. Tran, Son Dinh; Manmatha, R., "Text recognition and localization with deep learning"
7. Bhardwaj, Anurag; Lee, Chen-Yu; Piramuthu, Robinson; Jagadeesh, Vignesh; Di, Wei, "System and method for scene text recognition"
8. Long, Fei; Zhang, Tao; Chen, Zhijun, "Method for region extraction, method for model training, and devices thereof"
9. Bhardwaj, Anurag; Lee, Chen-Yu; Piramuthu, Robinson; Jagadeesh, Vignesh; Di, Wei, "System and method for scene text recognition"