Binarization of degraded handwritten documents based on morphological contrast intensification | IEEE Conference Publication | IEEE Xplore


Abstract:

Degraded handwritten document images pose several challenges for binarization, such as faint characters, bleed-through and large background ink stains. Traditional binarization techniques fail to handle all of these degradations efficiently. In this paper we present a hybrid binarization technique based on morphological contrast intensification, which sets a global threshold for segmenting candidate text regions from degraded document images. The proposed approach uses grayscale morphological tools to estimate the background of the image. Using the estimated background, the contrast of the text regions of the document is increased. The histogram of the contrast image is then analyzed to obtain a threshold value for the initial segmentation of text regions. Finally, a local thresholding technique is applied to obtain the final binarized image. The efficacy and accuracy of the proposed technique are compared with other algorithms reported in the literature, using the DIBCO (Document Image Binarization Contest) test datasets (2010, 2011, 2012 and 2013) and evaluation parameters.
Date of Conference: 21-24 December 2015
Date Added to IEEE Xplore: 25 February 2016
Conference Location: Waknaghat, India

I. Introduction

Considerable effort is now under way, in both academia and industry, not only to archive historical documents but also to process them for classification, indexing, searching, extraction of text and images, content-based retrieval, etc., so that this wealth of information on human history can be disseminated across the globe. Processing of such documents usually starts with binarization of the scanned image as a pre-processing step. The objective of binarization is to divide the pixels cleanly into two classes, namely the foreground and the background. The accuracy of binarization is pivotal to the success of the subsequent processing steps. Handwritten documents, particularly historical ones, are difficult to binarize because the input is not standardized and because of degradation of all sorts due to ageing and other factors. In historical handwritten documents the degradation includes faint characters, bleed-through and ink stains. Some noise may also be introduced during scanning in the form of black patches or back impressions. Fig. 1 shows these characteristics, which are major challenges for any technique used for binarization of degraded historical documents.
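
As a rough illustration of the four stages summarized in the abstract (morphological background estimation, contrast intensification, a histogram-derived global threshold, and a final local threshold), the following is a minimal sketch and not the authors' implementation. Several choices here are assumptions made for the example: the background is estimated with a grayscale closing over a square window, the contrast image is the background minus the input, Otsu's method stands in for the paper's histogram analysis, and the local step simply keeps candidate pixels darker than their window mean. All function names and the window sizes are hypothetical. Plain Python lists are used to keep the sketch dependency-free; in practice a routine such as `scipy.ndimage.grey_closing` would replace the hand-written window filter.

```python
def _window_filter(img, size, pick):
    """Apply pick (min or max) over a size x size window, truncated at the borders."""
    h, w = len(img), len(img[0])
    r = size // 2
    return [[pick(img[yy][xx]
                  for yy in range(max(0, y - r), min(h, y + r + 1))
                  for xx in range(max(0, x - r), min(w, x + r + 1)))
             for x in range(w)] for y in range(h)]


def estimate_background(img, size):
    """Grayscale closing (dilation, then erosion): erases dark strokes
    narrower than the window. Assumed form of the background estimate."""
    return _window_filter(_window_filter(img, size, max), size, min)


def contrast_image(img, background):
    """Background minus image: large where ink is darker than the paper."""
    return [[b - p for p, b in zip(row, brow)]
            for row, brow in zip(img, background)]


def otsu_threshold(values, levels=256):
    """Otsu's method, used here as a stand-in for the paper's histogram analysis."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        # Between-class variance, up to a constant factor.
        var = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def local_refine(img, candidates, size):
    """Keep a candidate pixel only if it is darker than its local window mean
    (an assumed, simplified local threshold)."""
    h, w = len(img), len(img[0])
    r = size // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if candidates[y][x]:
                win = [img[yy][xx]
                       for yy in range(max(0, y - r), min(h, y + r + 1))
                       for xx in range(max(0, x - r), min(w, x + r + 1))]
                out[y][x] = 1 if img[y][x] < sum(win) / len(win) else 0
    return out


def binarize(img, bg_size=5, local_size=5):
    """Return a 0/1 map (1 = text) for a grayscale image with dark ink on light paper."""
    background = estimate_background(img, bg_size)
    contrast = contrast_image(img, background)
    t = otsu_threshold([v for row in contrast for v in row])
    candidates = [[1 if v > t else 0 for v in row] for row in contrast]
    return local_refine(img, candidates, local_size)


# Synthetic 9 x 9 page: paper at 200, a stain (150) in columns 6-8,
# and a one-pixel-wide stroke (60) in column 3, rows 2-6.
page = [[150 if x >= 6 else 200 for x in range(9)] for y in range(9)]
for y in range(2, 7):
    page[y][3] = 60
result = binarize(page)
```

On this synthetic page the closing absorbs the wide stain into the background estimate, so the stain contributes nothing to the contrast image and only the thin stroke survives both thresholds. The window must be wider than the strokes to be removed; how the paper chooses its structuring element and threshold is not stated in the text above.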

