Binarization of degraded handwritten documents based on morphological contrast intensification | IEEE Conference Publication | IEEE Xplore

Abstract:

Degraded handwritten document images pose several challenges for binarization, such as faint characters, bleed-through and large background ink stains. Traditional binarization techniques fail to handle all of these degradations efficiently. In this paper we present a hybrid binarization technique based on morphological contrast intensification, which sets a global threshold for segmenting candidate text regions from degraded document images. The proposed approach uses grayscale morphological tools to estimate the background of the image. Using the estimated background, the contrast of the text regions is increased. The histogram of the contrast image is analyzed to obtain a threshold value for the initial segmentation of text regions. Finally, a local threshold technique is used to obtain the final binarized image. The efficacy and accuracy of the proposed technique are compared with other algorithms reported in the literature, using the DIBCO (Document Image Binarization Contest) test datasets (2010, 2011, 2012 and 2013) and evaluation parameters.
Date of Conference: 21-24 December 2015
Date Added to IEEE Xplore: 25 February 2016
Conference Location: Waknaghat, India
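
The four stages named in the abstract (background estimation, contrast intensification, histogram-based global threshold, local threshold) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, since the abstract does not give the operators or parameters. The following are assumptions made for illustration only: dark ink on lighter paper, a grayscale closing as the morphological background estimator, Otsu's method as the histogram analysis, a local-mean test as the local threshold, and all function names and window sizes.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _windows(img, size):
    """Return every size x size neighbourhood of img (edge-padded)."""
    if size % 2 == 0:
        raise ValueError("window size must be odd")
    padded = np.pad(img, size // 2, mode="edge")
    return sliding_window_view(padded, (size, size))


def estimate_background(gray, size):
    """Grayscale closing (dilation, then erosion) with a square element.

    Assumption: dark strokes thinner than `size` are filled in, while
    larger structures such as stains remain part of the background.
    """
    dilated = _windows(gray, size).max(axis=(2, 3))
    return _windows(dilated, size).min(axis=(2, 3))


def contrast_image(gray, background):
    """Background minus image, clipped at 0 and stretched to 0..255."""
    diff = np.clip(background.astype(float) - gray.astype(float), 0, None)
    peak = diff.max()
    if peak == 0:
        return np.zeros(gray.shape, dtype=np.uint8)
    return np.round(diff * 255.0 / peak).astype(np.uint8)


def otsu_threshold(img_u8):
    """Otsu's global threshold, used here as a stand-in histogram analysis."""
    hist = np.bincount(img_u8.ravel(), minlength=256).astype(float)
    levels = np.arange(256)
    w0 = np.cumsum(hist)                  # pixel count at or below each level
    w1 = hist.sum() - w0                  # pixel count above each level
    sum0 = np.cumsum(hist * levels)
    mu0 = sum0 / np.maximum(w0, 1)
    mu1 = (sum0[-1] - sum0) / np.maximum(w1, 1)
    return int(np.argmax(w0 * w1 * (mu0 - mu1) ** 2))


def binarize(gray, bg_size=15, local_size=15):
    """Return a boolean mask that is True for pixels classified as text."""
    background = estimate_background(gray, bg_size)
    contrast = contrast_image(gray, background)
    # Initial segmentation: candidate text has high contrast to background.
    candidates = contrast > otsu_threshold(contrast)
    # Local refinement (assumed rule): keep candidates darker than the
    # mean gray level of their neighbourhood.
    local_mean = _windows(gray, local_size).mean(axis=(2, 3))
    return candidates & (gray < local_mean)
```

Calling `binarize(gray)` on a 2-D `uint8` grayscale array returns the text mask. In this sketch `bg_size` should exceed the widest stroke so that the closing removes the text, and a stain wider than `bg_size` survives in the background estimate and therefore produces little contrast.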

I. Introduction

Considerable effort is now under way, in both academia and industry, not only to archive historical documents but also to process them for classification, indexing, searching, extraction of text and images, content-based retrieval, etc., so that this wealth of information on human history can be disseminated across the globe. Processing usually starts with binarization of the scanned image as a pre-processing step. The objective of binarization is to divide the pixels cleanly into two classes, namely the foreground and the background. The accuracy of binarization is pivotal to the success of the subsequent processing steps. Handwritten documents, particularly historical ones, are difficult to binarize because the input is not standardized and because ageing and other factors cause degradation of many kinds. In historical handwritten documents the degradation includes faint characters, bleed-through and ink stains. The scanning process may also introduce noise in the form of black patches or back impressions. Fig. 1 shows these characteristics, which are major challenges for any technique used for binarization of degraded historical documents.
