Enhancing training data for handwriting recognition of whiteboard notes with samples from a different database | IEEE Conference Publication | IEEE Xplore

Abstract:

Recognition of unconstrained handwritten text remains a challenge. In this paper we consider a new problem: the recognition of notes written on a whiteboard. Our recognizer is based on hidden Markov models (HMMs). Because it is difficult to acquire sufficient training data for the HMMs, we propose two strategies for enlarging the training set. Both strategies draw on an existing database of offline handwritten text, whose handwriting samples differ from whiteboard data. The two strategies are MAP adaptation and merging of the training sets. With these methods we achieve improvements in the word recognition rate of up to 5.7%.
Date of Conference: 31 August 2005 - 01 September 2005
Date Added to IEEE Xplore: 16 January 2006
Print ISBN: 0-7695-2420-6

Conference Location: Seoul, Korea (South)

1. Introduction

In this paper we describe the continuation of our research on a novel handwriting recognition task: the recognition of text written on a whiteboard. Our recognition system for this task was introduced in [8], where a writer-independent handwritten sentence recognizer based on HMMs was presented. That recognizer achieved a word recognition rate of only about 64.27%. The main reason for the low performance is that the number of writers in the training set is very small. The data set of all available whiteboard recordings (training, test and validation sets) consists of only about 6,000 words written by a total of 20 writers. We expect better recognition performance if we enlarge this data set. However, it is rather difficult to enlarge the existing database significantly, because the whiteboard is not portable and can be used by only a single writer at a time. For this reason we propose another approach in this paper, in which we use data from a large existing database of offline handwritten sentences [10] to augment the training set.
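
The abstract names MAP adaptation as one of the two strategies, but this excerpt does not give the formulation used in the paper. As a rough illustration only, the sketch below shows the common textbook MAP update for a single Gaussian mean: a mean estimated on the large source database acts as the prior and is pulled toward the small target set. The function name `map_adapt_mean`, the prior weight `tau` and the per-frame occupation weights are assumptions made for this example, not details taken from the paper.

```python
def map_adapt_mean(prior_mean, frames, weights, tau=10.0):
    """Textbook MAP-style update of one Gaussian mean (illustrative only).

    prior_mean: mean estimated on the large source database
    frames:     feature vectors from the small target (whiteboard) set
    weights:    occupation probability of this Gaussian for each frame
    tau:        hypothetical prior weight; larger values trust the prior more
    """
    dim = len(prior_mean)
    occupancy = sum(weights)

    # Accumulate the occupation-weighted sum of the target frames.
    weighted_sum = [0.0] * dim
    for frame, w in zip(frames, weights):
        for d in range(dim):
            weighted_sum[d] += w * frame[d]

    # Interpolate between the prior mean and the target data.
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]


if __name__ == "__main__":
    adapted = map_adapt_mean(
        prior_mean=[0.0, 0.0],
        frames=[[2.0, 4.0], [2.0, 4.0]],
        weights=[1.0, 1.0],
        tau=2.0,
    )
    print(adapted)
```

With little target data the result stays close to the prior mean, and with more data it moves toward the target statistics. Merging the training sets, the second strategy, would instead pool both data sources before training.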
