Word Level Font-to-Font Image Translation using Convolutional Recurrent Generative Adversarial Networks


Abstract:

Converting text from one font to another is useful in many real-life applications. In this paper, we propose a Convolutional Recurrent Generative model to solve the word-level font transfer problem. Our network converts the font style of a printed text image from its current font to a required font. The network is trained end-to-end on complete word images, which eliminates pre-processing steps such as character segmentation. We extend our model to a conditional setting, which helps it learn a one-to-many mapping function. We employ a novel convolutional recurrent architecture in the Generator that efficiently handles word images of arbitrary width. It also helps maintain the consistency of the final image after the generated image patches of the target font are concatenated. Besides the Generator and the Discriminator networks, we employ a Classification network to classify the generated word images of the converted font style into their respective font categories. Most earlier work on image translation was performed on square images, whereas word images generally vary in width depending on the number of characters present. Our proposed architecture is the first of its kind that can handle images of varying widths. We test our model on a synthetically generated font dataset and compare our method with some state-of-the-art methods for image translation. The superior performance of our network on the same dataset demonstrates the ability of our model to learn the font distributions.
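
The abstract mentions that generated image patches of the target font are concatenated into the final word image. A minimal sketch of that patch-and-concatenate idea for images of arbitrary width is given below. It is an illustration only, not the authors' implementation: the patch width, the padding value, and the names `split_into_patches`, `concatenate_patches`, `translate_word`, and `invert_patches` are all assumptions, and the stand-in generator simply inverts pixels instead of running a convolutional recurrent network.

```python
from typing import Callable, List

# A grayscale image as rows of pixel intensities (height x width).
Image = List[List[float]]

# Hypothetical patch width; the source does not state the value used.
PATCH_WIDTH = 4


def split_into_patches(image: Image, patch_width: int = PATCH_WIDTH,
                       pad_value: float = 1.0) -> List[Image]:
    """Cut a word image into equal-width patches, padding the right edge."""
    width = len(image[0])
    n_patches = max(1, -(-width // patch_width))  # ceiling division
    padded_width = n_patches * patch_width
    padded = [row + [pad_value] * (padded_width - width) for row in image]
    return [[row[start:start + patch_width] for row in padded]
            for start in range(0, padded_width, patch_width)]


def concatenate_patches(patches: List[Image], original_width: int) -> Image:
    """Join patches left to right and crop away the right-edge padding."""
    height = len(patches[0])
    rows = []
    for r in range(height):
        row: List[float] = []
        for patch in patches:
            row.extend(patch[r])
        rows.append(row[:original_width])
    return rows


def translate_word(image: Image,
                   generator: Callable[[List[Image]], List[Image]],
                   patch_width: int = PATCH_WIDTH) -> Image:
    """Apply a patch-sequence generator to a word image of any width.

    The generator receives the whole patch sequence, so a recurrent model
    could share context between neighbouring patches.
    """
    width = len(image[0])
    patches = split_into_patches(image, patch_width)
    generated = generator(patches)
    return concatenate_patches(generated, width)


def invert_patches(patches: List[Image]) -> List[Image]:
    """Stand-in generator (assumption): inverts pixel intensities."""
    return [[[1.0 - v for v in row] for row in patch] for patch in patches]


# A 2 x 6 "word image": its width is not a multiple of the patch width.
word = [[0.0, 1.0, 0.0, 1.0, 0.0, 1.0],
        [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]]
result = translate_word(word, invert_patches)
```

Because the output is cropped back to the input width, `result` has the same shape as `word` regardless of how many characters the word contains.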
Date of Conference: 20-24 August 2018
Date Added to IEEE Xplore: 29 November 2018
ISBN Information:
Print on Demand(PoD) ISSN: 1051-4651
Conference Location: Beijing, China

I. Introduction

With the tremendous advancement in technology and multimedia, images and videos have become an integral part of our day-to-day activities, and so they have become a subject of study. Image processing, computer vision, and computer graphics all deal with the study of images and videos. Much work has been done on image retrieval and image classification, but comparatively little on image-to-image translation. Isola et al. [1] proposed a novel method for image-to-image translation using Conditional Generative Adversarial Networks (cGANs). In our proposed method, we extend the task of image-to-image translation to the Document Image Analysis (DIA) domain. Our work focuses on font-to-font translation of document images of printed words. To the best of our knowledge, no previous work has tried to devise a method for word-level font-to-font translation. Using our method, images of printed words written in one particular font can be easily transformed to any other commonly used font. This makes it very useful for editing when no soft copy is available: one can take a photograph of the text and change its font directly, without retyping it, thus saving considerable time and effort. The fonts of old books or manuscripts sometimes become faded, which makes them difficult to read. Such fonts can be given a fresh look, improving readability. Font-to-font translation can also be applied to graphic design. Other useful applications of this approach include designing cover pages of magazines or books, with the advantage that different fonts can be tried without keeping several soft copies of the background. Fig. 1 illustrates the font-to-font translation problem.

Fig. 1. Example showing the font-to-font translation.

