Sanjeev Satheesh - IEEE Xplore Author Profile

Showing 1-6 of 6 results


This paper describes a general, scalable, end-to-end framework that uses the generative adversarial network (GAN) objective to enable robust speech recognition. Encoders trained with the proposed approach enjoy improved invariance by learning to map noisy audio to the same embedding space as that of clean audio. Unlike previous methods, the new framework does not rely on domain expertise or strong...
In this work, we perform an empirical comparison among the CTC, RNN-Transducer, and attention-based Seq2Seq models for end-to-end speech recognition. We show that, without any language model, Seq2Seq and RNN-Transducer models both outperform the best reported CTC models with a language model, on the popular Hub5'00 benchmark. On our internal diverse dataset, these trends continue: RNN-Transducer ...
Cloud computing is a key computing platform for sharing resources that includes IaaS, SaaS, PaaS and business process. This provides many benefits for the users to create and store health care data on the cloud, thereby utilizing fewer resources in the client system. The proposed system mainly focuses on health care data security. To reduce the increase in cost of hospitalization, researchers are b...
Reading text from photographs is a challenging problem that has received a significant amount of attention. Two key components of most systems are (i) text detection from images and (ii) character recognition, and many recent methods have been proposed to design better feature representations and models for both. In this paper, we apply methods recently developed in machine learning -- specificall...
Video search today uses the metadata surrounding the video, ignoring its semantic content. Over the years, a lot of research has gone into indexing and browsing of sports video content. In this work, we present a novel approach for classification of events in cricket videos and thus summarize their visual content. The proposed method segments a cricket video into shots and identifies the visual con...
A photomosaic is an image assembled from smaller images called tiles. When a photomosaic is viewed from a distance, it resembles a desired target image. The process of photomosaic generation can be viewed as an optimization problem, where a set of tiles needs to be arranged to resemble a target image. We impose a constraint on the number of times a tile image can be repeated in a photomosaic. A ra...