Caiyan Jia - IEEE Xplore Author Profile

Showing 1-19 of 19 results


Scene text recognition (STR) methods have struggled to attain high accuracy and fast inference speed. Auto-Regressive (AR)-based models implement the recognition in a character-by-character manner, showing superiority in accuracy but with slow inference speed. Alternatively, Parallel Decoding (PD)-based models infer all characters in a single decoding pass, offering faster inference speed but gene...
Multi-modal models have shown appealing performance in visual recognition tasks, as free-form text-guided training evokes the ability to understand fine-grained visual content. However, current models cannot be trivially applied to scene text recognition (STR) due to the compositional difference between natural and text images. We propose a novel instruction-guided scene text recognition (IGTR) pa...
LiDAR-based sparse 3-D object detection plays a crucial role in autonomous driving applications due to its computational efficiency advantages. Existing methods either use the features of a single central voxel as an object proxy or treat an aggregated cluster of foreground points as an object proxy. However, the former cannot aggregate contextual information, resulting in insufficient information...
In the realm of modern autonomous driving, the perception system is indispensable for accurately assessing the state of the surrounding environment, thereby enabling informed prediction and planning. The key step to this system is related to 3D object detection that utilizes vehicle-mounted sensors such as LiDAR and cameras to identify the size, the category, and the location of nearby objects. De...
Pathological virtual re-staining is a valuable research topic in AI-aided diagnosis, as it reduces the need for costly and time-consuming physical staining. However, existing methods still suffer from the insufficient ability to preserve tissue microstructure and cellular details, making the generated images less convincing. In this paper, we propose a CycleGAN-based dual contrastive learning re-s...
LiDAR and camera are complementary sensors for 3D object detection in autonomous driving. However, it is challenging to explore the unnatural interaction between point clouds and images, and the critical factor is how to conduct feature alignment of these heterogeneous modalities. Currently, many methods achieve feature alignment through projection calibration, without accounting for the impact of...
LiDAR and cameras are complementary sensors for 3D object detection in autonomous driving. However, it is challenging to explore the unnatural interaction between point clouds and images, and the critical factor is how to conduct feature alignment of heterogeneous modalities. Currently, many methods achieve feature alignment by projection calibration only, without considering the problem of coordi...
Light detection and ranging (LiDAR)–camera fusion can enhance the performance of 3-D object detection by utilizing complementary information between depth-aware LiDAR points and semantically rich images. Existing voxel-based methods face significant challenges when fusing sparse voxel features with dense image features in a one-to-one manner, resulting in the loss of the advantages of images, incl...
The 3-D object detection with light detection and ranging (LiDAR) point clouds is a challenging problem, which requires 3-D scene understanding, yet this task is critical to autonomous driving. Existing voxel-based 3-D object detectors are becoming increasingly popular but have several shortcomings. For example, during voxelization, features of distant sparse point clouds are largely discarded, wh...
Autonomous vehicles require constant environmental perception to obtain the distribution of obstacles to achieve safe driving. Specifically, 3D object detection is a vital functional module as it can simultaneously predict surrounding objects' categories, locations, and sizes. Generally, autonomous vehicles are equipped with multiple sensors, including cameras and LiDARs. The fact that single-moda...
Ships in remote sensing images are usually small, densely packed, and oriented in arbitrary directions. As a result, existing object detection algorithms cannot detect ships quickly and accurately. To solve these problems, a lightweight object detection network for fast ship detection is proposed. The network is composed of a backbone network, a four-scale fusion network and ...
Variational autoencoder (VAE) is considered an emerging model for ensuring competitive performance in recommender systems. However, its performance is severely limited by the amount of training examples and, as a result, existing VAE models may fail to provide satisfactory recommendation results in the presence of highly sparse user-item interactions. In this paper, we propose a self-supervised VA...
Low-cost live broadcasting is highly anticipated for soccer matches, which currently involve heavy expenses in personnel and equipment. This demo showcases the autoSoccer system as an approach to this goal. It takes a fixed-view panoramic soccer video as input and automatically generates a live broadcast that continuously focuses on the most interesting area of the soccer field. To this end, a novel...
Object detection is one of the most active research fields in computer vision, and its performance has been boosted by recent advances in deep learning. However, performance improvements are often accompanied by greater resource consumption, which in turn restricts applications. There has been rising interest in building object detectors that run well on embedded systems, i.e., with a better t...
Attributed graphs have attracted much attention in recent years. Different from conventional graphs, attributed graphs involve two different types of heterogeneous information, i.e., structural information, which represents the links between the nodes, and attribute information on each of the nodes. Clustering on attributed graphs usually requires the fusion of both types of information in order t...
Underwater coral reef fish detection is a topic receiving increasing attention due to its importance in applications such as fish biodiversity monitoring and marine resource management. However, compared with studies on generic object detection, existing methods for this task are not yet mature, and advanced deep models and technologies are seldom considered. This paper presents FFDet, ...
Community structure is one of the most important properties for understanding the topology and function of a complex network. Recently, the rank reduction technique, non-negative matrix factorization (NMF), has been successfully used to uncover communities in complex networks. In the machine learning literature, the algorithm Alternating Constraint Least Squares (ACLS) is developed to perform NMF ...
Network modeling and analysis have been developed as one of the promising approaches for exploring the regularities behind the phenomena of complex organization and interactions in many significant fields. Traditional Chinese medicine (TCM) is a holistic medical science in whose clinical practice herb prescriptions consisting of several distinct herbs are typically used for individualized p...
K-means clustering is widely used due to its fast convergence, but it is sensitive to the initial condition. Therefore, many methods of initializing K-means clustering have been proposed in the literature. Compared with K-means clustering, a novel clustering algorithm called affinity propagation (AP clustering) has been developed by Frey and Dueck, which can produce a good set of cluster exemplars ...
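
The sensitivity of K-means to its initial condition, mentioned in the last abstract above, can be illustrated with a minimal sketch. This is not the method of any listed paper: it is plain Lloyd-style K-means on made-up 1-D data, and the function name `kmeans_1d`, the data points, and the two sets of starting centers are all assumptions chosen for illustration.

```python
def kmeans_1d(points, centers, max_iter=100):
    """Plain Lloyd iterations on 1-D data; returns (centers, sse).

    Hypothetical helper for illustration only.
    """
    centers = list(centers)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments are stable, so the algorithm has converged
        centers = new_centers
    # Sum of squared distances from each point to its nearest center.
    sse = sum(min((p - c) ** 2 for c in centers) for p in points)
    return centers, sse


# Made-up data: three well-separated groups of three points each.
points = [0, 1, 2, 10, 11, 12, 20, 21, 22]

# One starting center per group versus all starting centers in the first group.
good_centers, good_sse = kmeans_1d(points, [0, 10, 20])
bad_centers, bad_sse = kmeans_1d(points, [0, 1, 2])

print("spread-out start:", good_centers, good_sse)
print("clustered start: ", bad_centers, bad_sse)
```

With these assumed inputs, both runs converge, but the second stops at a much worse local optimum because two of its centers never leave the first group. This is the kind of behavior that initialization methods aim to avoid.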