Xiaojun Wu - IEEE Xplore Author Profile

Showing 1-25 of 30 results

Current sketch extraction methods either require extensive training or fail to capture a wide range of artistic styles, limiting their practical applicability and versatility. We introduce Mixture-of-Self-Attention (MixSA), a training-free sketch extraction method that leverages strong diffusion priors for enhanced sketch perception. At its core, MixSA employs a mixture-of-self-attention technique...
Graph convolutional networks that leverage spatial-temporal information from skeletal data have emerged as a popular approach for 3D human pose estimation. However, comprehensively modeling consistent spatial-temporal dependencies among the body joints remains a challenging task. Current approaches are limited by performing graph convolutions solely on immediate neighbors, deploying separate spati...
Font generation is a challenging research topic in the field of computer vision, aiming to apply the style from a reference image to a source character image. The core task of font generation is to generate images, especially complex character images, that closely resemble the style of the reference font while maintaining the original glyph structure. This paper proposes a novel font generation me...
The performance of music source separation (MSS) has been greatly improved in recent years due to the rapid development of various neural network architectures. Spectrogram features, which contain both time and frequency information, are widely used in MSS tasks. However, the time and frequency correlations and the local patterns of the spectrogram have not been fully explored. In this paper, we pro...
Folk songs have unique forms of expression and complex semantic relationships, and existing models are poor at recognizing the emotion category of folk song lyrics, so an attention-based TIG-CNN-BiGRU emotion classification model is proposed. Specifically, we first propose a text vector representation based on TF-IDF feature-weighted pretrained GloVe word embeddings to extract important featur...
A dataset was constructed for the intelligent generation of Chinese landscape paintings based on deep learning, providing scientific support for passing on traditional Chinese culture. The dataset originates from two of the top ten paintings of ancient China, i.e., “Dwelling in the Fuchun Mountains” in the ink wash style and “A Thousand Li of Rivers and Mountains” in the style of blue ...
In online social networks, information diffusion presents a complicated dynamic process that accompanies users' lives, in which the link between multi-information symbiosis and conflict is frequently overlooked. We investigate two aspects that influence the process of diffusion by examining the phenomena of information dissemination between individuals and the general environment in online social ...
Quantitative assessment of human pose quality is becoming more and more important in various real-world applications. This paper presents a novel optimal sub-pattern assignment based human pose assessment (OSPA-HPA) algorithm for automatically quantifying how well people perform poses. We model the human pose as a collection of finite sets of features. Then, the pose quality is measured using the ...
Using various deep learning networks to generate music is a research hotspot in the field of human intelligence. Owing to the limitations of the network structures in existing melody generation models, the quality of the generated melodies is poor. In this paper, we propose a melody generation network for folk melodies based on CNN-BiGRU and Self-Attention. First, the processed data is input into the CNN mo...
Chinese characters have distilled the Chinese nation's vast wisdom and values, but the general public's learning and enjoyment of ancient scripts are hampered by the fact that fonts from different dynasties have highly different styles, intricate structures, and diverse deformations. To address the difficulty that ordinary people have in identifying ancient Chinese characters, an ancient font recognition syst...
Recent research has shown that the deep denoising autoencoder is an effective method for noise reduction and speech enhancement and can provide better performance than several existing methods. However, training deep denoising autoencoders has proven to be computationally difficult. The goal of this study is to develop a modular approach for training deep denoising autoencoders as a set of classif...
Joint extraction of entities and relations is an important task for knowledge graph construction and information extraction. However, small interference in the text can greatly change the semantics of words and sentences in natural language, thereby affecting the model’s prediction results. To address the sensitivity of text to noise, we present a method called BERT of Adversarial Trai...
This paper presents a novel information divergence based multisensor selection approach for multitarget tracking. Multitarget states are modelled as multi-Bernoulli random finite sets and multisensor selection is studied under the partially observed Markov decision process (POMDP) framework. Multisensor selection under POMDP is essentially a global combinatorial optimization problem that is challe...
With the rapid development of wireless sensing technology and the popularity of the internet of things, applications based on wireless sensor networks have been applied universally in many scenarios. Sweep coverage is an emerging technique to improve the quality of service by promoting the energy efficiency of the sensors. To solve the distance-sensitive-route scheduling problem, which considering...
Speech enhancement techniques in hearing applications aim to improve the quality of speech in a noisy environment. The deep denoising autoencoder efficiently suppresses noise in noise-corrupted speech. Unfortunately, previous applications provide only limited benefits for the enhancement of speech in noisy environments. This paper presents a new approach for the hearing application, which indicate...
To overcome problems caused by improper parameter selection when applying the least mean square (LMS), normalized LMS (NLMS), or recursive least squares (RLS) algorithms to estimate the coefficients of a second-order Volterra filter, a novel Davidon-Fletcher-Powell-based second-order Volterra filter (DFP-SOVF) is proposed. Analyses of computational complexity and stability are presented. Simul...
Image stitching can be used in 3D reconstruction to obtain comprehensive obstacle information, which plays an important role in the field of mobile robots. However, previous algorithms have two problems: 1. The linear structure of the image might have been corrupted. 2. Some inconsistency may exist in the transitional region of the stitched image. To solve the above problems, in this p...
Recently, novel convolutional neural network architectures have brought tremendous advances in the image denoising task. An effective and efficient multi-level network architecture for image denoising restores the latent clean image from coarser to finer scales and passes features through multiple levels of the model. Unfortunately, the bottleneck of applying m...
Multi-object tracking (MOT) algorithms based on the tracking-by-detection framework have been the state-of-the-art trackers in recent years. Association optimization and the association affinity model are two key parts of MOT, and building effective association models to overcome ambiguous detection responses has attracted attention. In this paper, we propose an online multi-pedestrian tracking algo...
Because it can handle a varying number of objects, the tracking-by-detection framework has become increasingly popular for the multiobject tracking (MOT) problem. However, the tracking performance heavily depends on the object detector. Considering that data association optimization and the association affinity model are two key parts of MOT, an online multipedestrian tracking method is proposed to...
To solve the parameter selection problem that arises when applying the recursive least squares (RLS), least mean square (LMS), or normalized LMS (NLMS) algorithms to estimate the kernels of a second-order Volterra filter (SOVF), a novel adaptive gbest-guide artificial bee colony (AGABC) optimization algorithm is used to derive the Volterra kernels, giving an AGABC-SOVF prediction model with an explicit c...
Speech signals are nonlinear chaotic time series. This paper proposes a novel speech signal nonlinear prediction model with the hidden phase space reconstruction method. The parameters (embedding dimension $m$, time delay $\tau$, and model structure) are solved simultaneously, removing the restriction in existing prediction methods that the phase space must be reconstructed before modeling....
With the rapid rise of computer vision and driverless technology, vehicle model recognition plays a major role in everyday applications and in industry. However, fine-grained vehicle model recognition is often influenced by multi-level information, such as image perspective, inter-feature similarity, and vehicle details. Furthermore, pivotal region extraction and fine-grained feature learning hav...
In this paper, we propose a novel network for super-resolution and achieve state-of-the-art performance with limited parameters. Inspired by previous methods, we use ResNet to learn the residual part of the input patches. In addition, we introduce an inception-like structure that helps to extract features, and a weight sharing mechanism is utilized among these inception blocks. By cascading...
Single image super resolution has achieved a significant breakthrough with the development of deep learning technology. Among these approaches based on deep learning, the mainstream method is to build a cascading network and attempt to add more learning layers. However, as the depth of the model increases, features far away from the reconstruction layer are less considered in the reconstruction pr...