Ying Cheng - IEEE Xplore Author Profile

Showing 1-4 of 4 results


Multimodal fusion offers significant potential for enhancing medical diagnosis, particularly in the Intensive Care Unit (ICU), where integrating diverse data sources is crucial. Traditional static fusion models often fail to account for sample-wise variations in modality importance, which can impact prediction accuracy. To address this issue, we propose a dynamic Uncertainty-Aware Weighting (UAW) …
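The abstract above describes weighting modalities per sample rather than with fixed global weights. The paper's actual UAW mechanism is not shown here; as an illustrative sketch only, one common way to realize sample-wise uncertainty-aware weighting is to weight each modality's prediction by the inverse of its predictive entropy, so that, for each individual sample, more confident modalities contribute more to the fused output. All function names below are hypothetical:

```python
import numpy as np

def entropy(probs, eps=1e-12):
    """Shannon entropy of categorical distributions, per sample."""
    return -np.sum(probs * np.log(probs + eps), axis=-1)

def uncertainty_weighted_fusion(modality_probs):
    """Fuse per-modality class probabilities with sample-wise weights.

    Illustrative sketch (not the paper's UAW): each modality's weight
    is the inverse of its predictive entropy for that sample,
    normalized across modalities.
    """
    stacked = np.stack(modality_probs)                      # (M, N, C)
    unc = np.stack([entropy(p) for p in modality_probs])    # (M, N)
    weights = 1.0 / (unc + 1e-6)
    weights /= weights.sum(axis=0, keepdims=True)           # normalize over modalities
    return np.einsum('mn,mnc->nc', weights, stacked)        # (N, C)
```

For a sample where one modality predicts [0.9, 0.1] and another predicts [0.5, 0.5], the confident modality receives roughly twice the weight, pulling the fused distribution toward its prediction.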
Although audio-visual representation has been proven applicable in many downstream tasks, the representation of dancing videos, which is more specific and typically accompanied by music with complex auditory content, remains challenging and largely unexplored. Considering the intrinsic alignment between the dancer's cadenced movements and the musical rhythm, we introduce MuDaR, a novel Music-Dance Rep…
Visual-only self-supervised learning has achieved significant improvement in video representation learning. Existing related methods encourage models to learn video representations by utilizing contrastive learning or designing specific pretext tasks. However, some models are likely to focus on the background, which is unimportant for learning video representations. To alleviate this problem, we p…
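The abstract above mentions contrastive learning as the standard self-supervised objective. The paper's specific method is not reproduced here; as background, a minimal sketch of the widely used InfoNCE contrastive loss (assumed, not taken from the paper) is:

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE contrastive loss over a batch of embedding pairs.

    Each anchor's positive is the row at the same index in `positives`
    (e.g. another clip or augmentation of the same video); all other
    rows in the batch act as negatives.
    """
    # L2-normalize embeddings so similarity is cosine similarity
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                    # (N, N) similarities
    logits -= logits.max(axis=1, keepdims=True)       # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))               # positives on the diagonal
```

Minimizing this loss pulls each anchor toward its own positive and pushes it away from the other samples in the batch, which is what encourages representations to capture sample-specific content rather than shared background.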
Audio-visual event localization aims to localize an event that is both audible and visible in the wild, a widespread audio-visual scene analysis task for unconstrained videos. To address this task, we propose a Multimodal Parallel Network (MPN), which can perceive global semantics and unmixed local information in parallel. Specifically, our MPN framework consists of a classification subnet…