Loading [MathJax]/extensions/MathZoom.js
Yun Liu - IEEE Xplore Author Profile

Showing 1-25 of 40 results

Filter Results

Show

Results

Transformer-based methods have exhibited remarkable potential in single image super-resolution (SISR) by effectively extracting long-range dependencies. However, most of the current research in this area has prioritized the design of transformer blocks to capture global information, while overlooking the importance of incorporating high-frequency priors, which we believe could be beneficial. In ou...Show More
Non-maximum suppression (NMS) is an essential post-processing step for object detection. The de-facto standard for NMS, namely GreedyNMS, is not parallelizable and could thus be the performance bottleneck in object detection pipelines. MaxpoolNMS is introduced as a fast and parallelizable alternative to GreedyNMS. However, MaxpoolNMS is only capable of replacing the GreedyNMS at the first stage of...Show More
Many point-based semantic segmentation methods have been designed for indoor scenarios, but they struggle if they are applied to point clouds that are captured by a light detection and ranging (LiDAR) sensor in an outdoor environment. In order to make these methods more efficient and robust such that they can handle LiDAR data, we introduce the general concept of reformulating 3-D point-based oper...Show More
Intuitively, global video context could benefit video semantic segmentation (VSS) if it is designed to simultaneously model global temporal and spatial dependencies for a holistic understanding of the semantic scenes in a video clip. However, we found that the existing VSS approaches focus only on modeling local video context. This paper attempts to bridge this gap by learning global video context...Show More
Contextual information plays a core role for video semantic segmentation (VSS). This paper summarizes contexts for VSS in two-fold: local temporal contexts (LTC) which define the contexts from neighboring frames, and global temporal contexts (GTC) which represent the contexts from the whole video. As for LTC, it includes static and motional contexts, corresponding to static and moving content in n...Show More
Precise and rapid delineation of sharp boundaries and robust semantics is essential for numerous downstream robotic tasks, such as robot grasping and manipulation, real-time semantic mapping, and online sensor calibration performed on edge computing units. Although boundary detection and semantic segmentation are complementary tasks, most studies focus on lightweight models for semantic segmentati...Show More
Tuberculosis (TB) is a major global health threat, causing millions of deaths annually. Although early diagnosis and treatment can greatly improve the chances of survival, it remains a major challenge, especially in developing countries. Recently, computer-aided tuberculosis diagnosis (CTD) using deep learning has shown promise, but progress is hindered by limited training data. To address this, w...Show More
Existing salient object detection (SOD) methods mainly rely on U-shaped convolution neural networks (CNNs) with skip connections to combine the global contexts and local spatial details that are crucial for locating salient objects and refining object details, respectively. Despite great successes, the ability of CNNs in learning global contexts is limited. Recently, the vision transformer has ach...Show More
The rapid spread of the new pandemic, i.e., coronavirus disease 2019 (COVID-19), has severely threatened global health. Deep-learning-based computer-aided screening, e.g., COVID-19 infected area segmentation from computed tomography (CT) image, has attracted much attention by serving as an adjunct to increase the accuracy of COVID-19 screening and clinical diagnosis. Although lesion segmentation i...Show More
Transformer-based methods have exhibited remarkable potential in single image super-resolution (SISR) by effectively extracting long-range dependencies. However, most of the current research in this area has prioritized the design of transformer blocks to capture global information, while overlooking the importance of incorporating high-frequency priors, which we believe could be beneficial. In ou...Show More
Recently, indiscernible scene understanding has attracted a lot of attention in the vision community. We further advance the frontier of this field by systematically studying a new challenge named indiscernible object counting (IOC), the goal of which is to count objects that are blended with respect to their surroundings. Due to a lack of appropriate IOC datasets, we present a large-scale dataset...Show More
The aim of this paper is to propose a large scale dataset for image restoration (LSDIR). Recent work in image restoration has been focused on the design of deep neural networks. The datasets used to train these networks ‘only’ contain some thousands of images, which is still incomparable with the large scale datasets for other vision tasks such as visual recognition and object detection. The small...Show More
Recently, the vision transformer has achieved great success by pushing the state-of-the-art of various vision tasks. One of the most challenging problems in the vision transformer is that the large sequence length of image tokens leads to high computational cost (quadratic complexity). A popular solution to this problem is to use a single pooling operation to reduce the sequence length. This paper...Show More
Recent progress in salient object detection (SOD) mainly depends on the Atrous Spatial Pyramid Pooling (ASPP) module for multi-scale learning. Intuitively, different input images, different pixels, and different network layers may have different preferences for various feature scales. However, ASPP treats all feature scales as equally important by a simple sum operation. To this end, we propose At...Show More
With the success of deep learning in classifying short trimmed videos, more attention has been focused on temporally segmenting and classifying activities in long untrimmed videos. State-of-the-art approaches for action segmentation utilize several layers of temporal convolution and temporal pooling. Despite the capabilities of these approaches in capturing temporal dependencies, their predictions...Show More
The contextual information plays a core role in semantic segmentation. As for video semantic segmentation, the contexts include static contexts and motional contexts, corresponding to static content and moving content in a video clip, respectively. The static contexts are well exploited in image semantic segmentation by learning multi-scale and global/long-range features. The motional contexts are...Show More
Recent progress on salient object detection (SOD) mainly benefits from multi-scale learning, where the high-level and low-level features collaborate in locating salient objects and discovering fine details, respectively. However, most efforts are devoted to low-level feature learning by fusing multi-scale features or enhancing boundary representations. High-level features, which although have long...Show More
The high computational cost of neural networks has prevented recent successes in RGB-D salient object detection (SOD) from benefiting real-world applications. Hence, this article introduces a novel network, MobileSal, which focuses on efficient RGB-D SOD using mobile networks for deep feature extraction. However, mobile networks are less powerful in feature representation than cumbersome networks....Show More
Real-time semantic segmentation of LiDAR data is crucial for autonomously driving vehicles and robots, which are usually equipped with an embedded platform and have limited computational resources. Approaches that operate directly on the point cloud use complex spatial aggregation operations, which are very expensive and difficult to deploy on embedded platforms. As an alternative, projection-base...Show More
Recent progress on salient object detection mainly aims at exploiting how to effectively integrate multiscale convolutional features in convolutional neural networks (CNNs). Many popular methods impose deep supervision to perform side-output predictions that are linearly aggregated for final saliency prediction. In this article, we theoretically and experimentally demonstrate that linear aggregati...Show More
Weakly supervised semantic instance segmentation with only image-level supervision, instead of relying on expensive pixel-wise masks or bounding box annotations, is an important problem to alleviate the data-hungry nature of deep learning. In this article, we tackle this challenging problem by aggregating the image-level information of all training images into a large knowledge graph and exploitin...Show More
Differentiable Architecture Search (DARTS) has attracted extensive attention due to its efficiency in searching for cell structures. DARTS mainly focuses on the operation search and derives the cell topology from the operation weights. However, the operation weights can not indicate the importance of cell topology and result in poor topology rating correctness. To tackle this, we propose to Decoup...Show More
Much of the recent efforts on salient object detection (SOD) have been devoted to producing accurate saliency maps without being aware of their instance labels. To this end, we propose a new pipeline for end-to-end salient instance segmentation (SIS) that predicts a class-agnostic mask for each detected salient instance. To better use the rich feature hierarchies in deep networks and enhance the s...Show More
Recent progress on salient object detection (SOD) mostly benefits from the explosive development of Convolutional Neural Networks (CNNs). However, much of the improvement comes with the larger network size and heavier computation overhead, which, in our view, is not mobile-friendly and thus difficult to deploy in practice. To promote more practical SOD systems, we introduce a novel Stereoscopicall...Show More
Recently, salient object detection (SOD) has witnessed vast progress with the rapid development of convolutional neural networks (CNNs). However, the improvement of SOD accuracy comes with the increase in network depth and width, resulting in large network size and heavy computational overhead. This prevents state-of-the-art SOD methods from being deployed into practical platforms, especially mobi...Show More