Huaidong Zhang - IEEE Xplore Author Profile

Single-image 3D shape reconstruction has attracted significant attention with the advance of generative models. Recent studies have utilized diffusion models to achieve unprecedented shape reconstruction quality. However, these methods, in each sampling step, perform denoising in a single forward pass, leading to cumulative errors that severely impact the geometric consistency of the generated sha...
The vulnerability of 3D point cloud analysis to unpredictable rotations poses an open yet challenging problem: orientation-aware 3D domain generalization. Cross-domain robustness and adaptability of 3D representations are crucial but not easily achieved through rotation augmentation. Motivated by the inherent advantages of intricate orientations in enhancing generalizability, we propose an innovat...
Throughout history, static paintings have captivated viewers within display frames, yet the possibility of making these masterpieces vividly interactive remains intriguing. This research paper introduces 3DArtmator, a novel approach that aims to represent artforms in a highly interpretable stylized space, enabling 3D-aware animatable reconstruction and editing. Our rationale is to transfer the int...
In Visual Question Answering (VQA), recognizing and localizing entities pose significant challenges. Pretrained vision-and-language models have addressed this problem by providing a text description as the answer. However, in visual scenes with multiple entities, textual descriptions struggle to distinguish the entities from the same category effectively. Consequently, the VQA dataset is limited b...
Existing methods for asymmetric image retrieval employ a rigid pairwise similarity constraint between the query network and the larger gallery network. However, these one-to-one constraint approaches often fail to maintain retrieval order consistency, especially when the query network has limited representational capacity. To overcome this problem, we introduce the Decoupled Differential Distillat...
In this paper, we delve into a new aspect of learning novel diffusion conditions with datasets an order of magnitude smaller. The rationale behind our approach is the elimination of textual constraints during the few-shot learning process. To that end, we implement two optimization strategies. The first, prompt-free conditional learning, utilizes a prompt-free encoder derived from a pre-trained ...
Lumbar disc herniation, as one of the most common spinal degeneration diseases, significantly affects the quality of people’s lives. Effective identification and diagnosis of this disease is highly demanded and crucial to improve lumbar disc health care. In this paper, we propose a unified framework for diagnosing multiple lumbar degeneration diseases in MRI. Considering the basis of diagnosis is ...
Reversible face anonymization, unlike traditional face pixelization, seeks to replace sensitive identity information in facial images with synthesized alternatives, preserving privacy without sacrificing image clarity. Traditional methods, such as encoder-decoder networks, often result in significant loss of facial details due to their limited learning capacity. Additionally, relying on latent man...
The degradation of printed photographs due to inadequate preservation is a major problem that can be addressed through deep learning-based restoration methods. However, these methods are often limited by their reliance on annotated data, making them less effective for new domains with limited training samples. In this paper, we propose a semi-supervised old photo restoration network that employs a...
Text-to-image generation models have significantly broadened the horizons of creative expression through the power of natural language. However, navigating these models to generate unique concepts, alter their appearance, or reimagine them in unfamiliar roles presents an intricate challenge. For instance, how can we exploit language-guided models to transpose an anime character into a different ar...
Previous Knowledge Distillation based efficient image retrieval methods employ a lightweight network as the student model for fast inference. However, the lightweight student model lacks adequate representation capacity for effective knowledge imitation during the most critical early training period, causing final performance degeneration. To tackle this issue, we propose a Capacity Dynamic Dist...
Image generation relies on massive training data and can hardly produce diverse images of an unseen category from a few examples. In this paper, we address this dilemma by projecting sparse few-shot samples into a continuous latent space that can potentially generate infinite unseen samples. The rationale is that we aim to locate a centroid latent position in a conditional StyleGAN...
Using a sequence of discrete still images to tell a story or introduce a process has become a tradition in the field of digital visual media. With the surge in these media and the requirements in downstream tasks, acquiring their main topics or genres in a very short time is urgently needed. As a representative form of the media, comic enjoys a huge boom as it has gone digital. However, different ...
Printed photographs can be easily warped, wrinkled, and even deteriorated over time. Existing methods treat the restoration of scratches as a pure inpainting problem that neglects the underlying corrupted contextual knowledge. However, important underlying contents are hidden behind the scratches, which are essential hints for producing a semantically consistent result. Motivated by this insight, ...
Unsupervised domain adaptation for object detection aims to generalize the object detector trained on the label-rich source domain to the unlabeled target domain. Recent works adopt instance-level or pixel-level alignment to perform domain transfer, which can effectively avoid the negative transfer due to the diverse background between domains. However, we find that they ...
Ship classification is one of the most essential tasks in ship surveillance, which is an important but challenging problem. Most existing methods are designed for remote sensing images, and few works process natural ship images captured by cameras. In this paper, we design a new framework called Adaptive Selecting and Learning Network (ASL), to solve the problem of fine-grained classifi...
Deep learning has been recently demonstrated as an effective tool for raster-based sketch simplification. Nevertheless, it remains challenging to simplify extremely rough sketches. We found that a simplification network trained with a simple loss, such as pixel loss or discriminator loss, may fail to retain the semantically meaningful details when simplifying a very sketchy and complicated drawing...
Temporal repetition counting aims to estimate the number of cycles of a given repetitive action. Existing deep learning methods assume repetitive actions are performed at a fixed time scale, which is invalid for the complex repetitive actions in real life. In this paper, we tailor a context-aware and scale-insensitive framework to tackle the challenges in repetition counting caused by the unknown...
Unsupervised domain adaptation aims to generalize a model from the label-rich source domain to the unlabeled target domain. Existing works mainly focus on aligning the global distribution statistics between source and target domains. However, they neglect distractions from the unexpected noisy samples in domain distribution estimation, leading to domain misalignment or even negative transfer. In t...
Taking photos through a glass window leads to glare or reflection, which might distract the viewer from the scene behind the window. In this paper, we involve user interaction to tackle the ill-posedness of the reflection removal problem. Users are allowed to draw strokes or lassos to indicate the background and reflection layers. Instead of designing hand-crafted features, we propose the edge-awa...