Li Song - IEEE Xplore Author Profile

Author details

Li Song

Also published under: L. Song

Publications

197

Citations

1,537

Publications by Year

2006-2025

Co-Authors:

Lixun Bai, Jiang Bian, Hans Burkhardt, Weiyong Cai, Liean Cao

274 co-authors in total

Affiliation

School of Electronic Information and Electrical Engineering and the MoE Key Laboratory of Artificial Intelligence

AI Institute

Shanghai Jiao Tong University, Shanghai, China

Publication Topics

Compression Efficiency,
Latent Space,
Video Sequences,
Diffusion Model,
Generative Adversarial Networks,
Image Compression,
Multilayer Perceptron,
Quantization Parameter,
Video Coding,
3D Face,
Convolutional Layers,
Entropy Coding

Biography

Li Song (Senior Member, IEEE) received the B.E. and M.S. degrees in engineering in 1997 and 2000, respectively, and the Ph.D. degree in electrical engineering from Shanghai Jiao Tong University in 2005. He was a Faculty Member with Shanghai Jiao Tong University. He was also a Visiting Professor with Santa Clara University from 2011 to 2012. He is currently a Full Professor with the Department of Electronic Engineering. He has more than 300 publications, more than 50 granted patents, and 18 standards technical proposals in the field of video coding and image processing. His research interests include image processing, video coding, and multimedia systems. He was a recipient of the National Science and Technology Progress Award in 2015, the O...

Author's Published Works

Showing 1-25 of 197 results

Conferences (161)

Journals (35)

Early Access Articles (1)

Results

Implicit-Explicit Integrated Representations for Multi-View Video Compression

Chen Zhu; Guo Lu; Bing He; Rong Xie; Li Song

IEEE Transactions on Image Processing

Year: 2025 | Volume: 34 | Journal Article

With the increasing consumption of 3D displays and virtual reality, multi-view video has become a promising format. However, its high resolution and multi-camera shooting result in a substantial increase in data volume, making storage and transmission a challenging task. To tackle these difficulties, we propose an implicit-explicit integrated representation for multi-view video compression. Specif...

SSP-IR: Semantic and Structure Priors for Diffusion-based Realistic Image Restoration

Yuhong Zhang; Hengsheng Zhang; Zhengxue Cheng; Rong Xie; Li Song; Wenjun Zhang

IEEE Transactions on Circuits and Systems for Video Technology

Year: 2025 | Early Access Article

Realistic image restoration is a crucial task in computer vision, and diffusion-based models for image restoration have garnered significant attention due to their ability to produce realistic results. Restoration can be seen as a controllable generation conditioning on priors. However, due to the severity of image degradation, existing diffusion-based restoration methods cannot fully exploit prio...

An Efficient and Flexible Complexity Control Method for Versatile Video Coding

Yan Zhao; Chen Zhu; Jun Xu; Guo Lu; Li Song; Siwei Ma

IEEE Transactions on Broadcasting

Year: 2025 | Volume: 71, Issue: 1 | Journal Article

Cited by: Papers (1)

Recently, numerous complexity control approaches have been proposed to achieve the target encoding complexity. However, only few of them were developed for VVC encoders. This paper fills this gap by proposing an efficient and flexible complexity control approach for VVC. The support for both Acceleration Ratio Control (ARC) and Encoding Time Control (ETC) makes our method highly versatile for vari...

Content-Adaptive Rate-Quality Curve Prediction Model in Media Processing System

Shibo Yin; Zhiyu Zhang; Peirong Ning; Qiubo Chen; Jing Chen; Quan Zhou; Guo Lu; Li Song

2024 IEEE International Conference on Visual Communications and Image Processing (VCIP)

Year: 2024 | Conference Paper

In streaming media services, video transcoding is a common practice to alleviate bandwidth demands. Unfortunately, traditional methods employing a uniform rate factor (RF) across all videos often result in significant inefficiencies. Content-adaptive encoding (CAE) techniques address this by dynamically adjusting encoding parameters based on video content characteristics. However, existing CAE met...

Coarse-to-fine Transformer For Lossless 3D Medical Image Compression

Xiaoxuan Yang; Guo Lu; Donghui Feng; Zhengxue Cheng; Guosheng Yu; Li Song

2024 IEEE International Conference on Visual Communications and Image Processing (VCIP)

Year: 2024 | Conference Paper

The rapid advancements in medical imaging have led to a growing demand for high-performance lossless compression of large 3D medical image datasets. Unlike natural images, medical images typically feature three-dimensional structures, and high bit-depth, necessitating specialized compression techniques. Based on a decoder-only transformer, we propose a learnable dual-decoder model for lossless com...

AsymLLIC: Asymmetric Lightweight Learned Image Compression

Shen Wang; Zhengxue Cheng; Donghui Feng; Guo Lu; Li Song; Wenjun Zhang

2024 IEEE International Conference on Visual Communications and Image Processing (VCIP)

Year: 2024 | Conference Paper

Learned image compression (LIC) methods often employ symmetrical encoder and decoder architectures, inevitably increasing decoding time. However, practical scenarios demand an asymmetric design, where the decoder requires low complexity to cater to diverse low-end devices, while the encoder can accommodate higher complexity to improve coding performance. In this paper, we propose an asymmetric light...

Efficient Bitrate Ladder Construction for Per-Shot Adaptive Encoding

Yan Zhao; ZhengXue Cheng; Guo Lu; Rong Xie; Li Song

2024 IEEE International Conference on Visual Communications and Image Processing (VCIP)

Year: 2024 | Conference Paper

HTTP adaptive streaming (HAS) constructs bitrate ladders to deliver videos with the best possible quality under varying network conditions. Though per-shot content adaptive encoding (CAE) largely improves the compression efficiency by constructing the optimal bitrate ladder for each video shot, it suffers from excessive encoding complexity as all the points in the operating space (typically resolu...

SingAvatar: High-fidelity Audio-driven Singing Avatar Synthesis

Wentao Ma; Anni Tang; Jun Ling; Han Xue; Huiheng Liao; Yunhui Zhu; Li Song

2024 IEEE International Conference on Multimedia and Expo (ICME)

Year: 2024 | Conference Paper

Generating photo-realistic avatars from audio plays an important role in extended reality (XR) and metaverse. In this paper, we lift the input audio from speech to singing, which has been rarely studied. The significant distinction between singing and talking poses great challenges for adapting talking face generation methods to the singing regime. To address this, we propose a high-fidelity singi...

Efficient Dynamic-NeRF Based Volumetric Video Coding with Rate Distortion Optimization

Zhiyu Zhang; Guo Lu; Huanxiong Liang; Anni Tang; Qiang Hu; Li Song

2024 IEEE International Conference on Multimedia and Expo (ICME)

Year: 2024 | Conference Paper

Cited by: Papers (2)

Volumetric videos, benefiting from immersive 3D realism and interactivity, hold vast potential for various applications, while the tremendous data volume poses significant challenges for compression. Recently, NeRF has demonstrated remarkable potential in volumetric video compression thanks to its simple representation and powerful 3D modeling capabilities, where a notable work is ReRF. However, R...

A New People-Object Interaction Dataset and NVS Benchmarks

Shuai Guo; Houqiang Zhong; Qiuwen Wang; Ziyu Chen; Yijie Gao; Jiajing Yuan; Chenyu Zhang; Rong Xie; Li Song

2024 IEEE International Conference on Image Processing (ICIP)

Year: 2024 | Conference Paper

Recently, NVS in human-object interaction scenes has received increasing attention. Existing human-object interaction datasets mainly consist of static data with limited views, offering only RGB images or videos, mostly containing interactions between a single person and objects. Moreover, these datasets exhibit complexities in lighting environments, poor synchronization, and low resolution, hinde...

Long-Term and Short-Term Information Propagation and Fusion for Learned Video Compression

Shen Wang; Donghui Feng; Guo Lu; Zhengxue Cheng; Li Song; Wenjun Zhang

IEEE Transactions on Broadcasting

Year: 2024 | Volume: 70, Issue: 4 | Journal Article

In recent years, numerous learned video compression (LVC) methods have emerged, demonstrating rapid developments and satisfactory performance. However, in most previous methods, only the previous one frame is used as reference. Although some works introduce the usage of the previous multiple frames, the exploitation of temporal information is not comprehensive. Our proposed method not only utilize...

Visibility-Aware Human Mesh Recovery via Balancing Dense Correspondence and Probability Model

Yanjun Wang; Wenjia Wang; Jun Ling; Rong Xie; Li Song

2024 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)

Year: 2024 | Conference Paper

Reconstructing the human body mesh often faces challenges like self-occlusion, object occlusion, and interference from other people. However, focusing on the model's robustness in scenarios of occlusion leads to a compromise in the accuracy of estimating non-occluded humans. Striking the right balance is a research question worth exploring. In this study, we introduce the Visibility-aware Human Me...

Multimodal Semantic-Aware Automatic Colorization with Diffusion Prior

Han Wang; Xinning Chai; Yiwen Wang; Yuhong Zhang; Rong Xie; Li Song

2024 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)

Year: 2024 | Conference Paper

Colorizing grayscale images offers an engaging visual experience. Existing automatic colorization methods often fail to generate satisfactory results due to incorrect semantic colors and unsaturated colors. In this work, we propose an automatic colorization pipeline to overcome these challenges. We leverage the extraordinary generative ability of the diffusion prior to synthesize color with plausi...

LFCAVE: Interactive 3D Space with Multiple Light Field Displays

Haopeng Lu; Wenkang Shan; Yuhuai Zhang; Li Song; Xinfeng Zhang; Siwei Ma; Liuxin Zhang; Wen Gao

2024 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)

Year: 2024 | Conference Paper

We introduce LFCAVE, an interactive 3D display system comprised of display and interaction modules. In the display aspect, we have developed a multi-screen light field model that incorporates multiple consumer-grade light field displays for seamless multi-screen presentations. Compared to traditional single-screen setups, our system offers an expanded viewing angle and accommodates a larger number...

Detailed and Controllable Old Photo Restoration with Diffusion Priors

Xibei Liu; Han Wang; Yiwen Wang; Yuhong Zhang; Jiayi Song; Rong Xie; Li Song

2024 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)

Year: 2024 | Conference Paper

Restoring old photos that contain numerous unknown and complex defects is a challenging and ill-posed problem. Traditional methods often struggle to address both structured and unstructured defects in real old photos, frequently leading to over-smoothed and uncompleted results. In this paper, we exploit powerful diffusion priors to construct a novel solution for the restoration of old photos. Our ...

A Priority Aware Free Viewpoint Video Transmit Scheme Based on QUIC

Junyi Lu; Bingcong Lu; Jun Xu; Li Song; Wenjun Zhang

2024 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)

Year: 2024 | Conference Paper

The advent of Free Viewpoint Video (FVV) marks a significant evolution in internet video services, moving from traditional formats to more interactive and immersive experiences. Current free viewpoint video transmission systems face several challenges, including insufficient scalability in high-concurrency scenarios and additional response delays during interaction. To address these issues, we prop...

Identity-Consistent Video De-identification via Diffusion Autoencoders

Yunhui Zhu; Jingyi Cao; Bo Liu; Tingxi Chen; Rong Xie; Li Song

2024 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)

Year: 2024 | Conference Paper

With the rise of deep learning and the widespread use of face recognition, face image privacy has become a critical research issue. Face de-identification is acknowledged as effective for protecting identity privacy. As media formats diversify, it is imperative to extend privacy protection to videos. Addressing the core problem of identity consistency between frames, we propose a video de-identifi...

LaEC: Loss-aware Earliest Completion Data Scheduler for Multi-site Parallel Downloading

Bingqi Li; Jun Xu; Bingcong Lu; Junyi Lu; Rong Xie; Li Song

2024 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)

Year: 2024 | Conference Paper

The multi-path data scheduler stands as the pivotal element profoundly influencing the performance of any multipath transport. Multi-site parallel downloading (MPD), which emerges as an alternative to the costly traditional dedicated Content Delivery Network (CDN), requests video segments from multiple economical edge data nodes simultaneously. However, the existing data scheduler for MPD solely p...

No-reference Quality Assessment of Text-to-Image Generation

Haitao Huang; Rongli Jia; Yuhong Zhang; Rong Xie; Li Song; Lin Li; Yanan Feng

2024 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)

Year: 2024 | Conference Paper

This paper proposes a novel no-reference quality assessment method for text-to-image generation. Text-to-image refers to the process of generating image content from textual descriptions using deep learning models. Although advances in technology and improvements in models have made it possible to generate some high-quality images, some generated images still exhibit unique distortions that reflec...

Fast Video Deduplication and Localization With Temporal Consistence Re-Ranking

Chris Henry; Li Song; Zhu Li

IEEE Transactions on Circuits and Systems for Video Technology

Year: 2024 | Volume: 34, Issue: 11 | Journal Article

The use of social media networks and mobile devices has experienced tremendous growth in recent years. This has led to a surge in the number of videos recorded and uploaded to social media platforms like TikTok and YouTube. However, this increase has also resulted in the rise of illegal duplicate videos, which are essentially the same as the original videos but with minor editing effects and varia...

Memories are One-to-Many Mapping Alleviators in Talking Face Generation

Anni Tang; Tianyu He; Xu Tan; Jun Ling; Runnan Li; Sheng Zhao; Jiang Bian; Li Song

IEEE Transactions on Pattern Analysis and Machine Intelligence

Year: 2024 | Volume: 46, Issue: 12 | Journal Article

Cited by: Papers (1)

Talking face generation aims at generating photo-realistic video portraits of a target person driven by input audio. According to the nature of audio to lip motions mapping, the same speech content may have different appearances even for the same person at different occasions. Such one-to-many mapping problem brings ambiguity during training and thus causes inferior visual results. Although this o...

Depth-Guided Robust Point Cloud Fusion NeRF for Sparse Input Views

Shuai Guo; Qiuwen Wang; Yijie Gao; Rong Xie; Lin Li; Fang Zhu; Li Song

IEEE Transactions on Circuits and Systems for Video Technology

Year: 2024 | Volume: 34, Issue: 9 | Journal Article

Cited by: Papers (3)

Novel-view synthesis with sparse input views is important for practical applications such as AR/VR and autonomous driving. Many works in this field have already integrated depth information into NeRF, utilizing depth priors for assistance in geometric and spatial understanding. However, most existing work tends to either overlook the inaccuracies in depth maps or only handle them roughly, limiting...

A Character Position-Aware Compression Framework for Screen Text Image

Chen Zhu; Guo Lu; Huanbang Chen; Donghui Feng; Shen Wang; Yan Zhao; Rong Xie; Li Song

IEEE Transactions on Circuits and Systems for Video Technology

Year: 2024 | Volume: 34, Issue: 9 | Journal Article

Text patterns typically exhibit distinct boundaries and sparse color histograms. However, in current hybrid codec frameworks, the positions of coding units are often misaligned with the text patterns, resulting in prediction and color mapping tools consuming a large number of bits to indicate these patterns. Nowadays, some text detection and recognition methods have been proposed to accurately loc...

Real-Time Free Viewpoint Video Synthesis System Based on DIBR and a Depth Estimation Network

Shuai Guo; Jingchuan Hu; Kai Zhou; Jionghao Wang; Li Song; Rong Xie; Wenjun Zhang

IEEE Transactions on Multimedia

Year: 2024 | Volume: 26 | Journal Article

Cited by: Papers (6)

Depth image-based rendering (DIBR) view synthesis is the most widely employed method in real-time FVV research. Despite recent progress, most DIBR-based FVV synthesis approaches are not sufficiently simple and effective in filling holes and artifacts. Additionally, they use RGB-D cameras, which are difficult to widely adopt or take considerable time to estimate high-quality depth images. This arti...

EffiHDR: An Efficient Framework for HDRTV Reconstruction and Enhancement in UHD Systems

Hengsheng Zhang; Xueyi Zou; Guo Lu; Li Chen; Li Song; Wenjun Zhang

IEEE Transactions on Broadcasting

Year: 2024 | Volume: 70, Issue: 2 | Journal Article

Recent advancements in SDRTV-to-HDRTV conversion have yielded impressive results in reconstructing high dynamic range television (HDRTV) videos from standard dynamic range television (SDRTV) videos. However, the practical applications of these techniques are limited for ultra-high definition (UHD) video systems due to their high computational and memory costs. In this paper, we propose EffiHDR, an...

A public charity, IEEE is the world's largest technical professional organization dedicated to advancing technology for the benefit of humanity.

© Copyright 2025 IEEE - All rights reserved, including rights for text and data mining and training of artificial intelligence and similar technologies.
