I. Introduction
New devices enabling extended reality (XR) applications are driving the industry to explore novel, more immersive services for end users [1], [2]. One of the emerging experiences is volumetric video. Just like traditional two-dimensional (2D) video, volumetric video consists of a sequence of frames. But while a 2D video frame is fixed to a single predefined viewport, a volumetric video frame represents a 3D space from which the viewer can generate novel views with different positions and orientations. An application that could greatly benefit from volumetric video is real-time immersive communication. As in a 2D video system, delivery of volumetric video begins with capturing the content, either from the real world or from synthetic sources. Regardless of which method is used, the amount of data needed to represent the raw content is too large to be transmitted as is. One standardized method to compress this type of data is Visual Volumetric Video-based Coding (V3C) [3]. Whereas previous implementations have demonstrated on-demand delivery of pre-encoded V3C media [4], this paper presents a system architecture and encoder optimizations that enable real-time volumetric video delivery using V3C.