Video coding of dynamic 3D point cloud data
Related papers
The 24th International Conference on 3D Web Technology
In recent years, 3D point clouds have enjoyed great popularity for representing both static and dynamic 3D objects. Compared to 3D meshes, they offer the advantage of a simpler, denser and closer-to-reality representation. However, point clouds carry a huge amount of data: for a typical point cloud with 0.7 million points per 3D frame at 30 fps, the raw video requires a bandwidth of around 500 MB/s. Efficient compression methods are therefore mandatory for ensuring the storage and transmission of such data, which includes both geometry and attribute information. In the past few years, 3D point cloud compression (3D-PCC) has emerged as a new field of research, and an ISO/MPEG standardization process on 3D-PCC is currently ongoing. This paper proposes a comprehensive overview of state-of-the-art 3D-PCC methods. Different families of approaches are identified, described in detail and summarized: 1D traversal compression; 2D-oriented techniques, which leverage existing 2D image/video compression technologies; and purely 3D approaches, based on a direct analysis of the 3D data. CCS CONCEPTS • Computing methodologies → Point-based models; • Information systems → Data compression; • General and reference → Surveys and overviews.
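To make the bandwidth figure concrete, here is a quick back-of-the-envelope sketch; the per-point payload below (float32 geometry and normals, uint8 RGB) is an assumption, since the abstract does not state which attributes are counted:

```python
# Raw bandwidth estimate for an uncompressed dynamic point cloud.
# Assumed payload: 3 x float32 geometry + 3 x uint8 RGB + 3 x float32 normals.
points_per_frame = 700_000
fps = 30

bytes_geometry = 3 * 4   # x, y, z as float32
bytes_color    = 3 * 1   # R, G, B as uint8
bytes_normal   = 3 * 4   # optional per-point normal as float32

bytes_per_point = bytes_geometry + bytes_color + bytes_normal
bandwidth_mb_s = points_per_frame * fps * bytes_per_point / 1e6
print(f"{bandwidth_mb_s:.0f} MB/s")  # ~567 MB/s, same order as the ~500 MB/s cited
```

Dropping the normals brings the estimate to about 315 MB/s, so the cited ~500 MB/s sits between the two, depending on which attributes the capture carries.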
Adaptive Plane Projection for Video-Based Point Cloud Coding
2019
One of the most promising emerging 3D representation paradigms is the point cloud model, notably due to the new set of applications it enables, from immersive telepresence to 3D geographic information systems. Recognizing the potential of this representation model, MPEG has launched a standardization project to specify efficient point cloud coding solutions [1]. This project has given rise to the so-called Video-based Point Cloud Coding (V-PCC) standard, which projects a dynamic point cloud's geometry and texture into a sequence of images to be coded with HEVC. In V-PCC, this projection is always performed onto the same projection planes, independently of the point cloud. This paper proposes a more flexible coding solution, which adopts the V-PCC architecture but selects a set of projection planes better adapted to the characteristics of the point cloud, to obtain better compression performance. In this paper, the tool is applied to static point clouds, but it may be extended to dynamic point clouds. The experimental results show an improvement in geometry compression performance over V-PCC, especially at medium to high rates.
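As an illustration of what "selecting projection planes adapted to the point cloud" can mean, the sketch below rotates the cloud so its principal axes become the projection axes. This is only one plausible plane-selection criterion, used here as a hedged stand-in; it is not claimed to be the paper's actual tool:

```python
import numpy as np

def adaptive_projection_axes(points: np.ndarray) -> np.ndarray:
    """Return a 3x3 rotation whose rows are candidate projection axes.

    Uses PCA of the point positions; an illustrative stand-in for the
    paper's plane-selection criterion, not the V-PCC tool itself.
    """
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / len(points)
    _, eigvecs = np.linalg.eigh(cov)   # eigenvectors, ascending eigenvalues
    return eigvecs.T                   # rows = principal directions

points = np.random.rand(10_000, 3)     # stand-in point cloud
axes = adaptive_projection_axes(points)
aligned = (points - points.mean(axis=0)) @ axes.T  # cloud in the adapted frame
# 'aligned' can now be projected onto the +/- x, y, z planes as in V-PCC.
```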
A Vision-based Compression and Dissemination Framework for 3D Tele-immersive System
2012
Abstract: A 3D Tele-Immersion (3DTI) system brings 3D data of people from geographically distributed locations into the same virtual space to enable interaction in 3D. One main obstacle in designing a 3DTI system is its high bandwidth requirement when disseminating the 3D data over the network. In this work, we present a novel compression and dissemination framework that reduces the bandwidth usage.
Improved video-based point cloud compression via segmentation
Research Square (Research Square), 2024
A point cloud is a representation of objects or scenes using unordered points comprising 3D positions and attributes. The ability of point clouds to mimic natural forms has gained significant attention from diverse applied fields, such as virtual reality and augmented reality. However, the point cloud, especially the dynamic one, must be compressed efficiently due to its huge data volume. The latest video-based point cloud compression (V-PCC) standard for dynamic point clouds divides the 3D point cloud into many patches using computationally expensive normal estimation, segmentation, and refinement. The patches are projected onto a 2D plane so that existing video coding techniques can be applied. This process often loses proximity information and some of the original points, inducing artefacts that adversely affect user perception. The proposed method segments dynamic point clouds based on shape similarity and occlusion before patch generation. This segmentation strategy helps maintain the proximity of the points and retain more of the original points by exploiting their density and occlusion. The experimental results establish that the proposed method significantly outperforms the V-PCC standard and other relevant methods in rate-distortion performance and subjective quality testing for both geometry and texture data of several benchmark video sequences.
Recent advances in computer vision have made realistic digital representations of 3D objects and environmental surroundings possible. This allows real-time and realistic physical-world interactions for users [1-3], enabling real-world objects, people, and settings to move dynamically by applying 3D point clouds [4-6]. A point cloud is a set of individual 3D points without any order or relationship among them in space. Each point has a geometric position and may include several other attributes such as transparency, reflectance, colour, and normal [7]. Dynamic point clouds are composed of a sequence of static three-dimensional point clouds, each representing a collection of sparsely sampled points taken from the continuous surfaces of objects and scenes. This unique structure serves as a powerful model for rendering realistic static and dynamic 3D objects [4, 8-10]. The versatility of dynamic point clouds finds application in a broad spectrum of practical domains, encompassing geographic information systems, cultural heritage preservation, immersive telepresence, telehealth, and enhanced accessibility for individuals with disabilities. Furthermore, dynamic point clouds contribute to cutting-edge technologies such as 3D telepresence, telecommunication, autonomous driving, gaming, robotics, virtual reality (VR), and augmented reality (AR) [2, 11]. Over the past decade, augmented and virtual reality have slowly entered popular discourse alongside the Metaverse concept. The metaverse is a virtual world that can create a network where anyone can interact through avatars; an avatar is a digital representation of a player and works as the identity of a natural physical person. If the metaverse could be seamlessly connected with physical environments in real time, it would transform our concept of reality [12]. Hence, the imperative lies in delivering a 3D virtual environment of the greatest quality, characterized by high resolution, minimal noise, and exceptional clarity, in order to achieve the highest degree of authenticity.
Nevertheless, creating such high-fidelity 3D content demands a substantial allocation of resources for storage, transmission, processing, and visualization [1, 2, 13]. Point clouds are categorized into three distinct groups, each with its designated standard and benchmark datasets to facilitate research comparisons. Category 1 pertains to static point clouds, exemplified by objects like statues and still scenes. Category 2 encompasses dynamic point clouds characterized by sequences involving human subjects. Category 3 is reserved for dynamically acquired point clouds, a prime example being LiDAR point clouds [6-9].
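The patch-generation step this abstract criticizes starts with a normal-based classification; the sketch below is a simplified version of V-PCC's initial segmentation, assuming per-point normals are already available (in practice they are estimated, e.g., from nearest-neighbour plane fits), and it omits the smoothing and refinement passes:

```python
import numpy as np

# The six oriented projection directions used by V-PCC's initial segmentation.
PLANE_NORMALS = np.array([
    [ 1, 0, 0], [-1, 0, 0],
    [ 0, 1, 0], [ 0, -1, 0],
    [ 0, 0, 1], [ 0, 0, -1],
], dtype=float)

def initial_segmentation(normals: np.ndarray) -> np.ndarray:
    """Assign each point to the projection plane whose normal it best matches.

    Simplified sketch: the real pipeline also smooths labels over neighbours
    before connected components become the 2D patches.
    """
    scores = normals @ PLANE_NORMALS.T   # (N, 6) dot products
    return scores.argmax(axis=1)         # plane index per point

normals = np.random.randn(1_000, 3)
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
labels = initial_segmentation(normals)
```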
Efficient dynamic point cloud coding using Slice-Wise Segmentation
Cornell University - arXiv, 2022
With the fast growth of immersive video sequences, achieving seamless and high-quality compressed 3D content is ever more critical. MPEG recently developed a video-based point cloud compression (V-PCC) standard for dynamic point cloud coding. However, point clouds reconstructed with V-PCC suffer from various artifacts, including data lost during pre-processing before existing video coding techniques, e.g., High Efficiency Video Coding (HEVC), are applied. Patch generation and self-occluded points in the 3D-to-2D projection are the main causes of missing data in V-PCC. This paper proposes a new method that introduces overlapping slicing as an alternative to patch generation, decreasing both the number of patches generated and the amount of data lost. In the proposed method, the entire point cloud is cross-sectioned into variable-sized slices based on the number of self-occluded points, so that data loss in patch generation and projection is minimized. To this end, a variable number of partially overlapping layers are used to retain the self-occluded points. An added advantage of the proposed method is that it reduces the bit requirement by encoding geometric data relative to the slicing base position. The experimental results show that the proposed method is much more flexible than standard V-PCC, improves the rate-distortion performance, and decreases the data loss significantly compared to the standard V-PCC method.
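A minimal sketch of the overlapping-slicing idea follows; the slice count and overlap width are illustrative parameters, and the fixed-width slices here do not model the paper's variable sizing by self-occluded point count:

```python
import numpy as np

def overlapping_slices(points: np.ndarray, axis: int = 1,
                       n_slices: int = 8, overlap: float = 0.05):
    """Cut a point cloud into slices along one axis, with overlap.

    Overlap lets points occluded in one slice's projection reappear in a
    neighbouring slice, which is how slicing can retain self-occluded points.
    """
    lo, hi = points[:, axis].min(), points[:, axis].max()
    edges = np.linspace(lo, hi, n_slices + 1)
    pad = overlap * (hi - lo)
    slices = []
    for a, b in zip(edges[:-1], edges[1:]):
        mask = (points[:, axis] >= a - pad) & (points[:, axis] <= b + pad)
        slices.append(points[mask])
    return slices

cloud = np.random.rand(50_000, 3)
parts = overlapping_slices(cloud)
```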
ArXiv, 2020
Rate-distortion optimization plays a very important role in image/video coding, but for 3D point clouds this problem has not been investigated. In this paper, the rate and distortion characteristics of 3D point clouds are investigated in detail, and a typical and challenging rate-distortion optimization problem is solved for 3D point clouds. Specifically, since the quality of the reconstructed 3D point cloud depends on both the geometry and color distortions, we first propose analytical rate and distortion models for the geometry and color information in the video-based 3D point cloud compression platform, and then solve the joint bit allocation problem for geometry and color based on the derived models. To maximize the reconstructed quality of the 3D point cloud, the bit allocation problem is formulated as a constrained optimization problem and solved by an interior point method. Experimental results show that the rate-distortion performance of the proposed solution is close to that obtained...
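The constrained formulation the abstract describes can be prototyped in a few lines. The power-law distortion models and their coefficients below are placeholder assumptions for illustration, not the paper's derived analytical models:

```python
import numpy as np
from scipy.optimize import minimize

# Assumed power-law distortion models D(R) = a * R**(-b) for geometry and
# colour; the coefficients are placeholders, not the paper's fitted values.
a_g, b_g = 10.0, 0.9
a_c, b_c = 6.0, 0.7
R_total = 8.0  # total rate budget (e.g., Mbit/s)

def distortion(r):
    rg, rc = r
    return a_g * rg**(-b_g) + a_c * rc**(-b_c)

res = minimize(
    distortion,
    x0=[R_total / 2, R_total / 2],
    method="trust-constr",                 # an interior-point-style solver
    bounds=[(1e-3, R_total), (1e-3, R_total)],
    constraints=[{"type": "ineq",
                  "fun": lambda r: R_total - r[0] - r[1]}],
)
print("geometry rate %.2f, colour rate %.2f" % tuple(res.x))
```

Because both model curves are convex and decreasing, the budget constraint is tight at the optimum and the solver splits the rate according to the relative steepness of the two distortion curves.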
Skin-off: Representation and Compression Scheme for 3D Video
3D video records dynamic 3D visual events as they are, and its application areas span a wide variety of human activities. To promote these applications in our everyday life, a standardized compression scheme for 3D video is required. In this paper, we propose a practical and effective scheme for representing and compressing 3D video named skin-off, in which both the geometric and visual information are efficiently represented by cutting a 3D mesh and mapping it onto a 2D array. Our skin-off scheme shares much with geometry videos, proposed by Hoppe et al. However, while geometry videos employ the 3D surface shape information alone to generate 2D images, the skin-off scheme employs both 3D shape and texture information to generate them. This enables higher image quality with limited bandwidth. Experimental results demonstrate the effectiveness of the skin-off scheme.
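The core data structure behind geometry videos, and hence skin-off, is a 2D array whose texels store 3D positions. A toy sketch, assuming the mesh has already been cut open and parameterized over the unit square (here a synthetic height field stands in for the real surface):

```python
import numpy as np

# Toy geometry image: each texel of an HxW grid stores an (x, y, z) sample of
# the cut-open surface. In skin-off, samples would come from the parameterized
# textured mesh rather than this synthetic sine surface.
W = H = 64
u, v = np.meshgrid(np.linspace(0, 1, W), np.linspace(0, 1, H))
geometry_image = np.stack([u, v, 0.1 * np.sin(8 * u) * np.cos(8 * v)], axis=-1)
assert geometry_image.shape == (H, W, 3)

# Consecutive frames of such arrays form a regular video that standard 2D
# codecs can compress; a parallel image carries the texture information.
```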
Dynamic Point Cloud Compression with Cross-Sectional Approach
2022
The recent development of dynamic point clouds has introduced the possibility of mimicking natural reality and enhancing quality of life. However, for successful broadcast, dynamic point clouds require stronger compression than traditional video due to their huge data volume. Recently, MPEG finalized a Video-based Point Cloud Compression standard known as V-PCC. However, V-PCC requires substantial computation time due to expensive normal estimation and segmentation, sacrifices some points to limit the number of 2D patches, and cannot fully occupy the 2D frame. The proposed method addresses these limitations using a novel cross-sectional approach, which avoids expensive normal estimation and segmentation, retains more points, and utilizes more of the 2D frame than V-PCC. Experimental results on standard video sequences show that the proposed technique achieves better compression of both geometric and texture data than the V-PCC standard.
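To show what projecting a cross-section into a 2D frame can look like, here is a minimal orthographic depth-map projection of one slice; the resolution and nearest-point rule are illustrative choices, not the paper's packing scheme:

```python
import numpy as np

def slice_to_depth_map(points: np.ndarray, res: int = 256) -> np.ndarray:
    """Orthographically project a slice onto the XY plane as a depth map.

    Illustrative only: where several points fall on one pixel, the nearest
    (smallest z) wins, which is one source of the self-occlusion losses that
    cross-sectional and slicing approaches try to reduce.
    """
    depth = np.full((res, res), np.inf)
    extent = np.ptp(points[:, :2], axis=0).clip(min=1e-9)
    xy = ((points[:, :2] - points[:, :2].min(axis=0)) /
          extent * (res - 1)).astype(int)
    for (x, y), z in zip(xy, points[:, 2]):
        depth[y, x] = min(depth[y, x], z)
    return depth

slice_points = np.random.rand(20_000, 3)
dmap = slice_to_depth_map(slice_points)
```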
Volumetric video compression for interactive playback
Computer Vision and Image Understanding, 2004
We develop a volumetric video system which supports interactive browsing of compressed time-varying volumetric features (significant isosurfaces and interval volumes). Since the size of even one volumetric frame in a time-varying 3D data set is very large, transmission and on-line reconstruction are the main bottlenecks for interactive remote visualization of time-varying volume and surface data. We describe a compression scheme for encoding time-varying volumetric features in a unified way, which allows for on-line reconstruction and rendering. To increase the run-time decompression speed and compression ratio, we decompose the volume into small blocks and encode only the significant blocks that contribute to the isosurfaces and interval volumes. The results show that our compression scheme achieves high compression ratio with fast reconstruction, which is effective for client-side rendering of time-varying volumetric features.
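The block-selection idea can be sketched in a few lines; the block size and the straddle test below are illustrative, and the paper's encoder additionally handles interval volumes and temporal coherence:

```python
import numpy as np

def significant_blocks(volume: np.ndarray, isovalue: float, block: int = 8):
    """Yield (index, data) for blocks that intersect the isosurface.

    A block contributes to the isosurface iff its values straddle the
    isovalue; all other blocks are skipped entirely, which is where the
    compression comes from in this sketch.
    """
    nz, ny, nx = volume.shape
    for k in range(0, nz, block):
        for j in range(0, ny, block):
            for i in range(0, nx, block):
                b = volume[k:k+block, j:j+block, i:i+block]
                if b.min() <= isovalue <= b.max():
                    yield (k, j, i), b

# Distance field of a sphere: only blocks crossing the radius-20 shell
# are significant for the isosurface at isovalue 20.
z, y, x = np.mgrid[:64, :64, :64]
vol = np.sqrt((x - 32)**2 + (y - 32)**2 + (z - 32)**2)
kept = list(significant_blocks(vol, isovalue=20.0))
```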