Zhanglin Cheng - Academia.edu
Papers by Zhanglin Cheng
arXiv: Computer Vision and Pattern Recognition, 2019
Most state-of-the-art semantic segmentation approaches only achieve high accuracy in good conditions. In practically-common but less-discussed adverse environmental conditions, their performance can decrease enormously. Existing studies usually cast the handling of segmentation in adverse conditions as a separate post-processing step after signal restoration, making the segmentation performance largely depend on the quality of restoration. In this paper, we propose a novel deep-learning framework to tackle semantic segmentation and image restoration in adverse environmental conditions in a holistic manner. The proposed approach contains two components: Semantically-Guided Adaptation, which exploits semantic information from degraded images to refine the segmentation; and Exemplar-Guided Synthesis, which restores images from semantic label maps given degraded exemplars as the guidance. Our method cooperatively leverages the complementarity and interdependence of low-level restoration...
2019 International Conference on Virtual Reality and Visualization (ICVRV), 2019
Conventional interactive 3D tree modeling systems are generally based on 2D input devices, and it is inconvenient to generate a desired 3D tree shape from 2D inputs due to the complexity of 3D tree structures. In this paper, we present a system for modeling trees interactively using a 3D gesture-based VR platform. The system contains a head-mounted display (HMD) and a 6-DOF motion controller for interaction. We propose an improved procedural modeling method to generate trees faster for the VR platform. Using the 6-DOF motion controller, users can manipulate tree structures through a set of 3D interactive operations, including geometric editing using 3D gestures, sketching, brushing and silhouette-guided growth. Our interactions are more flexible and convenient than those of traditional 2D input devices; for example, we allow the user to simultaneously rotate and translate parts of a tree using a 3D gesture.
ACM SIGGRAPH 2017 Posters, 2017
Plants are ubiquitous in nature, and realistic plant modeling plays an important role in a variety of applications. Over the last decades, an immense amount of effort has been dedicated to plant modeling. These approaches can be classified into two major categories: procedural modeling [Palubicki et al. 2009; Stava et al. 2014] and data-driven reconstruction approaches (e.g., photographs [Li et al. 2011; Tan et al. 2007] or scanned points [Livny et al. 2010; Xu et al. 2007]). Each approach has its own pros and cons. For example, procedural modeling approaches work well for synthesizing local branch structure details to produce botanically correct trees, but they lack the ability to control the growth of trees under certain shape constraints. Data-driven approaches, by contrast, might precisely reconstruct skeletal structures, but the botanical fidelity of the trees is difficult to maintain.
IEEE Transactions on Geoscience and Remote Sensing
Automatic registration of point clouds captured by terrestrial laser scanning (TLS) plays an important role in many fields including remote sensing (e.g., transportation management, 3D reconstruction in large-scale urban areas and environment monitoring), computer vision, virtual reality and robotics, among others. However, noise, outliers, non-uniform point density and small overlaps are inevitable when collecting multiple views of data, which pose great challenges to 3D registration of point clouds. Since conventional registration methods aim to find point correspondences and estimate transformation parameters directly in the original point space, the traditional way to address these difficulties is to introduce many restrictions during the scanning process (e.g., more scanning and careful selection of scanning positions), thus making the data acquisition more difficult. In this paper, we present a novel 3D registration framework that operates in a "middle-level structural space" and is capable of robustly and efficiently reconstructing urban, semi-urban and indoor scenes, despite disturbances introduced in the scanning process. The new structural space is constructed by extracting multiple types of middle-level geometric primitives (planes, spheres, cylinders, and cones) from the 3D point cloud. We design a robust method to find effective primitive combinations corresponding to the 6D poses of the raw point clouds and then construct hybrid-structure-based descriptors. By matching descriptors and computing rotation and translation parameters, successful registration is achieved. Note that the whole process of our method is performed in the structural space, which has the advantages of capturing geometric structures (the relationship between primitives) and semantic features (primitive types and parameters) in larger fields. Experiments show that our method achieves state-of-the-art performance on several point cloud registration benchmark datasets at different scales and even obtains good registration results for data without overlapping areas.
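The abstract above ends its pipeline with "computing rotation and translation parameters" from matched descriptors, but does not say how. As a minimal sketch of that final step only, the Python below assumes the matches have already been reduced to paired 3D points and uses the standard SVD-based (Kabsch) least-squares solution. This is a swapped-in textbook technique, not the paper's primitive-based method; the function name `estimate_rigid_transform` and the point-pair input are assumptions made for illustration.

```python
import numpy as np


def estimate_rigid_transform(src, dst):
    """Least-squares rotation R and translation t with dst ~= src @ R.T + t.

    Standard Kabsch / SVD solution on paired 3D points (an illustrative
    assumption; the paper matches primitive-based descriptors instead).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_c = src.mean(axis=0)  # centroid of the source points
    dst_c = dst.mean(axis=0)  # centroid of the target points
    # 3x3 cross-covariance of the centered point sets
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t
```

Given at least three non-collinear correspondences, `R, t = estimate_rigid_transform(src, dst)` returns the transform that maps `src` onto `dst` in the least-squares sense; outlier rejection is deliberately left out of this sketch.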
The 14th International Symposium on Visual Information Communication and Interaction
Accurate segmentation of entity categories is a critical step for 3D scene understanding. This paper presents a fast deep neural network model with a Dense Conditional Random Field (DCRF) as a post-processing method, which can perform accurate semantic segmentation of 3D point cloud scenes. On this basis, a compact but flexible framework is introduced that segments the semantics of point clouds concurrently, contributing to more precise segmentation. Moreover, based on semantic labels, a novel DCRF model is elaborated to refine the segmentation result. In addition, without any sacrifice in accuracy, we apply optimization to the original point cloud data, allowing the network to handle less data. In the experiments, our proposed method is evaluated comprehensively using four evaluation indicators, which demonstrates its superiority.
Most state-of-the-art semantic segmentation or scene parsing approaches only achieve high accuracy rates in good environmental conditions. Performance decreases enormously when images contain unknown disturbances, a situation that is less discussed but more common in real applications. Most existing research casts the handling of challenging adverse conditions as a post-processing step of signal restoration or enhancement after sensing, and then feeds the restored images to visual understanding. However, the performance will largely depend on the quality of restoration or enhancement. Whether restoration-based approaches would actually boost the semantic segmentation performance remains questionable. In this paper, we propose a novel network framework to tackle semantic Segmentation and image Restoration in adverse environmental conditions (SR-Net). The proposed approach contains two components: Semantically-Guided Adaptation, which exploits and leverages semantic information from degrad...
Revista de la Real Academia de Ciencias Exactas, Físicas y Naturales. Serie A. Matemáticas
Curvatures are basic shape information of a surface and are useful for 3D geometry analysis and 3D reconstruction. This paper presents a new robust algorithm for estimating principal curvatures. The basic idea of the method is to locally fit the properties of each normal section circle using the position and normal information at a neighboring point. In addition, a local feature curve, called Normal Curvature Index Lines (NCIL), is constructed to show the fitting effect of all curvature estimation methods. With this accurate and robust curvature, shape information of a point cloud surface can be obtained, such as saddle regions, sharp ridge regions and principal directions. Experimental results show that this work is more advantageous than similar approaches.
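To make the "normal section circle" idea concrete, here is a minimal Python sketch of the classic one-neighbor estimate: the circle that passes through a point p and a neighbor q and is tangent to the surface at p has a curvature given by a closed-form expression in the normal at p and the offset q - p. This is the textbook circle fit only, not the paper's algorithm, which also uses the normal at the neighbor. The function name `normal_curvature` and the sign convention (positive when the surface bends away from the normal) are assumptions for illustration.

```python
import math


def normal_curvature(p, n, q):
    """Curvature of the circle through p and q that is tangent to the surface at p.

    p, q: 3D points; n: surface normal at p (need not be unit length).
    Sign convention (an assumption): positive when the surface bends away
    from n, e.g. a sphere with outward normals gives +1/radius.
    """
    d = [qi - pi for pi, qi in zip(p, q)]  # offset from p to its neighbor q
    dist2 = sum(c * c for c in d)
    if dist2 == 0.0:
        raise ValueError("p and q must be distinct points")
    n_len = math.sqrt(sum(c * c for c in n))
    # Component of the offset along the unit normal
    n_dot_d = sum(ni * di for ni, di in zip(n, d)) / n_len
    return -2.0 * n_dot_d / dist2
```

On a sphere of radius r with outward normals this expression returns exactly 1/r for any neighbor, and on a plane it returns 0. With noisy data, estimates from several neighbors in different tangent directions would have to be combined before principal curvatures can be read off.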
2021 IEEE Virtual Reality and 3D User Interfaces (VR)
2021 IEEE International Symposium on Mixed and Augmented Reality (ISMAR)
Proceedings of the 12th International Symposium on Visual Information Communication and Interaction
IEEE Transactions on Visualization and Computer Graphics
Computational Visual Media
A discriminative local shape descriptor plays an important role in various applications. In this paper, we present a novel deep learning framework that derives discriminative local descriptors for deformable 3D shapes. We use local "geometry images" to encode the multi-scale local features of a point, via an intrinsic parameterization method based on geodesic polar coordinates. This new parameterization provides robust geometry images even for badly-shaped triangular meshes. Then a triplet network with shared architecture and parameters is used to perform deep metric learning; its aim is to distinguish between similar and dissimilar pairs of points. Additionally, a newly designed triplet loss function is minimized for improved, accurate training of the triplet network. To solve the dense correspondence problem, an efficient sampling approach is utilized to achieve a good compromise between training performance and descriptor quality. During testing, given a geometry image of a point of interest, our network outputs a discriminative local descriptor for it. Extensive testing of non-rigid dense shape matching on a variety of benchmarks demonstrates the superiority of the proposed descriptors over the state-of-the-art alternatives.
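The abstract does not give the form of its newly designed triplet loss. For orientation, the Python sketch below shows the standard margin-based triplet loss that such designs typically build on: it is zero once the anchor is closer to the positive than to the negative by at least a margin. The function names and the default `margin=1.0` are assumptions for illustration, not details taken from the paper.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    This is the common baseline form, not the paper's modified loss.
    """
    d_pos = euclidean(anchor, positive)  # distance to the similar point
    d_neg = euclidean(anchor, negative)  # distance to the dissimilar point
    return max(0.0, d_pos - d_neg + margin)
```

For example, with anchor `[0, 0]`, positive `[0, 1]` and negative `[3, 4]`, the distances are 1 and 5, so the triplet already satisfies the margin and contributes no loss; swapping positive and negative yields a loss of 5.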
Computer Animation and Virtual Worlds
Sustainability
Educational institutions demand cost-effective and simple-to-use augmented reality systems. ARToolKit, an open-source computer tracking library for the creation of augmented reality applications that overlay virtual imagery on the real world, is such a system. It uses a simple camera and black-and-white markers printed on paper. However, because of inter-marker confusion, markers are often misrecognized unless they are sufficiently distinct from one another. This paper presents an ARToolKit-based Interactive Writing Board (IWB) with a simple mechanism for designing confusion-free marker libraries. The board is used for teaching single characters of Arabic/Urdu to primary-level students. It uses a simple ARToolKit marker for the recognition of each character. After marker recognition, the IWB displays the corresponding image, helping students with character understanding and pronunciation. Experimental results reveal that the system improves students’ motivation and learning skills.