Deep FusionNet for Point Cloud Semantic Segmentation

Evaluation of Class Distribution and Class Combinations on Semantic Segmentation of 3D Point Clouds With PointNet

IEEE Access

Point clouds are generated by light detection and ranging (LiDAR) scanners or depth imaging cameras, which capture the geometry of scanned objects with high accuracy. Unfortunately, these systems are unable to identify the semantics of the objects. Semantic 3D point clouds are an important basis for modeling the real world in digital applications, but manual semantic segmentation is a labor- and cost-intensive task. Automating semantic segmentation with machine learning and deep learning (DL) approaches is therefore an interesting subject of research. In particular, point-based network architectures such as PointNet have produced useful semantic segmentation results in individual applications. Applying DL methods requires determining a large number of hyperparameters (HPs), and these HPs influence training success. In our work, the investigated HPs are the class distribution and the class combination. By means of seven class combinations following a hierarchical scheme and four methods for adapting the class sizes, these HPs are investigated in a detailed and structured manner. The investigated settings show increased semantic segmentation performance, e.g., a 31% increase in recall for the class Erroneous points, or a recall above 50% for all classes. However, based on our results, correctly setting only these HPs does not lead to a simple, universal, and practical semantic segmentation procedure.

INDEX TERMS: 3D point clouds, data hyperparameter, hierarchical class combination, hyperparameter, PointNet, semantic classes, semantic segmentation, unbalanced data.

FIGURE 6. Point cloud dataset from the main building of HafenCity University Hamburg (entrance level). FIGURE 7. Point cloud dataset from the main building of HafenCity University Hamburg (office level). FIGURE 8. Point cloud dataset from the main building of HafenCity University Hamburg (lecture hall level).
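One of the abstract's four class-size adaptation methods can be illustrated by the simplest strategy, random undersampling to the smallest class. The function name and interface below are hypothetical, not taken from the paper; this is only a minimal sketch of the general idea.

```python
import numpy as np

def undersample_classes(points, labels, rng=None):
    """Balance class sizes by randomly undersampling every class
    down to the size of the smallest one (one possible class-size
    adaptation strategy; illustrative, not the paper's method)."""
    rng = np.random.default_rng(rng)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.min()
    keep = []
    for c in classes:
        idx = np.where(labels == c)[0]
        keep.append(rng.choice(idx, size=target, replace=False))
    keep = np.concatenate(keep)
    return points[keep], labels[keep]
```

Undersampling discards data from majority classes; alternatives such as oversampling or class-weighted losses trade that loss of data against other biases.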

PointResNet: Residual Network for 3D Point Cloud Segmentation and Classification

Cornell University - arXiv, 2022

Point cloud segmentation and classification are among the primary tasks in 3D computer vision, with applications ranging from augmented reality to robotics. However, processing point clouds with deep learning-based algorithms is quite challenging due to their irregular format. Voxelization and 3D grid-based representations are common ways of applying deep neural networks to this problem. In this paper, we propose PointResNet, a residual block-based approach. Our model processes the 3D points directly, using a deep neural network for the segmentation and classification tasks. The main components of the architecture are: 1) residual blocks and 2) multi-layer perceptrons (MLPs). We show that it preserves deep features and structural information, which are useful for segmentation and classification tasks. The experimental evaluations demonstrate that the proposed model produces the best results for segmentation and comparable results for classification in comparison to the conventional baselines.
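The "residual blocks + MLP" combination can be sketched framework-agnostically as a per-point shared MLP with a skip connection. The function and weight shapes below are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_mlp_block(feats, w1, w2):
    """Per-point residual MLP block: y = ReLU(x + W2 . ReLU(W1 . x)).
    The same weights are shared across all points, so the block is
    order-invariant per point.
    feats: (n_points, channels); w1, w2: (channels, channels)."""
    hidden = relu(feats @ w1)        # shared MLP applied to each point
    return relu(feats + hidden @ w2) # skip connection preserves input features
```

The skip connection is what lets such blocks be stacked deeply without losing the low-level structural information the abstract mentions.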

DPRNet: Deep 3D Point based Residual Network for Semantic Segmentation and Classification of 3D Point Clouds

Point clouds are an important type of geometric data obtained from a variety of 3D sensors. They have no explicit neighborhood structure, so researchers often perform a voxelization step to obtain a structured 3D neighborhood. This, however, comes with certain disadvantages: it makes the data unnecessarily voluminous, requires additional computation, and can introduce quantization errors that hinder both the extraction of implicit 3D shape information and the capture of the data invariances essential for the segmentation and recognition task. In this context, this paper addresses the highly challenging problem of semantic segmentation and 3D object recognition from raw, unstructured 3D point cloud data. Specifically, a deep network architecture is proposed that consists of a cascaded combination of 3D point-based residual networks for simultaneous semantic scene segmentation and object classification. It exploits 3D point-based convolutions for representation learning from raw unstructured 3D point cloud data. The proposed architecture has a simple design, an easier implementation, and better performance than existing state-of-the-art architectures, particularly for semantic scene segmentation, over three public datasets. The implementation and evaluation are made public at https://github.com/saira05/DPRNet.

Investigation of Pointnet for Semantic Segmentation of Large-Scale Outdoor Point Clouds

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 2021

Abstract. Semantic segmentation of point clouds is indispensable for 3D scene understanding. Point clouds reliably capture the geometry of objects, including shape, size, and orientation. Deep learning (DL) has been recognized as the most successful approach to image semantic segmentation. Applied to point clouds, the performance of many DL algorithms degrades, because point clouds are often sparse and irregularly formatted. As a result, point clouds are regularly first transformed into voxel grids or image collections. PointNet was the first promising algorithm to feed point clouds directly into a DL architecture. Although PointNet achieved remarkable performance on indoor point clouds, its performance has not been extensively studied on large-scale outdoor point clouds. To the best of our knowledge, no study on large-scale aerial point clouds has investigated the sensitivity of the hyper-parameters used in PointNet. This paper evaluates PointNet’s performance for semantic s...

Deep Learning for Semantic Segmentation of 3D Point Cloud

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 2019

Cultural Heritage is a testimony of past human activity and, as such, its objects exhibit great variety in nature, size, and complexity: from small artefacts and museum items to cultural landscapes, from historical buildings and ancient monuments to city centers and archaeological sites. Cultural Heritage around the globe suffers from wars, natural disasters, and human negligence. The importance of digital documentation is well recognized, and there is increasing pressure to document our heritage both nationally and internationally. For this reason, three-dimensional scanning and modeling of cultural heritage sites and artifacts have increased remarkably in recent years. Semantic segmentation of point clouds is an essential step of the entire pipeline; in fact, it allows complex architectures to be decomposed into single elements, which are then enriched with meaningful information within Building Information Modelling software. Notwithstanding, this step is very time-consuming and entrusted entirely to the manual work of domain experts, far from being automated. This work describes a method to automatically label and cluster a point cloud using a supervised Deep Learning approach, based on a state-of-the-art Neural Network called PointNet++. Although other methods are known, we chose PointNet++ because it has achieved significant results in classifying and segmenting 3D point clouds. PointNet++ has been tested and improved by training the network with annotated point clouds from a real survey and evaluating how performance changes according to the input training data. This can be of great interest to the research community dealing with point cloud semantic segmentation, since it makes public a labelled dataset of CH elements for further tests.

A Prior Level Fusion Approach for the Semantic Segmentation of 3D Point Clouds Using Deep Learning

Remote Sensing

Three-dimensional digital models play a pivotal role in city planning, monitoring, and sustainable management of smart and Digital Twin Cities (DTCs). In this context, semantic segmentation of airborne 3D point clouds is crucial for modeling, simulating, and understanding large-scale urban environments. Previous research studies have demonstrated that the performance of 3D semantic segmentation can be improved by fusing 3D point clouds with other data sources. In this paper, a new prior-level fusion approach is proposed for semantic segmentation of large-scale urban areas using optical images and point clouds. The proposed approach uses an image classification obtained with the Maximum Likelihood Classifier as prior knowledge for 3D semantic segmentation. Afterwards, the raster values from the classified images are assigned to the LiDAR point cloud during data preparation. Finally, an advanced Deep Learning model (RandLaNet) is adopted to perform the 3D semantic segmentation. The result...
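The data-preparation step described above, attaching the class value of the underlying raster cell to each point, can be sketched as a simple grid lookup. The function name, the lower-left raster origin, and the row orientation are assumptions for illustration, not details from the paper.

```python
import numpy as np

def assign_raster_class(points_xy, raster, origin, cell_size):
    """Attach the class value of the underlying raster cell to each
    LiDAR point (prior-level fusion at data preparation).
    points_xy: (n, 2) planimetric coordinates;
    raster: (rows, cols) class map, row 0 assumed at the origin's y;
    origin: (x0, y0) of the raster's lower-left corner."""
    cols = ((points_xy[:, 0] - origin[0]) // cell_size).astype(int)
    rows = ((points_xy[:, 1] - origin[1]) // cell_size).astype(int)
    # clamp points that fall just outside the raster extent
    cols = np.clip(cols, 0, raster.shape[1] - 1)
    rows = np.clip(rows, 0, raster.shape[0] - 1)
    return raster[rows, cols]
```

The returned per-point class labels can then be appended as an extra input feature for the downstream 3D network.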

RIU-Net: Embarrassingly simple semantic segmentation of 3D LiDAR point cloud

2019

This paper proposes RIU-Net (for Range-Image U-Net), the adaptation of a popular semantic segmentation network to the semantic segmentation of a 3D LiDAR point cloud. The point cloud is turned into a 2D range-image by exploiting the topology of the sensor. This image is then used as input to a U-Net, an architecture that has already proved its efficiency for the semantic segmentation of medical images. We demonstrate how it can also be used for accurate semantic segmentation of a 3D LiDAR point cloud and how it represents a valid bridge between image processing and 3D point cloud processing. Our model is trained on range-images built from the KITTI 3D object detection dataset. Experiments show that RIU-Net, despite being very simple, offers results comparable to the state of the art among range-image based methods. Finally, we demonstrate that this architecture is able to operate at 90 fps on a single GPU, which enables deployment for real-time segmentation.
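The range-image construction that exploits the sensor topology is a standard spherical projection: each point's azimuth maps to a column and its elevation to a row. The sketch below is a minimal version of that projection; the field-of-view defaults are illustrative values for a Velodyne-like scanner, not taken from the paper.

```python
import numpy as np

def to_range_image(points, h=64, w=1024, fov_up=3.0, fov_down=-25.0):
    """Project a LiDAR point cloud (n, 3) onto a 2D range image using
    the sensor's spherical topology (elevation x azimuth)."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points[:, :3], axis=1)
    yaw = np.arctan2(y, x)                               # azimuth in [-pi, pi]
    pitch = np.arcsin(np.clip(z / np.maximum(r, 1e-8), -1.0, 1.0))
    fov_up_r, fov_down_r = np.radians(fov_up), np.radians(fov_down)
    u = 0.5 * (1.0 - yaw / np.pi) * w                    # column index
    v = (1.0 - (pitch - fov_down_r) / (fov_up_r - fov_down_r)) * h
    u = np.clip(u, 0, w - 1).astype(int)
    v = np.clip(v, 0, h - 1).astype(int)
    img = np.zeros((h, w), dtype=np.float32)
    img[v, u] = r                                        # last point per pixel wins
    return img
```

In practice the range channel is usually stacked with x, y, z, and intensity channels before being fed to the U-Net.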

PIG-Net: Inception based deep learning architecture for 3D point cloud segmentation

Computers & Graphics, 2021

Point clouds, being a simple and compact representation of the surface geometry of 3D objects, have gained increasing popularity with the evolution of deep learning networks for classification and segmentation tasks. Unlike for humans, teaching a machine to analyze the segments of an object is a challenging task, yet quite essential in various machine vision applications. In this paper, we address the problem of segmentation and labelling of 3D point clouds by proposing an inception-based deep network architecture called PIG-Net that effectively characterizes the local and global geometric details of the point clouds. In PIG-Net, local features are extracted from the transformed input points using the proposed inception layers and then aligned by a feature transform. These local features are aggregated using a global average pooling layer to obtain the global features. Finally, the concatenated local and global features are fed to convolution layers for segmenting the 3D point clouds. We perform an exhaustive experimental analysis of the PIG-Net architecture on two state-of-the-art datasets, namely ShapeNet [1] and PartNet [2], and evaluate the effectiveness of our network through an ablation study.
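The "global average pooling, then concatenate with local features" step common to PIG-Net and PointNet-style segmentation heads can be sketched in a few lines. The function below is an illustrative assumption about that pattern, not PIG-Net's exact layer.

```python
import numpy as np

def concat_local_global(local_feats):
    """Aggregate per-point local features with global average pooling,
    then concatenate the global vector back onto every point so each
    point sees both its local detail and the whole-shape context.
    local_feats: (n_points, channels) -> (n_points, 2 * channels)."""
    global_feat = local_feats.mean(axis=0)             # (channels,)
    tiled = np.broadcast_to(global_feat, local_feats.shape)
    return np.concatenate([local_feats, tiled], axis=1)
```

Because the pooled vector is identical for every point, the concatenation is what lets a per-point classifier use global shape context when assigning part labels.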

VV-Net: Voxel VAE Net With Group Convolutions for Point Cloud Segmentation

2019 IEEE/CVF International Conference on Computer Vision (ICCV)

We present a novel algorithm for point cloud segmentation. Our approach transforms unstructured point clouds into regular voxel grids, and further uses a kernel-based interpolated variational autoencoder (VAE) architecture to encode the local geometry within each voxel. Traditionally, the voxel representation only comprises Boolean occupancy information, which fails to capture the sparsely distributed points within voxels in a compact manner. In order to handle sparse distributions of points, we further employ radial basis functions (RBF) to compute a local, continuous representation within each voxel. Our approach results in a good volumetric representation that effectively tackles noisy point cloud datasets and is more robust for learning. Moreover, we further introduce group equivariant CNN to 3D, by defining the convolution operator on a symmetry group acting on Z^3 and its isomorphic sets. This improves the expressive capacity without increasing parameters, leading to more robust segmentation results. We highlight the performance on standard benchmarks and show that our approach outperforms state-of-the-art segmentation algorithms on the ShapeNet and S3DIS datasets.
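The move from Boolean occupancy to a continuous local representation can be illustrated with a Gaussian RBF: instead of a single 0/1 flag per voxel, the points inside a voxel are evaluated against a fixed set of sample centers. This is only a sketch of the RBF idea under assumed kernel and parameter choices, not the paper's exact formulation.

```python
import numpy as np

def rbf_voxel_encoding(points, centers, gamma=10.0):
    """Continuous encoding of the points inside one voxel: sum of
    Gaussian RBF responses at fixed sample centers, replacing the
    Boolean occupancy flag with a smooth descriptor.
    points: (n_points, 3); centers: (n_centers, 3) -> (n_centers,)."""
    # pairwise squared distances between centers and points
    d2 = ((centers[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2).sum(axis=1)
```

The resulting fixed-length vector per voxel is what a downstream encoder (the VAE in VV-Net) can consume regardless of how many raw points the voxel contains.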

Voxel-Based 3D Point Cloud Semantic Segmentation: Unsupervised Geometric and Relationship Featuring vs Deep Learning Methods

ISPRS International Journal of Geo-Information, 2019

Automation in point cloud data processing is central to knowledge discovery within decision-making systems. The definition of relevant features is often key for segmentation and classification, with automated workflows presenting the main challenges. In this paper, we propose a voxel-based feature engineering approach that better characterizes point clusters and provides strong support for supervised or unsupervised classification. We provide different feature generalization levels to permit interoperable frameworks. First, we recommend a shape-based feature set (SF1) that only leverages the raw X, Y, Z attributes of any point cloud. Afterwards, we derive relationship and topology between voxel entities to obtain a three-dimensional (3D) structural connectivity feature set (SF2). Finally, we provide a knowledge-based decision tree to permit infrastructure-related classification. We study the SF1/SF2 synergy in a new semantic segmentation framework for the constitution of a higher semantic representation of point clouds in relevant clusters. We then benchmark the approach against novel and best-performing deep-learning methods on the full S3DIS dataset. We highlight good performance, easy integration, and high F1-scores (> 85%) for planar-dominant classes, comparable to state-of-the-art deep learning.
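A classic example of shape-based features computed from raw X, Y, Z alone, the kind of descriptor a set like SF1 can build on, are covariance-eigenvalue features. The function below is a generic sketch of that family, not the paper's SF1 definition.

```python
import numpy as np

def shape_features(xyz):
    """Covariance-eigenvalue shape descriptors for a point cluster,
    computed from raw X, Y, Z only.
    xyz: (n_points, 3) -> dict of dimensionless shape measures."""
    cov = np.cov(xyz.T)                               # 3x3 covariance matrix
    evals = np.sort(np.linalg.eigvalsh(cov))[::-1]    # l1 >= l2 >= l3
    l1, l2, l3 = np.maximum(evals, 1e-12)             # guard against zeros
    return {
        "linearity": (l1 - l2) / l1,   # high for wire-like clusters
        "planarity": (l2 - l3) / l1,   # high for wall/floor-like clusters
        "sphericity": l3 / l1,         # high for volumetric clusters
    }
```

Such features explain why planar-dominant classes (floors, walls, ceilings) are the ones where hand-crafted voxel features remain competitive with deep learning.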