Deconvolutional networks for point-cloud vehicle detection and tracking in driving scenarios
Related papers
Dual-Branch CNNs for Vehicle Detection and Tracking on LiDAR Data
IEEE Transactions on Intelligent Transportation Systems, 2020
We present a novel vehicle detection and tracking system that works solely on 3D LiDAR information. Our approach segments vehicles using a dual-view representation of the 3D LiDAR point cloud on two independently trained convolutional neural networks, one for each view. A bounding-box growing algorithm is applied to the fused output of the networks to properly enclose the segmented vehicles. Bounding boxes are grown using a probabilistic method that also takes occluded areas into account. The final vehicle bounding boxes act as observations for a multi-hypothesis tracking system, which estimates the position and velocity of the observed vehicles. We thoroughly evaluate our system on the KITTI benchmarks for both detection and tracking, and show that our dual-branch classifier consistently outperforms previous single-branch approaches, improving on or directly competing with other state-of-the-art LiDAR-based methods.
Index Terms: Deep convolutional neural network, vehicle detection and tracking, LiDAR, point cloud.
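The abstract does not specify which two views the paper uses, but a common dual-view choice for LiDAR is a bird's-eye-view occupancy grid plus a spherical front (range) view. The sketch below shows how a raw point cloud could be rasterized into both; all ranges, resolutions, and function names are illustrative assumptions, not the paper's exact representation.

```python
import numpy as np

def bev_grid(points, x_range=(0.0, 60.0), y_range=(-30.0, 30.0), cell=0.1):
    """Rasterize LiDAR points (N, 3) into a bird's-eye-view occupancy grid."""
    x, y = points[:, 0], points[:, 1]
    mask = (x >= x_range[0]) & (x < x_range[1]) & (y >= y_range[0]) & (y < y_range[1])
    x, y = x[mask], y[mask]
    rows = ((x - x_range[0]) / cell).astype(int)
    cols = ((y - y_range[0]) / cell).astype(int)
    grid = np.zeros((int((x_range[1] - x_range[0]) / cell),
                     int((y_range[1] - y_range[0]) / cell)), dtype=np.float32)
    grid[rows, cols] = 1.0  # mark occupied cells
    return grid

def range_view(points, h_res=np.radians(0.2), v_res=np.radians(0.4),
               v_fov=(np.radians(-24.9), np.radians(2.0))):
    """Project points into a spherical front-view (range image) grid."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.sqrt(x**2 + y**2 + z**2)
    yaw = np.arctan2(y, x)
    pitch = np.arcsin(z / np.maximum(r, 1e-6))
    cols = ((yaw + np.pi) / h_res).astype(int)
    rows = ((v_fov[1] - pitch) / v_res).astype(int)
    h = int((v_fov[1] - v_fov[0]) / v_res) + 1
    w = int(2 * np.pi / h_res) + 1
    img = np.zeros((h, w), dtype=np.float32)
    valid = (rows >= 0) & (rows < h) & (cols >= 0) & (cols < w)
    img[rows[valid], cols[valid]] = r[valid]  # store range per pixel
    return img
```

Each view would then be fed to its own CNN, with the two segmentation outputs fused before bounding-box growing.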
Vehicle Detection and Localization using 3D LIDAR Point Cloud and Image Semantic Segmentation
2018 21st International Conference on Intelligent Transportation Systems (ITSC), 2018
This paper presents a real-time approach to detect and localize surrounding vehicles in urban driving scenes. We propose a multimodal fusion framework that processes both a 3D LIDAR point cloud and an RGB image to obtain robust vehicle position and size in a Bird's Eye View (BEV). Semantic segmentation of the RGB images is obtained using our efficient Convolutional Neural Network (CNN) architecture called ERFNet. Our proposal takes advantage of the accurate depth information provided by LIDAR and the detailed semantic information extracted from the camera. The method has been tested on the KITTI object detection benchmark. Experiments show that our approach outperforms or is on par with other state-of-the-art proposals even though our CNN was trained on a different dataset, showing good generalization across domains, a key point for autonomous driving.
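A typical way to combine image semantics with LiDAR depth, which this abstract implies, is to project each LiDAR point into the camera image and pick up the semantic class of the pixel it lands on. The sketch below is a minimal version of that transfer step; the function name and the calibration conventions (intrinsics `K`, extrinsics `T_cam_from_lidar`) are assumptions for illustration.

```python
import numpy as np

def label_points_with_semantics(points, seg_map, K, T_cam_from_lidar):
    """Transfer per-pixel semantic labels to LiDAR points via projection.

    points: (N, 3) in the LiDAR frame; seg_map: (H, W) integer class map;
    K: (3, 3) camera intrinsics; T_cam_from_lidar: (4, 4) extrinsics.
    """
    hom = np.hstack([points, np.ones((points.shape[0], 1))])  # (N, 4) homogeneous
    cam = (T_cam_from_lidar @ hom.T).T[:, :3]                 # points in camera frame
    in_front = cam[:, 2] > 0.1                                # keep points ahead of camera
    uvw = (K @ cam[in_front].T).T
    u = (uvw[:, 0] / uvw[:, 2]).astype(int)
    v = (uvw[:, 1] / uvw[:, 2]).astype(int)
    h, w = seg_map.shape
    on_img = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    labels = np.full(points.shape[0], -1, dtype=int)          # -1 = no label
    idx = np.flatnonzero(in_front)[on_img]
    labels[idx] = seg_map[v[on_img], u[on_img]]
    return labels
```

Points labeled as vehicle could then be accumulated in a BEV grid to estimate each vehicle's position and size.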
2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
While 3D LiDAR has become common practice in more and more autonomous driving systems, precise 3D mapping and robust localization are of great importance. However, current 3D maps are often noisy and unreliable due to the presence of moving objects, which degrades localization. In this paper, we propose a general vehicle-free point-cloud mapping framework for better on-vehicle localization. For each laser scan, vehicle points are detected, tracked, and then removed. Simultaneously, the 3D map is reconstructed by registering each vehicle-free laser scan to a global coordinate frame based on GPS/INS data. Instead of detecting 3D objects directly from the point cloud, we first detect vehicles in RGB images using the proposed YVDN (YOLOv2 Vehicle Detection Network). To handle false or missed detections, which could leave vehicle points in the map, we propose a K-Frames forward-backward object tracking algorithm that links detections across neighboring images. Laser-scan points falling into the detected bounding boxes are then removed. We conduct our experiments on the Oxford RobotCar Dataset [1] and show qualitative results that validate the feasibility of our vehicle-free 3D mapping system. Moreover, our vehicle-free mapping system can be generalized to any autonomous driving system equipped with LiDAR, a camera, and/or GPS.
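The point-removal step described here (dropping scan points that project into detected 2D vehicle boxes) can be sketched concisely; the function name and calibration conventions below are illustrative assumptions, not the paper's code.

```python
import numpy as np

def remove_vehicle_points(points, boxes, K, T_cam_from_lidar):
    """Drop LiDAR points whose image projection falls inside any detected box.

    points: (N, 3) in the LiDAR frame; boxes: iterable of
    (u_min, v_min, u_max, v_max) pixel boxes from the 2D vehicle detector.
    """
    hom = np.hstack([points, np.ones((points.shape[0], 1))])
    cam = (T_cam_from_lidar @ hom.T).T[:, :3]   # points in the camera frame
    keep = np.ones(points.shape[0], dtype=bool)
    in_front = cam[:, 2] > 0.1                  # only points visible to the camera
    uvw = (K @ cam[in_front].T).T
    u = uvw[:, 0] / uvw[:, 2]
    v = uvw[:, 1] / uvw[:, 2]
    idx = np.flatnonzero(in_front)
    for (u0, v0, u1, v1) in boxes:
        inside = (u >= u0) & (u <= u1) & (v >= v0) & (v <= v1)
        keep[idx[inside]] = False               # mark vehicle points for removal
    return points[keep]
```

The surviving, vehicle-free scans are then registered into the global map using the GPS/INS poses.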
The rapid development of deep learning has brought novel methodologies for 3D object detection using LiDAR sensing technology. Improvements in precision and inference speed have led to notably high performance and real-time inference, which is especially important for self-driving purposes. However, the pace of these developments complicates research in this area, since new methods, technologies, and software versions lead to different project necessities, specifications, and requirements. Moreover, the improvements brought by new methods may be due to newer versions of deep learning frameworks rather than the novelty and innovation of the model architecture alone. Thus, it becomes crucial to create a framework with the same software versions, specifications, and requirements that accommodates all these methodologies and allows the easy introduction of new methods and models. A framework is proposed that abstracts the implementation, r...
Advances in Intelligent Systems and Computing
Self-driving cars (or autonomous cars) can sense and navigate through an environment without any driver intervention. To achieve this, they rely on vision sensors working in tandem with accurate algorithms to detect movable and non-movable objects around them. These vision sensors typically include cameras to identify static and non-static objects, Radio Detection and Ranging (RADAR) to measure the speed of moving objects via the Doppler effect, and Light Detection and Ranging (LiDAR) to measure distances to objects. In this paper, we explore a new use of LiDAR data to classify static objects on the road. We present a pipeline to classify point-cloud data grouped into volumetric pixels (voxels), and introduce a novel approach to point-cloud representation for processing within Convolutional Neural Networks (CNNs). Results show an accuracy exceeding 90% in the detection and classification of road edges, solid and broken lane markings, bike lanes, and lane center lines. Our data pipeline can process up to 20,000 points per 900 ms on a server equipped with two 8-core Intel Xeon processors with Hyper-Threading (32 threads in total) and two NVIDIA Tesla K40 GPUs. Our model outperforms ResNet applied to camera images of the same road by 2%.
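Voxelization, the grouping step this abstract builds on, amounts to binning points into a regular 3D grid that a CNN can consume. Here is a minimal sketch, assuming a dense occupancy-count grid; the grid dimensions, voxel size, and function name are illustrative choices, not the paper's.

```python
import numpy as np

def voxelize(points, origin, voxel_size=0.1, dims=(400, 400, 40)):
    """Bin points (N, 3) into a dense occupancy-count voxel grid.

    origin: (3,) world coordinate of the grid corner; dims: grid shape in voxels.
    """
    ijk = np.floor((points - origin) / voxel_size).astype(int)
    valid = np.all((ijk >= 0) & (ijk < np.array(dims)), axis=1)
    ijk = ijk[valid]
    grid = np.zeros(dims, dtype=np.float32)
    np.add.at(grid, (ijk[:, 0], ijk[:, 1], ijk[:, 2]), 1.0)  # per-voxel point count
    return grid
```

Each voxel (or a crop of voxels around a candidate object) then becomes a fixed-size input tensor for the classifier.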
2022
3D object detection plays a fundamental role in enabling automated driving, which is regarded as a significant leap forward for contemporary transportation systems in terms of safety, mobility, and sustainability. Most state-of-the-art object detection methods for point clouds are developed around a single onboard LiDAR, whose performance is inevitably limited by range and occlusion, especially in dense traffic scenarios. In this paper, we propose PillarGrid, a novel cooperative perception method that fuses information from multiple 3D LiDARs (both onboard and roadside) to enhance situational awareness for connected and automated vehicles (CAVs). PillarGrid consists of four main components: 1) cooperative preprocessing of point clouds, 2) pillar-wise voxelization and feature extraction, 3) grid-wise deep fusion of features from multiple sensors, and 4) convolutional neural network (CNN)-based augmented 3D object detection. A novel cooperative perception platform was developed for model training and testing. Extensive experimentation shows that PillarGrid outperforms other single-LiDAR-based 3D object detection methods by a large margin in both accuracy and range.
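Pillar-wise voxelization (component 2) is commonly done PointPillars-style: points are grouped into vertical columns over a 2D grid, and each pillar is padded to a fixed point count for a pointwise feature network. The sketch below follows that convention as an assumption; ranges, pillar size, and names are illustrative, and PillarGrid's exact formulation may differ.

```python
import numpy as np

def pillarize(points, x_range=(0.0, 80.0), y_range=(-40.0, 40.0),
              pillar=0.32, max_pts=32):
    """Group points (N, 3) into vertical pillars and pad each to a fixed size.

    Returns (P, max_pts, 3) pillar tensors and their (P, 2) grid indices,
    ready for a pointwise feature extractor and scatter back onto the grid.
    """
    x, y = points[:, 0], points[:, 1]
    mask = (x >= x_range[0]) & (x < x_range[1]) & (y >= y_range[0]) & (y < y_range[1])
    pts = points[mask]
    rows = ((pts[:, 0] - x_range[0]) / pillar).astype(int)
    cols = ((pts[:, 1] - y_range[0]) / pillar).astype(int)
    keys = rows * 1_000_000 + cols                  # unique id per pillar cell
    order = np.argsort(keys)
    pts, keys = pts[order], keys[order]
    uniq, starts, counts = np.unique(keys, return_index=True, return_counts=True)
    pillars = np.zeros((len(uniq), max_pts, 3), dtype=np.float32)
    coords = np.stack([uniq // 1_000_000, uniq % 1_000_000], axis=1)
    for i, (s, c) in enumerate(zip(starts, counts)):
        n = min(c, max_pts)
        pillars[i, :n] = pts[s:s + n]               # truncate overfull pillars
    return pillars, coords
```

In a cooperative setup, each sensor's pillar features would be scattered onto a shared grid (using `coords`) before the grid-wise fusion stage.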
End-to-End 3D Object Detection using LiDAR Point Cloud
arXiv, 2023
There has been significant progress in the field of autonomous vehicles. Object detection and tracking are primary tasks for any autonomous vehicle, and object detection relies on a variety of sensors such as cameras and LiDAR. Although image features are typically preferred, numerous approaches take spatial data as input. Exploiting this information, we present an approach in which, using a novel encoding of the LiDAR point cloud, we infer the locations of different object classes near the autonomous vehicle. This approach does not use a bird's-eye-view representation, which is generally applied for this task, and thus avoids the extensive pre-processing it requires. After studying the numerous networks and approaches used to solve this problem, we implemented a novel model with the intention of combining their advantages and avoiding their shortcomings. The output is a set of predictions of the location and orientation of objects in the scene, in the form of 3D bounding boxes and labels for scene objects.
Deep learning for LiDAR-only and LiDAR-fusion 3D perception: a survey
Intelligence & Robotics, 2022
The perception system of robots and autonomous cars relies on the collaboration of multiple types of sensors to understand the surrounding environment. LiDAR has shown great potential to provide accurate environmental information, and thus deep learning on LiDAR point clouds draws increasing attention. However, LiDAR cannot handle severe weather, so sensor fusion between LiDAR and other sensors is an emerging topic thanks to their complementary properties. Challenges exist for deep learning methods that take fused LiDAR point-cloud data as input, which must balance accuracy against algorithmic complexity due to data redundancy. This work presents a comprehensive survey of deep learning for LiDAR-only and LiDAR-fusion 3D perception tasks. Starting with the representation of the LiDAR point cloud, the paper introduces its unique characteristics along with evaluation datasets and metrics. It then reviews four key tasks in LiDAR-based perception: object classification, object detection, object tracking, and segmentation (both semantic and instance segmentation). Finally, we present aspects overlooked by current algorithms and possible solutions, hoping this paper can serve as a reference for related research.
Autonomous Vehicles Perception (AVP) Using Deep Learning: Modeling, Assessment, and Challenges
IEEE Access
Perception is the fundamental task of any autonomous driving system: it gathers all the necessary information about the environment surrounding the moving vehicle. The decision-making system takes the perception data as input and makes the decision that maximizes passenger safety in a given scenario. This paper surveys recent literature on autonomous vehicle perception (AVP), focusing on two primary tasks: semantic segmentation and object detection, both vital components of a vehicle's navigation system. A comprehensive overview of deep learning for perception, and of decision-making based on images and LiDAR point clouds, is presented. We discuss the sensors, benchmark datasets, and simulation tools widely used for semantic segmentation and object detection, especially in autonomous driving. This paper acts as a road map for current and future research in AVP, focusing on models, assessment, and challenges in the field.