IMG2nDSM: Height Estimation from Single Airborne RGB Images with Deep Learning

Focusing on Shadows for Predicting Heightmaps from Single Remotely Sensed RGB Images with Deep Learning

2021

Estimating the heightmaps of buildings and vegetation in single remotely sensed images is a challenging problem. Effective solutions to it can serve as a stepping stone for complex and demanding problems in the remote sensing discipline that require 3D information from aerial imagery, which may be expensive or infeasible to acquire. We propose a task-focused Deep Learning (DL) model that takes advantage of the shadow map of a remotely sensed image to calculate its heightmap. The shadow map is computed efficiently and does not add significant computational complexity. The model is trained with aerial images and their LiDAR measurements, achieving superior performance on the task. We validate the model with a dataset covering a large area of Manchester, UK, as well as the 2018 IEEE GRSS Data Fusion Contest LiDAR dataset. Our work suggests that the proposed DL architecture and the technique of injecting shadow information into the model are valuable for improving the h...
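The shadow-injection idea described above can be sketched as follows: a crude shadow map is estimated from image intensity and stacked with the RGB channels as an extra network input. The luminance threshold and weights here are illustrative assumptions, not the paper's actual shadow-computation method.

```python
import numpy as np

def shadow_mask(rgb: np.ndarray, threshold: float = 0.25) -> np.ndarray:
    """Crude shadow map: mark pixels whose luminance falls below a threshold.

    rgb: float array of shape (H, W, 3) with values in [0, 1]. The 0.25
    threshold and the Rec. 601 luminance weights are illustrative choices.
    """
    luminance = rgb @ np.array([0.299, 0.587, 0.114])
    return (luminance < threshold).astype(np.float32)

def with_shadow_channel(rgb: np.ndarray) -> np.ndarray:
    """Stack the shadow map as a fourth channel, giving an (H, W, 4) input."""
    return np.concatenate([rgb, shadow_mask(rgb)[..., None]], axis=-1)

img = np.random.default_rng(0).random((64, 64, 3))
x = with_shadow_channel(img)
print(x.shape)  # (64, 64, 4)
```

A height-regression network can then consume the 4-channel tensor in place of plain RGB, letting the model condition on shadow evidence explicitly.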

Generating Elevation Surface from a Single RGB Remotely Sensed Image Using Deep Learning

2020

Generating Digital Elevation Models (DEMs) from satellite imagery or other data sources is an essential tool for a plethora of applications and disciplines, ranging from 3D flight planning and simulation, autonomous driving, and satellite navigation (e.g., GPS) to modeling water flow, precision farming, and forestry. Extracting this 3D geometry from a given surface has hitherto required a combination of appropriately collected corresponding samples and/or specialized equipment, as inferring elevation from single-image data is out of reach for conventional approaches. On the other hand, Artificial Intelligence (AI) and Machine Learning (ML) algorithms have experienced unprecedented growth in recent years, as they can extrapolate rules in a data-driven manner and recover convoluted, nonlinear one-to-one mappings, such as an approximate mapping from satellite imagery to DEMs. We therefore propose an end-to-end Deep Learning (DL) approach to construct this mapping a...

Deep Neural Networks for Determining the Parameters of Buildings from Single-Shot Satellite Imagery

Journal of Computer and Systems Sciences International

The height of a building is a basic characteristic needed for analytical services. It can be used to evaluate the population and functional zoning of a region, and analysis of the height structure of urban territories can be useful for understanding population dynamics. In this paper, a novel method for determining a building's height from a single-shot oblique Earth remote sensing image is proposed. The height is evaluated by a simulation algorithm that uses the masks of shadows and the visible parts of the walls. The image is segmented using convolutional neural networks, which make it possible to extract the masks of roofs, shadows, and building walls. The segmentation models are integrated into a fully automatic system for mapping buildings and evaluating their heights. A test dataset containing a labeled set of various buildings is described. The proposed method is tested on this dataset and demonstrates a mean absolute error of less than 4 meters.
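The shadow-based estimation above ultimately rests on a simple geometric relation: on flat terrain, a shadow of length L cast under sun elevation θ implies a height of L·tan(θ). The paper's simulation algorithm is more involved (it also uses wall masks and image geometry); this sketch shows only the underlying trigonometry.

```python
import math

def height_from_shadow(shadow_length_m: float, sun_elevation_deg: float) -> float:
    """Flat-terrain shadow-to-height relation: h = L * tan(sun elevation).

    A simplification of the paper's simulation algorithm, which additionally
    exploits wall masks and the oblique viewing geometry.
    """
    return shadow_length_m * math.tan(math.radians(sun_elevation_deg))

# A 20 m shadow under a 45 degree sun implies a ~20 m tall building.
print(round(height_from_shadow(20.0, 45.0), 1))  # 20.0
```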

Y-Shaped Convolutional Neural Network for 3D Roof Elements Extraction to Reconstruct Building Models from a Single Aerial Image

ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences

Fast and efficient detection and reconstruction of buildings have become essential in real-time applications such as navigation, 3D rendering, augmented reality, and 3D smart cities. In this study, a modern Deep Learning (DL)-based framework is proposed for the simultaneous automatic detection, localization, and height estimation of buildings from a single aerial image. The proposed framework is based on a Y-shaped Convolutional Neural Network (Y-Net) comprising one encoder and two decoders. The input of the network is a single RGB image, while the outputs are the predicted height information of buildings as well as the rooflines in three classes: eave, ridge, and hip lines. The knowledge extracted by the Y-Net (i.e., buildings' heights and rooflines) is utilized for 3D reconstruction of buildings at Level of Detail 2 (LoD2). The main steps of the proposed approach are data preparation, CNN training, and 3D reconstruction. For the experimental investigations, airborne data from Potsdam, provided by ISPRS, are used. For the predicted heights, the results show an average Root Mean Square Error (RMSE) and a Normalized Median Absolute Deviation (NMAD) of about 3.8 m and 1.3 m, respectively. Moreover, the overall accuracy of the extracted rooflines is about 86%.
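The RMSE and NMAD error measures quoted above are standard for height evaluation and can be computed as follows; the 1.4826 factor makes NMAD comparable to the standard deviation when errors are normally distributed.

```python
import numpy as np

def rmse(pred: np.ndarray, truth: np.ndarray) -> float:
    """Root Mean Square Error over all height pixels."""
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

def nmad(pred: np.ndarray, truth: np.ndarray) -> float:
    """Normalized Median Absolute Deviation: 1.4826 * median(|e - median(e)|).

    More robust to outliers than RMSE, which is why both are often reported
    together for height-prediction results.
    """
    err = pred - truth
    return float(1.4826 * np.median(np.abs(err - np.median(err))))

truth = np.zeros(5)
pred = np.array([1.0, -1.0, 2.0, -2.0, 0.0])
print(rmse(pred, truth))  # ~1.414
print(nmad(pred, truth))  # 1.4826
```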

Extraction of linear structures from digital terrain models using deep learning

2021

Abstract. This paper explores the role deep convolutional neural networks play in the automated extraction of linear structures from Digital Terrain Models (DTMs) using semantic segmentation techniques. A DTM is a regularly gridded raster created from laser scanning point clouds that represents the elevation of the bare-earth surface with respect to a reference. Recent advances in Deep Learning (DL) have made it possible to explore the use of semantic segmentation for the detection of terrain structures in DTMs. This research examines two practical deep convolutional neural network architectures: an encoder-decoder network named SegNet and the recent state-of-the-art high-resolution network (HRNet). The paper initially focuses on pixel-wise binary classification in order to validate the applicability of the proposed approaches. The networks are trained to distinguish between points belonging to linear structures and those belonging to the background. In the second step, multi-clas...

2D Image-To-3D Model: Knowledge-Based 3D Building Reconstruction (3DBR) Using Single Aerial Images and Convolutional Neural Networks (CNNs)

Remote Sensing

In this study, a deep learning (DL)-based approach is proposed for the detection and reconstruction of buildings from a single aerial image. The knowledge required to reconstruct the 3D shapes of buildings, including the height data as well as the linear elements of individual roofs, is derived from the RGB image using an optimized multi-scale convolutional–deconvolutional network (MSCDN). The proposed network is composed of two feature extraction levels, which first predict coarse features and then automatically refine them. The predicted features include the normalized digital surface models (nDSMs) and the linear elements of roofs in three classes: eave, ridge, and hip lines. The prismatic models of buildings are then generated by analyzing the eave lines, and the parametric models of individual roofs are reconstructed using the predicted ridge and hip lines. The experiments show that, even in the presence of noise in height values, the proposed method performs well on 3D r...

Beyond Measurement: Extracting Vegetation Height from High Resolution Imagery with Deep Learning

Remote Sensing

Measuring and monitoring the height of vegetation provides important insights into forest age and habitat quality, which are essential for applications that rely on up-to-date and accurate vegetation data. Current vegetation sensing practices involve ground surveys, photogrammetry, synthetic aperture radar (SAR), and airborne light detection and ranging (LiDAR) sensors. While these methods provide high resolution and accuracy, their hardware and collection effort prohibit highly recurrent and widespread collection. In response to the limitations of current methods, we designed Y-NET, a novel deep learning model that generates high-resolution models of vegetation from highly recurrent multispectral aerial imagery and elevation data. Y-NET’s architecture uses convolutional layers to learn correlations between different input features and vegetation height, generating an accurate vegetation surface model (VSM) at 1×1 m resolution. We evaluated Y-NET on 235 km...
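The vegetation surface model a network like Y-NET predicts corresponds to canopy height, which in LiDAR-based reference data is conventionally derived as the difference between the surface elevation (DSM) and the bare-earth elevation (DTM). A minimal sketch of that derivation, under the assumption of co-registered grids:

```python
import numpy as np

def vegetation_height(dsm: np.ndarray, dtm: np.ndarray) -> np.ndarray:
    """Canopy/vegetation height: surface elevation minus bare-earth elevation.

    dsm and dtm are co-registered elevation grids in metres. Negative
    residuals (sensor noise, interpolation artifacts) are clipped to zero.
    """
    return np.clip(dsm - dtm, 0.0, None)

dsm = np.array([[10.0, 12.5], [9.0, 9.8]])
dtm = np.array([[10.0, 10.0], [9.2, 9.0]])
print(vegetation_height(dsm, dtm))  # [[0. 2.5] [0. 0.8]]
```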

Height Prediction and Refinement From Aerial Images With Semantic and Geometric Guidance

IEEE Access

Deep learning provides a powerful new approach to many computer vision tasks. Height prediction from aerial images is one of the tasks that has benefited greatly from the deployment of deep learning, replacing traditional multi-view geometry techniques. This manuscript proposes a two-stage approach to the task: the first stage is a multi-task neural network whose main branch predicts the height map from a single RGB aerial input image, augmented with semantic and geometric information from two additional branches. The second stage is a refinement step in which a denoising autoencoder corrects errors in the first-stage predictions, producing a more accurate height map. Experiments on two publicly available datasets show that the proposed method outperforms state-of-the-art computer vision-based and deep learning-based height prediction methods.
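The paper's second stage is a learned denoising autoencoder; as a stand-in, the toy filter below illustrates the kind of local error correction a refinement stage performs on a predicted height map. This is an illustrative simplification, not the paper's method.

```python
import numpy as np

def median_refine(height: np.ndarray) -> np.ndarray:
    """Toy refinement: replace each pixel with the median of its 3x3 window.

    Stands in for the paper's learned denoising autoencoder; it only shows
    the idea of smoothing out isolated errors in a first-stage prediction.
    """
    padded = np.pad(height, 1, mode="edge")
    out = np.empty_like(height)
    h, w = height.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = np.median(padded[i:i + 3, j:j + 3])
    return out

noisy = np.ones((5, 5))
noisy[2, 2] = 50.0  # isolated spike, as from a bad first-stage prediction
print(median_refine(noisy)[2, 2])  # 1.0
```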

Automated Building Detection from Airborne LiDAR and Very High-Resolution Aerial Imagery with Deep Neural Network

Remote Sensing

The detection of buildings in a city is essential in several geospatial domains and for decision-making in city planning, tax collection, project management, revenue generation, and smart cities, among other areas. In the past, the classical approach to building detection used imagery alone and entailed laborious human–computer interaction. To tackle this task, a novel network based on an end-to-end deep learning framework is proposed to detect and classify building features. The proposed CNN has three parallel stream channels: the first takes the high-resolution aerial imagery, the second takes the digital surface model (DSM), and the third extracts deep features from the fusion of the first two channels. Furthermore, this channel has eight group-convolution blocks of 2D convolutions with three max-pooling layers. The proposed model’s efficiency and dependability were tested on thr...

On the Exploration of Automatic Building Extraction from RGB Satellite Images Using Deep Learning Architectures Based on U-Net

Technologies

Detecting and localizing buildings is of primary importance in urban planning tasks. Automating the building extraction process has become attractive given the dominance of Convolutional Neural Networks (CNNs) in image classification tasks. In this work, we explore the effectiveness of the CNN-based architecture U-Net and its variations, namely the Residual U-Net, the Attention U-Net, and the Attention Residual U-Net, in automatic building extraction. We showcase their robustness in feature extraction and information processing using exclusively RGB images, selected from the SpaceNet 1 dataset, as they are a low-cost alternative to multi-spectral and LiDAR data. The experimental results show that U-Net achieves a 91.9% accuracy, whereas introducing residual blocks, attention gates, or a combination of both improves the accuracy of the vanilla U-Net to 93.6%, 94.0%, and 93.7%, respectively. Finally, the comparison between U-Net architectures and typical deep learning appro...
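Accuracy figures like the 91.9% quoted above are pixel-wise comparisons between predicted and reference building masks; Intersection-over-Union (IoU) is the other metric commonly reported alongside it. A minimal sketch of both for binary masks:

```python
import numpy as np

def pixel_accuracy(pred: np.ndarray, truth: np.ndarray) -> float:
    """Fraction of pixels where the predicted building mask matches the label."""
    return float(np.mean(pred == truth))

def iou(pred: np.ndarray, truth: np.ndarray) -> float:
    """Intersection-over-Union for the building (foreground) class."""
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return float(inter / union) if union else 1.0

truth = np.array([[1, 1, 0], [0, 1, 0]])
pred = np.array([[1, 0, 0], [0, 1, 1]])
print(pixel_accuracy(pred, truth))  # ~0.667
print(iou(pred, truth))  # 0.5
```

Pixel accuracy can look deceptively high when buildings cover few pixels, which is why IoU is usually preferred for class-imbalanced segmentation.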