Estimating vegetation height and canopy cover from remotely sensed data with machine learning
Related papers
ISPRS Journal of Photogrammetry and Remote Sensing, 2015
Many forest management activities, including the development of forest inventories, require spatially detailed forest canopy cover and height data. Among the various remote sensing technologies, LiDAR (Light Detection and Ranging) offers the most accurate and consistent means for obtaining reliable canopy structure measurements. A potential solution to reduce the cost of LiDAR data is to integrate transects (samples) of LiDAR data with frequently acquired and spatially comprehensive optical remotely sensed data. Although multiple regression is commonly used for such modeling, it often does not fully capture the complex relationships between forest structure variables. This study investigates the potential of Random Forest (RF), a machine learning technique, to estimate LiDAR-measured canopy structure using a time series of Landsat imagery. The study is implemented over a 2600 ha area of industrially managed coastal temperate forests on Vancouver Island, British Columbia, Canada. We implemented a trajectory-based approach to time series analysis that generates time since disturbance (TSD) and disturbance intensity information for each pixel, and we used this information to stratify the forest land base into two strata: mature forests and young forests. Canopy cover and height for three forest classes (i.e., mature, young, and mature and young combined) were modeled separately using multiple regression and RF techniques. For all forest classes, the RF models provided improved estimates relative to the multiple regression models. The lowest validation error was obtained for the mature forest stratum in an RF model (R² = 0.88, RMSE = 2.39 m and bias = −0.16 for canopy height; R² = 0.72, RMSE = 0.068% and bias = −0.0049 for canopy cover). This study demonstrates the value of using disturbance and successional history to inform estimates of canopy structure and obtain improved estimates of forest canopy cover and height using the RF algorithm.
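The validation statistics quoted in the abstract above (R², RMSE and bias) can be illustrated with a minimal sketch. The definitions used here are assumptions, since the abstract does not state them: R² is taken as the coefficient of determination, and bias as the mean of predicted minus observed. The function name and the sample heights are hypothetical and are not data from the study.

```python
import math


def validation_metrics(observed, predicted):
    """Return (r2, rmse, bias) for paired observed and predicted values.

    Assumed definitions (not confirmed by the abstract):
      r2   = 1 - SS_res / SS_tot  (coefficient of determination)
      rmse = sqrt(mean squared residual)
      bias = mean(predicted - observed)
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values must not all be identical")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    bias = sum(residuals) / n
    return r2, rmse, bias


# Hypothetical canopy heights in metres (illustrative only).
lidar_height = [10.0, 20.0, 30.0, 40.0]
model_height = [12.0, 18.0, 32.0, 38.0]
r2, rmse, bias = validation_metrics(lidar_height, model_height)
print(f"R2={r2:.3f} RMSE={rmse:.2f} m bias={bias:.2f} m")
```

A bias near zero alongside a non-zero RMSE, as in this toy case, indicates errors that cancel on average but are still sizeable for individual pixels.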
Characterizing forest canopy structure with lidar composite metrics and machine learning
A lack of reliable observations for canopy science research is being partly overcome by the gradual use of lidar remote sensing. This study aims to improve lidar-based canopy characterization with airborne laser scanners through the combined use of lidar composite metrics and machine learning models. Our so-called composite metrics comprise a relatively large number of lidar predictors that tend to retain as much information as possible when reducing raw lidar point clouds into a format suitable as inputs to predictive models of canopy structural variables. The information-rich property of such composite metrics is further complemented by machine learning, which offers an array of supervised learning models capable of relating canopy characteristics to high-dimensional lidar metrics via complex, potentially nonlinear functional relationships. Using coincident lidar and field data over an Eastern Texas forest in the USA, we conducted a case study to demonstrate the ubiquitous power of the lidar composite metrics in predicting multiple forest attributes and also illustrated the use of two kernel machines, namely, support vector machine and Gaussian processes (GP). Results show that the two machine learning models in conjunction with the lidar composite metrics outperformed traditional approaches such as the maximum likelihood classifier and linear regression models. For example, the five-fold cross validation for GP regression models (vs. linear/log-linear models) yielded a root mean squared error of 1.06 (2.36) m for Lorey's height, 0.95 (3.43) m for dominant height, 5.34 (8.51) m²/ha for basal area, 21.4 (40.5) Mg/ha for aboveground biomass, 6.54 (9.88) Mg/ha for belowground biomass, 0.75 (2.76) m for canopy base height, 2.2 (2.76) m for canopy ceiling height, 0.015 (0.02) kg/m³ for canopy bulk density, 0.068 (0.133) kg/m² for available canopy fuel, and 0.33 (0.39) m²/m² for leaf area index.
Moreover, uncertainty estimates from the GP regression were more indicative of the true errors in the predicted canopy variables than those from their linear counterparts. With the ever-increasing accessibility of multisource remote sensing data, we envision a concomitant expansion in the use of advanced statistical methods, such as machine learning, to explore the potentially complex relationships between canopy characteristics and remotely-sensed predictors, accompanied by a desideratum for improved error analysis.
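The five-fold cross-validation protocol mentioned in the abstract above can be sketched as follows. This is only an illustration of the evaluation loop: a one-predictor least-squares line stands in for the GP regression, folds are assigned by simple interleaving rather than random shuffling, and all names and data are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def kfold_rmse(xs, ys, k=5):
    """Pooled RMSE over k held-out folds (interleaved fold assignment)."""
    n = len(xs)
    squared_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th sample is held out
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        slope, intercept = fit_line(train_x, train_y)
        for i in test_idx:
            prediction = slope * xs[i] + intercept
            squared_errors.append((ys[i] - prediction) ** 2)
    return math.sqrt(sum(squared_errors) / n)


# Hypothetical lidar metric (x) and field-measured height (y).
metric = [float(i) for i in range(10)]
height = [2.0 * x + 1.0 + (1.0 if i % 2 == 0 else -1.0) for i, x in enumerate(metric)]
print(f"5-fold CV RMSE: {kfold_rmse(metric, height):.3f} m")
```

Because every sample is predicted exactly once by a model that never saw it, the pooled RMSE is a fairer estimate of predictive error than a fit-to-training statistic.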
Remotely sensed estimation of forest canopy density: A comparison of the performance of four methods
International journal of …, 2006
In recent years, a number of alternative methods have been proposed to predict forest canopy density from remotely sensed data. To date, however, it remains difficult to decide which method to use, since their relative performance has never been evaluated. In this study the performance of: (1) an artificial neural network, (2) a multiple linear regression, (3) the forest canopy density mapper and (4) a maximum likelihood classification method was compared for prediction of forest canopy density using a Landsat ETM+ image. Comparison of confusion matrices revealed that the regression model performed significantly worse than the three other methods. These results were based on a z-test for comparison of weighted kappa statistics, which is an appropriate statistic for analysis of ranked categories. About 89% of the variance of the observed canopy density was explained by the artificial neural networks, which outperformed the other three methods in this respect. Moreover, the artificial neural networks gave an unbiased prediction, while other methods systematically under- or over-predicted forest canopy density. The choice of a biased method could have a high impact on canopy density inventories.
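The weighted kappa statistic named in the abstract above can be computed from a confusion matrix of ranked classes. The sketch below assumes linear disagreement weights, |i − j| / (k − 1); the abstract does not say which weighting scheme was used, and the z-test that compares two kappa values is not implemented here. The matrix values are hypothetical.

```python
def weighted_kappa(confusion):
    """Linearly weighted Cohen's kappa for a square confusion matrix.

    confusion[i][j] is the number of samples with reference class i and
    predicted class j, with classes ordered by rank. Needs at least two
    classes and at least two classes present in the data.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_share = [sum(row) / total for row in confusion]
    col_share = [sum(confusion[i][j] for i in range(k)) / total for j in range(k)]
    observed = 0.0  # weighted observed disagreement
    expected = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # assumed linear weights
            observed += weight * confusion[i][j] / total
            expected += weight * row_share[i] * col_share[j]
    return 1.0 - observed / expected


# Hypothetical counts for three ranked canopy density classes.
matrix = [
    [20, 5, 0],
    [4, 18, 6],
    [1, 3, 23],
]
print(f"weighted kappa: {weighted_kappa(matrix):.3f}")
```

Weighting by rank distance means that confusing a low-density class with a high-density class is penalised more than confusing adjacent classes, which is why the statistic suits ranked categories.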
Remote Sensing of Environment, 2015
This study proposed modifying the conceptual approach that is commonly used to model development of stand attribute estimates using airborne LiDAR data. New models were developed using an area-based approach to predict wood volume, stem volume, aboveground biomass, and basal-area across a wide range of canopy structures, sites and LiDAR characteristics. This new modeling approach does not adopt standard approaches of stepwise regression using a series of height metrics derived from airborne LiDAR. Rather, it used four metrics describing complementary 3D structural aspects of the stand canopy. The first three metrics were related to mean canopy height, height heterogeneity, and horizontal canopy distribution. A fourth metric was calculated as the coefficient of variation of the leaf area density profile. This fourth metric provided information on understory vegetation. The models that were developed with the four structural metrics provided higher estimation accuracy on stand attributes than models using height metrics alone, while also avoiding data over-fitting. Overall, the models provided prediction error levels ranging from 12.4% to 24.2%, depending upon forest type and stand attribute. The more homogeneous coniferous stand provided the highest estimation accuracy. Estimation errors were significantly reduced in mixed forest when separate models were developed for individual stand types (coniferous, mixed and deciduous stands) instead of a general model for all stand types. Model robustness was also evaluated in leaf-off and leaf-on conditions where both conditions provided similar estimation errors.
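The fourth metric in the abstract above, the coefficient of variation of the leaf area density profile, is the ratio of the profile's standard deviation to its mean. A minimal sketch follows, assuming the population standard deviation and a profile already binned into height layers; neither choice is stated in the abstract, and the profile values are hypothetical.

```python
import statistics


def coefficient_of_variation(profile):
    """Standard deviation divided by mean for a vertical density profile."""
    mean = statistics.mean(profile)
    if mean == 0:
        raise ValueError("profile mean must be non-zero")
    # Population standard deviation is an assumption; a sample estimate
    # (statistics.stdev) would give slightly larger values.
    return statistics.pstdev(profile) / mean


# Hypothetical leaf area density per height layer, ground to canopy top.
single_layer_stand = [0.02, 0.03, 0.05, 0.40, 0.45, 0.10]
multi_layer_stand = [0.20, 0.18, 0.15, 0.20, 0.22, 0.12]
print(f"single-layer CV: {coefficient_of_variation(single_layer_stand):.2f}")
print(f"multi-layer CV:  {coefficient_of_variation(multi_layer_stand):.2f}")
```

In this toy case the stand with foliage spread across all layers, including the understory, has the lower coefficient of variation, which is consistent with the abstract's remark that the metric carries information on understory vegetation.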
Using decision trees to predict forest stand height and canopy cover from LANDSAT and LIDAR data
Managing environmental knowledge: EnviroInfo, 2006
The motivation for this study was to improve the consistency and accuracy, and increase the spatial resolution, of some of the supporting information to the forest monitoring system in Slovenia by using data mining techniques. Specifically, we aim to generate raster maps with 25 m horizontal resolution of forest stand height and canopy cover for the Kras region of Slovenia. We used predictive models based on multi-temporal Landsat data and calibrated them with high resolution airborne laser scanning (ALS) data. The visual inspection ...
A common challenge when comparing forest canopy cover and similar metrics across different ecosystems is that there are many field- and landscape-level measurement methods. This research conducts a cross-comparison and evaluation of forest canopy cover metrics produced using unmixing of reflective spectral satellite data, light detection and ranging (lidar) data, and data collected in the field with spherical densiometers. The coincident data were collected across a ~25 000 ha mixed conifer forest in northern Idaho. The primary objective is to evaluate whether the spectral and lidar canopy cover metrics are each statistically equivalent to the field-based metrics. The secondary objective is to evaluate whether the lidar data can elucidate the sources of error observed in the spectral-based canopy cover metrics. The statistical equivalence tests indicate that spectral and field data are not equivalent (slope region of equivalence = 43%). In contrast, the lidar and field data are within the acceptable error margin of most forest inventory assessments (slope region of equivalence = 13%). The results also show that in plots where the mean lidar plot heights are near zero, each of the modeled remotely sensed estimates continues to report canopy cover >21% for lidar and >30% for all investigated spectral methods using near-infrared bands. This suggests these metrics are sensitive to the presence of herbaceous vegetation, shrubs, seedlings, saplings, and other subcanopy vegetation.
International Journal of Electrical and Computer Engineering (IJECE), 2023
Accurate estimation of forest canopy height is essential for monitoring forest ecosystems and assessing their carbon storage potential. This study evaluates the effectiveness of different remote sensing techniques for estimating forest canopy height in tropical dry forests. Using field data and remote sensing data from airborne lidar and polarimetric synthetic aperture radar (SAR), a random forest (RF) model was developed to estimate canopy height based on different indices. Results show that the normalized difference built-up index (NDBI) has the highest correlation with canopy height, outperforming other indices such as relative vigor index (RVI) and polarimetric vertical and horizontal variables. The RF model with NDBI as input showed a good fit and predictive ability, with low concentration of errors around 0. These findings suggest that NDBI can be a useful tool for accurately estimating forest canopy height in tropical dry forests using remote sensing techniques, providing valuable information for forest management and conservation efforts.
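For reference, the NDBI named in the abstract above is conventionally defined as (SWIR − NIR) / (SWIR + NIR), computed from shortwave-infrared and near-infrared reflectance. The sketch below uses that standard definition; which sensor and bands the study actually used is not stated in the abstract, and the reflectance values are hypothetical.

```python
def ndbi(swir, nir):
    """Normalized difference built-up index from two band reflectances."""
    denominator = swir + nir
    if denominator == 0:
        raise ValueError("swir + nir must be non-zero")
    return (swir - nir) / denominator


# Hypothetical surface reflectances (unitless, 0 to 1).
dense_canopy = ndbi(swir=0.10, nir=0.40)  # vegetation reflects strongly in NIR
bare_surface = ndbi(swir=0.30, nir=0.25)
print(f"dense canopy NDBI: {dense_canopy:.2f}")
print(f"bare surface NDBI: {bare_surface:.2f}")
```

The index is bounded between −1 and 1 for non-negative reflectances, and strongly vegetated pixels tend toward negative values because near-infrared reflectance exceeds shortwave-infrared reflectance.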
International Journal of Applied Earth Observation and Geoinformation
Spatially-explicit information on forest structure is paramount to estimating aboveground carbon stocks for designing sustainable forest management strategies and mitigating greenhouse gas emissions from deforestation and forest degradation. LiDAR measurements provide samples of forest structure that must be integrated with satellite imagery to predict and to map landscape scale variations of forest structure. Here we evaluate the capability of existing satellite synthetic aperture radar (SAR) with multispectral data to estimate forest canopy height over five study sites across two biomes in North America, namely temperate broadleaf and mixed forests and temperate coniferous forests. Pixel size affected the modelling results, with an improvement in model performance as pixel resolution coarsened from 25 m to 100 m. Likewise, the sample size was an important factor in the uncertainty of height prediction using the Support Vector Machine modelling approach. Larger sample size yielded better results but the improvement stabilised when the sample size reached approximately 10% of the study area. We also evaluated the impact of surface moisture (soil and vegetation moisture) on the modelling approach. Whereas the impact of surface moisture had a moderate effect on the proportion of the variance explained by the model (up to 14%), its impact was more evident in the bias of the models with bias reaching values up to 4 m. Averaging the incidence angle corrected radar backscatter coefficient (γ°) reduced the impact of surface moisture on the models and improved their performance at all study sites, with R² ranging between 0.61 and 0.82, RMSE between 2.02 and 5.64 and bias between 0.02 and −0.06, respectively, at 100 m spatial resolution.
An evaluation of the relative importance of the variables in the model performance showed that for the study sites located within the temperate broadleaf and mixed forests biome ALOS-PALSAR HV polarised backscatter was the most important variable, with Landsat Tasselled Cap Transformation components barely contributing to the models for two of the study sites, whereas they had a significant contribution at the third one. Over the temperate conifer forests, Landsat Tasselled Cap variables contributed more than the ALOS-PALSAR HV band to predict the landscape height variability. In all cases, incorporation of multispectral data improved the retrieval of forest canopy height and reduced the estimation uncertainty for tall forests. Finally, we concluded that models trained at one study site had higher uncertainty when applied to other sites, but a model developed from multiple sites performed equally to site-specific models to predict forest canopy height. This result suggests that a biome-level model developed from several study sites can be used as a reliable estimator of biome-level forest structure from existing satellite imagery.
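The coarsening of pixel resolution from 25 m to 100 m described in the abstract above corresponds to combining each 4 × 4 block of fine pixels into one coarse pixel. The sketch below assumes simple block averaging; the abstract does not state the aggregation method, and the grid values are hypothetical.

```python
def aggregate_mean(grid, factor=4):
    """Coarsen a 2-D grid by averaging non-overlapping factor x factor blocks."""
    rows = len(grid)
    cols = len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [
                grid[i][j]
                for i in range(r, r + factor)
                for j in range(c, c + factor)
            ]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


# Hypothetical 25 m canopy heights (metres): 4 rows x 8 columns.
fine = [[10.0] * 4 + [30.0] * 4 for _ in range(4)]
print(aggregate_mean(fine, factor=4))  # one row of two 100 m pixels
```

Averaging over larger blocks suppresses pixel-level noise such as radar speckle and small co-registration errors, which is one plausible reason for the improved model performance at coarser resolution.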