Prediction of forest unit volume based on hybrid feature selection and ensemble learning

Abstract

Forestry data are high-dimensional and their samples are complex. To address these characteristics, this paper explores an ensemble learning method suited to predicting forest unit volume, with the aim of providing a scientific basis for forest resource management and decision-making. Using real data provided by the National Forestry Science Data Sharing Service Platform, we propose FL-Stacking, a model based on hybrid feature selection and ensemble learning. First, the model selects features with a hybrid Filter-Lasso method. It then builds the prediction model of forest unit volume through ensemble learning, using eight prediction models, such as linear SVM regression, as base models that are fused on the training set under a Stacking scheme. The models are validated with 10-fold cross-validation. Finally, the base models are fused and optimized. The experimental results show that the best single model reaches an accuracy of 83.81%, while the multi-model fusion produced by FL-Stacking reaches 84.55%, an increase of 0.74 percentage points in the R² value. Comparative analysis of different models on real data sets shows that the proposed FL-Stacking ensemble prediction model estimates forest unit volume with high accuracy and has practical research value.
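The pipeline outlined in the abstract (a filter step, then Lasso-based selection, then stacked base regressors checked with 10-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The data are synthetic, only three base models are used instead of the paper's eight, and the filter criterion (`f_regression`), the number of retained features, the Lasso penalty, the linear meta-learner, and the name `build_fl_stacking` are all assumptions chosen for the example.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import StackingRegressor
from sklearn.feature_selection import SelectFromModel, SelectKBest, f_regression
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVR

# Synthetic stand-in for the forestry table (assumption: the real survey
# data are not reproduced here).
X, y = make_regression(
    n_samples=200, n_features=30, n_informative=5, noise=10.0, random_state=0
)


def build_fl_stacking():
    """Hypothetical helper: filter + Lasso selection followed by stacking."""
    # Three illustrative base regressors (the paper reports eight).
    base_models = [
        ("linear_svr", LinearSVR(C=10.0, max_iter=20000, random_state=0)),
        ("ridge", Ridge(alpha=1.0)),
        ("knn", KNeighborsRegressor(n_neighbors=5)),
    ]
    # Base-model predictions are produced out-of-fold (cv=10) and then
    # combined by a linear meta-learner (assumed choice).
    stack = StackingRegressor(
        estimators=base_models, final_estimator=LinearRegression(), cv=10
    )
    return Pipeline(
        [
            ("scale", StandardScaler()),
            # Filter stage: keep the 15 features with the highest univariate F-score.
            ("filter", SelectKBest(score_func=f_regression, k=15)),
            # Embedded stage: keep features with non-zero Lasso coefficients.
            ("lasso", SelectFromModel(Lasso(alpha=1.0))),
            ("stack", stack),
        ]
    )


model = build_fl_stacking()
# Outer 10-fold cross-validation, scored by R^2.
scores = cross_val_score(model, X, y, cv=10, scoring="r2")
print(f"mean R^2 over 10 folds: {scores.mean():.3f}")
```

Placing both selection steps inside the `Pipeline` means they are refit on each training fold, so the held-out fold does not influence which features are kept.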

Acknowledgements

This work was supported by the Social Science Project of Beijing Education Commission (SM201910028017) and by Capacity Building for Sci-Tech Innovation - Fundamental Scientific Research Funds of Beijing Education Commission (Grant no. 19530050142). The authors thank the China National Forestry Science Data Sharing Service Platform for providing the second-class survey and related data.

Author information

Authors and Affiliations

  1. School of Management, Capital Normal University, Beijing, 100048, China
    Jie Wang, Jing Xu, Yan Peng & Hongpeng Wang
  2. Metropolitan College, Boston University, 1010 Commonwealth Ave, Boston, MA, 02215, USA
    Junhao Shen

Corresponding authors

Correspondence to Jie Wang or Junhao Shen.

About this article

Cite this article

Wang, J., Xu, J., Peng, Y. et al. Prediction of forest unit volume based on hybrid feature selection and ensemble learning. Evol. Intel. 13, 21–32 (2020). https://doi.org/10.1007/s12065-019-00219-4
