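Risk Assessment of Mountain Torrents based on Three Machine Learning Algorithms
Cite this article: ZHOU Chao, FANG Xiuqin, WU Xiaojun, WANG Yuchen. Risk Assessment of Mountain Torrents based on Three Machine Learning Algorithms[J]. Geo-information Science, 2019, 21(11): 1679-1688.
Authors: ZHOU Chao  FANG Xiuqin  WU Xiaojun  WANG Yuchen
Affiliation: School of Earth Sciences and Engineering, Hohai University, Nanjing 211100
Funding: National Key Research and Development Program of China (No. 2016YFA0601500)
Abstract: Following the conceptual model of flood risk, 12 flood risk indicators for Jiangxi Province were selected from three aspects (triggering factors, hazard-inducing environment, and hazard-bearing bodies), and flood risk assessment models were built with three machine learning algorithms: k-nearest neighbor, Random Forest, and AdaBoost. The models were evaluated with three quantitative indexes: accuracy, the Kappa coefficient, and the ROC curve (AUC value). Indicator importance was analyzed jointly with Random Forest and the Boruta feature extraction algorithm. Finally, the mountain torrent risk zoning maps of Jiangxi Province produced by the three models were compared, and the distribution of mountain torrent disasters was analyzed. The results show that: ① the mean accuracy, Kappa coefficient, and AUC of the AdaBoost model were 0.902, 0.870, and 0.826, respectively; its accuracy and Kappa coefficient were slightly better than those of Random Forest, its AUC was comparable to that of Random Forest, and all three performance indexes of the k-nearest neighbor model were lower than those of the other two algorithms; ② five indicators, namely farmland production potential, mean annual maximum 6 h rainstorm, mean annual maximum 1 h rainstorm, the normalized difference vegetation index, and mean annual rainfall, play a very important role in the formation of flood risk; ③ the higher-risk and highest-risk zones together cover about 34.4% of the total area of Jiangxi Province and are mainly distributed in mountainous areas with high rainfall, heavy rainstorms, and high farmland production potential.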

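Keywords: Random Forest machine learning algorithm  AdaBoost machine learning algorithm  ROC curve  Boruta algorithm  flood risk assessment  Jiangxi Province
Received: 2019-04-23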

Risk Assessment of Mountain Torrents based on Three Machine Learning Algorithms
ZHOU Chao,FANG Xiuqin,WU Xiaojun,WANG Yuchen.Risk Assessment of Mountain Torrents based on Three Machine Learning Algorithms[J].Geo-information Science,2019,21(11):1679-1688.
Authors:ZHOU Chao  FANG Xiuqin  WU Xiaojun  WANG Yuchen
Institution:School of Earth Sciences and Engineering, Hohai University, Nanjing 211100, China
Abstract: In China, floods are among the most frequent natural disasters, threatening human safety and causing severe economic losses. We chose Jiangxi Province, which frequently suffers from mountain torrents, as the study area. Following the conceptual model of flood risk, 12 flood risk assessment indexes were selected from three aspects: triggering factors, hazard-inducing environment, and hazard-bearing bodies. Three flood risk assessment models were constructed with different machine learning algorithms: k-Nearest Neighbor (kNN), Random Forest (RF), and AdaBoost. To evaluate model performance, we applied three quantitative indexes: accuracy, the Kappa coefficient, and the ROC curve (AUC value). We analyzed the importance of the indexes using both the Random Forest algorithm and the Boruta feature extraction algorithm. The mountain flood risk zoning maps produced by the three models were then compared to analyze the spatial pattern of mountain flood disasters. The average accuracy, Kappa coefficient, and AUC of the AdaBoost model were 0.902, 0.870, and 0.826, respectively. Its accuracy and Kappa coefficient were slightly higher than those of RF, and its AUC was equivalent to that of RF. All three performance indexes of the kNN model were lower than those of the other two models. Five indexes play very important roles in the formation of flood disaster risk: potential farmland productivity, average annual maximum 6 h rainstorm, average annual maximum 1 h rainstorm, NDVI, and average annual rainfall. The mapping results show that the higher-risk and highest-risk zones together account for about 34.4% of the total area of Jiangxi Province. These zones are mainly distributed in mountainous areas with high rainfall, heavy rainstorms, and high potential farmland productivity.
Keywords:Random Forest Machine Learning Algorithm  AdaBoost Machine Learning Algorithm  ROC curve  Boruta algorithm  flood risk assessment  Jiangxi Province  
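
The three performance indexes named in the abstract (accuracy, Kappa coefficient, AUC) can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names, the toy labels and scores, and the 0.5 threshold are hypothetical, and the AUC is shown for a simplified two-class case (flood / no flood), whereas the paper's risk levels may well be multi-class.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    # Fraction of samples whose predicted class equals the observed class.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def cohen_kappa(y_true, y_pred):
    # Kappa = (p_o - p_e) / (1 - p_e): observed agreement corrected for
    # the agreement expected by chance from the class frequencies.
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)

def auc(y_true, scores):
    # AUC as the probability that a random positive sample scores higher
    # than a random negative one (ties count as one half).
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 1 = flood-prone sample, 0 = not flood-prone.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]   # assumed model outputs
y_pred = [1 if s >= 0.5 else 0 for s in scores]      # assumed 0.5 threshold

acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
area = auc(y_true, scores)
print(acc, kappa, area)
```

Accuracy and Kappa depend on the chosen classification threshold, while AUC depends only on how the scores rank the samples, which is one reason the three indexes can order models differently.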
This article is indexed in CNKI and other databases.