
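Optimizing Modeling Variables and Parameters for Salinization Monitoring Based on Correlation Analysis and an Adaptive Genetic Algorithm
Cite this article:XU Hongtao, CHEN Chunbo, ZHENG Hongwei, LUO Geping, YANG Liao, WANG Weisheng, WU Shixin. Optimizing Modeling Variables and Parameters for Salinization Monitoring Based on Correlation Analysis and an Adaptive Genetic Algorithm[J]. Geo-information Science, 2020, 22(7): 1497-1509.
Authors:XU Hongtao  CHEN Chunbo  ZHENG Hongwei  LUO Geping  YANG Liao  WANG Weisheng  WU Shixin
Institution:1. State Key Laboratory of Desert and Oasis Ecology, Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Urumqi 830011; 2. University of Chinese Academy of Sciences, Beijing 100049
Funding:National Natural Science Foundation of China (41877012); Chinese Academy of Sciences "Belt and Road" Team Project (2018-YDYLTD-002); Chinese Academy of Sciences Featured Institute Project (TSS-2015-014-FW-1-3)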
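Abstract:When machine learning is combined with remote sensing and other data to retrieve Soil Salt Content (SSC), little attention is paid to selecting the modeling feature variables and model parameters, both of which strongly affect model accuracy. This paper uses a Support Vector Regression (SVR) algorithm whose modeling feature variables and model parameters are selected simultaneously by an Adaptive Genetic Algorithm (AGA) to retrieve the 2016 SSC of the Sangong River Basin, and analyzes how SSC is distributed across land use types. The simultaneous selection and the experimental design are as follows. First, 40 salinization-related factors in 7 categories are extracted from Landsat 8 OLI and SRTM elevation data. Correlation analysis then screens out the candidate feature variables, which are passed to AGA, a Genetic Algorithm (GA), and Grid Search (GS) to simultaneously select the SVR feature variables and model parameters and to build the salinization monitoring models (AGA-SVR, GA-SVR, GS-SVR). The results show that: ① AGA-SVR is the most accurate, followed by GA-SVR, with GS-SVR the least accurate; relative to GS-SVR, the R²/RMSE of AGA-SVR increases by 44.65%; ② non-salinized, slightly salinized, moderately salinized, and severely salinized land and saline soil account for 42.83%, 11.02%, 15.88%, 9.22%, and 21.05% of the Sangong River Basin area, respectively; ③ grassland and unused land consist mainly of non-salinized land and saline soil, while non-salinized land has the largest share in both farmland and forest land; the mean and the standard deviation of SSC both follow the order unused land > grassland > farmland > forest land. The feature variable and model parameter selection method in this study can improve the accuracy of salinization monitoring to some extent.
Keywords:salinization; genetic algorithm; machine learning; feature selection; parameter optimization; soil salt content; land use; correlation analysis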
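The correlation screening step in the abstract (keep only factors significantly correlated with SSC) can be sketched as below. This is a minimal illustration, not the authors' procedure: the abstract does not say which significance test was used, so a permutation test on the Pearson correlation is swapped in here to keep the sketch dependency-free. The factor names, the sample values, and the helper names `pearson_r`, `permutation_p_value`, and `screen_features` are all hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the correlation between x and y."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    # The add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


def screen_features(features, ssc, alpha=0.05):
    """Return the names of factors whose correlation with SSC has p < alpha."""
    return [name for name, values in features.items()
            if permutation_p_value(values, ssc) < alpha]


# Hypothetical sample: SSC at nine sites and two made-up factors.
ssc = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0]
features = {
    "salinity_index": [0.11, 0.19, 0.32, 0.41, 0.48, 0.62, 0.69, 0.81, 0.90],
    "symmetric_index": [float((i - 4) ** 2) for i in range(9)],  # uncorrelated with ssc
}
selected = screen_features(features, ssc)
print(selected)
```

Factors that pass this filter would form the candidate set handed to the optimizer; a parametric t-test on the correlation could replace the permutation test if that matches the intended analysis.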

Received:2019-09-16

Correlation Analysis and Adaptive Genetic Algorithm based Feature Subset and Model Parameter Optimization in Salinization Monitoring
XU Hongtao, CHEN Chunbo, ZHENG Hongwei, LUO Geping, YANG Liao, WANG Weisheng, WU Shixin. Correlation Analysis and Adaptive Genetic Algorithm based Feature Subset and Model Parameter Optimization in Salinization Monitoring[J]. Geo-information Science, 2020, 22(7): 1497-1509.
Authors:XU Hongtao  CHEN Chunbo  ZHENG Hongwei  LUO Geping  YANG Liao  WANG Weisheng  WU Shixin
Institution:1. State Key Laboratory of Desert and Oasis Ecology, Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Urumqi 830011, China;2. University of Chinese Academy of Sciences, Beijing 100049, China
Abstract:The selection of the feature subset and the optimization of model parameters play an important role in improving the accuracy of soil salinization monitoring. However, studies that combine machine learning algorithms with remote sensing images and other data to predict Soil Salt Content (SSC) pay little attention to optimizing the feature subset and model parameters. In this paper, a Support Vector Regression (SVR) algorithm whose feature subset and model parameters are optimized simultaneously by the Adaptive Genetic Algorithm (AGA) was developed to retrieve the SSC of the Sangong River Basin in 2016, and the distribution of SSC across land use types was analyzed. The simultaneous optimization of the feature subset and model parameters and the comparative experiments were designed as follows. First, a total of 40 salinization-related factors in 7 categories (vegetation indices, salinity indices, underlying surface reflection factors, feature spaces, Tasselled Cap transformation factors, surface reflectance, and topographic factors) were extracted from Landsat 8 OLI and SRTM Digital Elevation Model (DEM) data, and the Candidate Feature Variables (CFVs) were initially selected by correlation analysis, using statistical significance (p<0.05) as the criterion. Then the CFVs were passed to AGA, the Genetic Algorithm (GA), and Grid Search (GS) to simultaneously optimize the feature subset and model parameters of SVR, and the corresponding salinization monitoring models (AGA-SVR, GA-SVR, GS-SVR) were established. The results show that the models rank by performance in the order AGA-SVR > GA-SVR > GS-SVR. Compared with GS-SVR, both GA-SVR and AGA-SVR clearly improved the accuracy of salinization monitoring, and the R²/RMSE of AGA-SVR increased by 44.65%.
In terms of the types of salinized soil, non-salinized soil, slightly salinized soil, moderately salinized soil, severely salinized soil, and saline soil accounted for 42.83%, 11.02%, 15.88%, 9.22%, and 21.05% of the Sangong River Basin, respectively. In terms of the distribution of SSC across land use types, unused land and grassland consisted mainly of non-salinized soil and saline soil, while non-salinized soil had the largest share in farmland and forest land. Moreover, the mean and the standard deviation of SSC across land use types both followed the order unused land > grassland > farmland > forest land. To some extent, the feature subset selection and model parameter optimization method presented in this paper can improve the accuracy of salinization monitoring.
Keywords:soil salinization  adaptive genetic algorithm  machine learning  feature subset selection  parameter optimization  soil salt content  land use  correlation analysis  
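The idea of optimizing the feature subset and the SVR parameters in one search can be sketched as a genetic algorithm whose chromosome joins a binary feature mask with real-valued parameters. The sketch below makes several assumptions, since the abstract gives no implementation details: the adaptive rule follows the Srinivas-Patnaik style (crossover and mutation rates shrink as fitness approaches the population maximum), the rate constants `K_CROSS` and `K_MUT` and the log10 parameter ranges are hypothetical, and `toy_fitness` is a stand-in objective. In a real run the fitness would be something like the cross-validated R²/RMSE of an SVR trained on the masked features.

```python
import math
import random

# Hypothetical log10 search ranges for the SVR parameters C, gamma, epsilon.
PARAM_BOUNDS = [(-2.0, 3.0), (-4.0, 1.0), (-3.0, 0.0)]
K_CROSS, K_MUT = 0.9, 0.1  # hypothetical upper limits for the adaptive rates


def clip(value, lo, hi):
    return min(hi, max(lo, value))


def random_individual(n_features, rng):
    mask = [rng.randint(0, 1) for _ in range(n_features)]
    params = [rng.uniform(lo, hi) for lo, hi in PARAM_BOUNDS]
    return mask, params


def adaptive_rate(f, f_max, f_avg, k):
    """Rate k for below-average fitness, shrinking linearly to 0 at f_max."""
    if f < f_avg or f_max == f_avg:
        return k
    return k * (f_max - f) / (f_max - f_avg)


def crossover(a, b, rng):
    # Uniform crossover on the mask, arithmetic blend on the parameters.
    mask = [x if rng.random() < 0.5 else y for x, y in zip(a[0], b[0])]
    w = rng.random()
    params = [clip(w * p + (1 - w) * q, lo, hi)
              for p, q, (lo, hi) in zip(a[1], b[1], PARAM_BOUNDS)]
    return mask, params


def mutate(ind, pm, rng):
    # Bit flips on the mask, clipped Gaussian steps on the parameters.
    mask = [1 - bit if rng.random() < pm else bit for bit in ind[0]]
    params = []
    for value, (lo, hi) in zip(ind[1], PARAM_BOUNDS):
        if rng.random() < pm:
            value = clip(value + rng.gauss(0.0, 0.1 * (hi - lo)), lo, hi)
        params.append(value)
    return mask, params


def tournament(pop, fits, rng):
    i, j = rng.randrange(len(pop)), rng.randrange(len(pop))
    return (pop[i], fits[i]) if fits[i] >= fits[j] else (pop[j], fits[j])


def adaptive_ga(fitness, n_features, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    pop = [random_individual(n_features, rng) for _ in range(pop_size)]
    best, best_fit = None, -math.inf
    for _ in range(generations):
        fits = [fitness(ind) for ind in pop]
        f_max, f_avg = max(fits), sum(fits) / len(fits)
        if f_max > best_fit:
            best, best_fit = pop[fits.index(f_max)], f_max
        children = [best]  # elitism: the best individual found so far survives
        while len(children) < pop_size:
            a, fa = tournament(pop, fits, rng)
            b, fb = tournament(pop, fits, rng)
            # Proxy: rates use the fitter parent; offspring are not re-evaluated.
            f_parent = max(fa, fb)
            pc = adaptive_rate(f_parent, f_max, f_avg, K_CROSS)
            pm = adaptive_rate(f_parent, f_max, f_avg, K_MUT)
            child = crossover(a, b, rng) if rng.random() < pc else (a if fa >= fb else b)
            children.append(mutate(child, pm, rng))
        pop = children
    return best, best_fit


# Stand-in objective: match a made-up feature mask and made-up log10 parameters.
TARGET_MASK = [1, 0, 1, 0, 0, 1, 0, 0]
TARGET_PARAMS = [1.0, -2.0, -1.0]


def toy_fitness(ind):
    mask, params = ind
    mismatches = sum(m != t for m, t in zip(mask, TARGET_MASK))
    param_error = sum((p - t) ** 2 for p, t in zip(params, TARGET_PARAMS))
    return -(mismatches + param_error)


best, best_fit = adaptive_ga(toy_fitness, n_features=8)
C, gamma, epsilon = (10 ** p for p in best[1])
print(best[0], C, gamma, epsilon, best_fit)
```

Encoding the mask and the parameters in one chromosome lets the search account for interactions between them, which a grid search over parameters with a fixed feature set cannot do. A plain GA corresponds to replacing `adaptive_rate` with constant rates.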