
Statistics for sample splitting for the calibration and validation of hydrological models

Authors:	Dedi Liu, Shenglian Guo, Zhaoli Wang, Pan Liu, Xixuan Yu, Qin Zhao, Hui Zou

Institution:	1. State Key Laboratory of Water Resources and Hydropower Engineering Science, Wuhan University, Wuhan, China; 2. The State Key Laboratory of Subtropical Building Science, South China University of Technology, Guangzhou, China; 3. Agricultural and Environmental Sciences, McGill University, Ste. Anne de Bellevue, Canada

Abstract:	Hydrological models are widely applied in flood forecasting, water resource management, and other environmental sciences. Most hydrological models calibrate and validate their parameters against available records. The first step of hydrological simulation, however, is always to split those records, quantitatively and objectively, into samples for calibration and validation. This paper proposes a framework to address this issue by combining a trial-and-error hierarchical scheme for the systematic testing of hydrological models with hypothesis testing of the statistical significance of goodness-of-fit indices. That is, the framework evaluates the performance of a hydrological model under a given sample split for calibration and validation, and assesses the statistical significance of the Nash–Sutcliffe efficiency index (E_f), which is commonly used to assess the performance of hydrological models. A sample splitting scheme is judged acceptable if the E_f values exceed the threshold obtained from hypothesis testing. In line with the requirements of the hierarchical scheme for systematic testing of hydrological models, cross calibration and validation help to increase the reliability of the splitting scheme and reduce the effective range of sample sizes for both calibration and validation. The threshold of E_f is shown to depend on the significance level, the evaluation criteria (both regarded as the population), the distribution type, and the sample size. The performance rating of E_f depends largely on the evaluation criteria. Three types of distribution, based on an approximately standard normal distribution, a chi-square distribution, and a bootstrap method, are used to investigate their effects on the thresholds at two commonly used significance levels. The bootstrap method gives the highest threshold, the approximately standard normal distribution the middle one, and the chi-square distribution the lowest. The smaller the sample size, the higher the threshold values. Sample splitting improved when more records were provided. In addition, outliers with a large bias between simulation and observation can affect the sample values of E_f, and hence the outcome of the sample splitting scheme. Physical hydrological processes and the purpose of the model should be carefully considered when assessing outliers. The proposed framework cannot guarantee the best splitting scheme, but the results show the necessary conditions, from a statistical point of view, that splitting schemes must satisfy to calibrate and validate hydrological models.
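
To make the idea of testing E_f against a threshold concrete, the following is a minimal sketch, not the authors' procedure. It computes the Nash–Sutcliffe efficiency in its standard form, E_f = 1 - Σ(obs - sim)² / Σ(obs - mean(obs))², and uses a simple pairs bootstrap to ask whether E_f on a validation sample is significantly above an evaluation criterion. The function names, the default criterion of 0.5, the significance level of 0.05, and the assumption that (observation, simulation) pairs can be resampled independently (which ignores serial correlation in flow records) are all illustrative assumptions.

```python
import numpy as np


def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    denom = np.sum((obs - obs.mean()) ** 2)
    if denom == 0:
        raise ValueError("observations are constant; E_f is undefined")
    return 1.0 - np.sum((obs - sim) ** 2) / denom


def bootstrap_ef(obs, sim, n_boot=2000, seed=0):
    """Bootstrap distribution of E_f from resampled (obs, sim) pairs.

    Assumption: pairs are treated as independent draws, which is a
    simplification for autocorrelated hydrological series.
    """
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    nash_sutcliffe(obs, sim)  # fails early if obs is constant
    rng = np.random.default_rng(seed)
    n = obs.size
    ef = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        # Redraw the rare resample in which every observation is identical.
        while np.ptp(obs[idx]) == 0:
            idx = rng.integers(0, n, size=n)
        ef[b] = nash_sutcliffe(obs[idx], sim[idx])
    return ef


def split_is_acceptable(obs, sim, criterion=0.5, alpha=0.05, n_boot=2000, seed=0):
    """One-sided check: is the lower alpha-quantile of bootstrap E_f above the criterion?

    `criterion` and `alpha` are hypothetical defaults, not values from the paper.
    """
    ef = bootstrap_ef(obs, sim, n_boot=n_boot, seed=seed)
    lower = float(np.quantile(ef, alpha))
    return bool(lower > criterion), lower


if __name__ == "__main__":
    # Synthetic stand-in for a validation sample; real use would pass
    # observed and simulated discharge for the validation period.
    rng = np.random.default_rng(42)
    observed = rng.gamma(shape=2.0, scale=5.0, size=120)
    simulated = observed + rng.normal(0.0, 2.0, size=120)
    ok, lower_bound = split_is_acceptable(observed, simulated)
    print(f"E_f = {nash_sutcliffe(observed, simulated):.3f}, "
          f"lower bound = {lower_bound:.3f}, acceptable = {ok}")
```

Applying the same check to both the calibration and the validation sample, and then swapping their roles, would mirror the cross calibration and validation described above. Shrinking the sample widens the bootstrap distribution, so the sample E_f has to sit further above the criterion to pass, which is consistent with the reported rise in thresholds for smaller samples.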

Keywords:
This article is indexed in SpringerLink and other databases.
