Similar Literature
20 similar documents found (search time: 22 ms)
1.
Kernel density estimators are useful building blocks for empirical statistical modeling of precipitation and other hydroclimatic variables. Data-driven estimates of the marginal probability density function of these variables (which may have discrete or continuous arguments) provide a useful basis for Monte Carlo resampling and are also useful for posing and testing hypotheses (e.g. bimodality) as to the frequency distributions of the variable. In this paper, some issues related to the selection and design of univariate kernel density estimators are reviewed. Some strategies for bandwidth and kernel selection are discussed in an applied context and recommendations for parameter selection are offered. This paper complements the nonparametric wet/dry spell resampling methodology presented in Lall et al. (1996).
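As an illustration of the kind of univariate estimator reviewed above, here is a minimal Gaussian-kernel density estimator with Silverman's reference bandwidth; the synthetic gamma-distributed sample is an illustrative assumption, not data from the paper.

```python
import numpy as np

def gaussian_kde(data, x, bandwidth=None):
    """Univariate Gaussian kernel density estimate evaluated at points x."""
    data = np.asarray(data, dtype=float)
    n = data.size
    if bandwidth is None:
        # Silverman's reference rule (assumes roughly Gaussian data)
        bandwidth = 1.06 * data.std(ddof=1) * n ** (-1 / 5)
    u = (x[:, None] - data[None, :]) / bandwidth
    kernel = np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)
    return kernel.sum(axis=1) / (n * bandwidth)

# Example: density of a synthetic, skewed "precipitation-like" sample
rng = np.random.default_rng(0)
sample = rng.gamma(shape=2.0, scale=5.0, size=200)
grid = np.linspace(0, sample.max(), 100)
density = gaussian_kde(sample, grid)
```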

2.
A new approach for streamflow simulation using nonparametric methods was described in a recent publication (Sharma et al. 1997). Nonparametric methods have the advantage that they avoid the issue of selecting a probability distribution and can represent nonlinear features, such as asymmetry and bimodality, that hitherto were difficult to represent in the probability structure of hydrologic variables such as streamflow and precipitation. The nonparametric method used was kernel density estimation, which requires the selection of bandwidth (smoothing) parameters. This study documents some of the tests that were conducted to evaluate the performance of bandwidth estimation methods for kernel density estimation. Issues related to the selection of optimal smoothing parameters for kernel density estimation with small samples (200 or fewer data points) are examined. Both reference to a Gaussian density and data-based specifications are applied to estimate bandwidths for samples from bivariate normal mixture densities. The three data-based methods studied are Maximum Likelihood Cross Validation (MLCV), Least Squares Cross Validation (LSCV) and Biased Cross Validation (BCV2). Modifications for estimating optimal local bandwidths using MLCV and LSCV are also examined. We found that the use of local bandwidths does not necessarily improve the density estimate with small samples. Of the global bandwidth estimators compared, we found that MLCV and LSCV are better because they show lower variability and higher accuracy, while Biased Cross Validation suffers from multiple optimal bandwidths for samples from strongly bimodal densities. These results, of particular interest in stochastic hydrology where small samples are common, may have importance in other applications of nonparametric density estimation methods with similar sample sizes and distribution shapes. Received: November 12, 1997
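A minimal sketch of one of the global bandwidth estimators compared above, Maximum Likelihood Cross Validation: choose the bandwidth that maximizes the leave-one-out log-likelihood of the kernel estimate. The paper works with bivariate mixtures; the univariate case, the synthetic sample, and the candidate-bandwidth grid here are illustrative simplifications.

```python
import numpy as np

def mlcv_score(data, h):
    """Leave-one-out log-likelihood of a Gaussian KDE with bandwidth h."""
    n = data.size
    u = (data[:, None] - data[None, :]) / h
    k = np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)
    np.fill_diagonal(k, 0.0)          # leave-one-out: drop self-contribution
    f_loo = k.sum(axis=1) / ((n - 1) * h)
    return np.log(f_loo).sum()

rng = np.random.default_rng(1)
data = rng.normal(size=100)            # small sample, as in the study
candidates = np.linspace(0.05, 1.0, 50)   # illustrative bandwidth grid
h_mlcv = candidates[np.argmax([mlcv_score(data, h) for h in candidates])]
```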

4.
Mutual information is a generalised measure of dependence between any two variables. It can be used to quantify non-linear as well as linear dependence between any two variables. This makes mutual information an attractive alternative to the use of the correlation coefficient, which can only quantify the linear dependence pattern. Mutual information is especially suited for application to hydrological problems, because the dependence between any two hydrologic variables is seldom linear in nature. Calculation of the mutual information score involves estimation of the marginal and joint probability density functions of the two variables. This paper uses nonparametric kernel density estimation methods to estimate the probability density functions. Accurate estimation of the mutual information score using kernel methods requires selection of appropriate smoothing parameters (bandwidths) for use with the kernels. The aim of this paper is to obtain a practical method for bandwidth selection for calculation of the mutual information score. In this paper, the lag-one dependence structures of several autocorrelated time series are analysed using mutual information (note that this produces the lag-one auto-MI score, the analog of the lag-one autocorrelation). Empirical trials are used to select appropriate bandwidths for a range of underlying autoregressive and autoregressive-moving average models with normal or near-normal parent distributions. Expressions for reasonable bandwidth choices under these conditions are proposed.
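A sketch of the lag-one auto-MI computation described above, using Gaussian kernel density estimates of the marginal and joint densities. A single shared bandwidth h, the AR(1) test series, and the bandwidth value are illustrative assumptions; the paper's empirical bandwidth expressions are not reproduced here.

```python
import numpy as np

def mi_lag1(series, h):
    """Lag-one auto-mutual-information via Gaussian kernel density estimates."""
    x, y = series[:-1], series[1:]

    def kde_1d(pts, data):
        u = (pts[:, None] - data[None, :]) / h
        return (np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)).mean(axis=1) / h

    def kde_2d(px, py, dx, dy):
        ux = (px[:, None] - dx[None, :]) / h
        uy = (py[:, None] - dy[None, :]) / h
        return (np.exp(-0.5 * (ux**2 + uy**2)) / (2 * np.pi)).mean(axis=1) / h**2

    fx, fy = kde_1d(x, x), kde_1d(y, y)
    fxy = kde_2d(x, y, x, y)
    # Sample-average estimate of E[log(f_xy / (f_x * f_y))]
    return np.mean(np.log(fxy / (fx * fy)))

rng = np.random.default_rng(2)
z = rng.normal(size=500)
ar1 = np.empty_like(z)
ar1[0] = z[0]
for t in range(1, z.size):          # AR(1) series with phi = 0.6
    ar1[t] = 0.6 * ar1[t - 1] + z[t]
print(mi_lag1(ar1, h=0.3))          # illustrative bandwidth choice
```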

5.
A nonparametric resampling technique for generating daily weather variables at a site is presented. The method samples the original data with replacement while smoothing the empirical conditional distribution function. The technique can be thought of as a smoothed conditional Bootstrap and is equivalent to simulation from a kernel density estimate of the multivariate conditional probability density function. This improves on the classical Bootstrap technique by generating values that have not occurred exactly in the original sample and by alleviating the reproduction of fine spurious details in the data. Precipitation is generated from the nonparametric wet/dry spell model as described in Lall et al. [1995]. A vector of other variables (solar radiation, maximum temperature, minimum temperature, average dew point temperature, and average wind speed) is then simulated by conditioning on the vector of these variables on the preceding day and the precipitation amount on the day of interest. An application of the resampling scheme with 30 years of daily weather data at Salt Lake City, Utah, USA, is provided.
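The core of the smoothed Bootstrap can be sketched as resampling with replacement plus a kernel perturbation. This unconditional toy version omits the conditioning on the previous day's vector; the variables, bandwidth, and synthetic data are illustrative assumptions.

```python
import numpy as np

def smoothed_bootstrap(data, n_samples, h, rng):
    """Smoothed bootstrap: resample rows with replacement, then perturb each
    draw with Gaussian kernel noise of bandwidth h, so generated values need
    not coincide exactly with observations."""
    idx = rng.integers(0, data.shape[0], size=n_samples)
    return data[idx] + h * rng.standard_normal((n_samples,) + data.shape[1:])

rng = np.random.default_rng(3)
# Synthetic daily (tmax, tmin) record; purely illustrative
daily = rng.normal(loc=[15.0, 5.0], scale=[3.0, 2.0], size=(365, 2))
synthetic = smoothed_bootstrap(daily, n_samples=365, h=0.5, rng=rng)
```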

6.
Wensheng Wang, Jing Ding. Hydrological Processes, 2007, 21(13): 1764-1771
A p-order multivariate kernel density model based on kernel density theory has been developed for the synthetic generation of multivariate variables. It is a data-driven approach that avoids prior assumptions about the form of the probability distribution (normal or Pearson III) and the form of dependence (linear or non-linear). The p-order multivariate kernel density model is a non-parametric method for the synthesis of streamflow, and is more flexible than the conventional parametric models used in stochastic hydrology. The effectiveness of this model is illustrated through its application to the simultaneous synthetic generation of daily streamflow from Pingshan station and the Yibin-Pingshan region (Yi-Ping region) of the Jinsha River in China. Copyright © 2007 John Wiley & Sons, Ltd.
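One plausible reading of such a p-order conditional kernel scheme, sketched as kernel-weighted resampling of historical successors given the last p values. This is not the authors' exact formulation; the state distance, bandwidth, perturbation, and synthetic record are all illustrative assumptions.

```python
import numpy as np

def kernel_markov_simulate(flow, p, n_steps, h, rng):
    """Simplified p-order conditional kernel simulation: weight historical
    successors by a Gaussian kernel on the distance between the current
    p-length state and each historical p-length state, then resample."""
    states = np.array([flow[i:i + p] for i in range(len(flow) - p)])
    succ = flow[p:]
    out = list(flow[:p])
    for _ in range(n_steps):
        d = np.linalg.norm(states - np.array(out[-p:]), axis=1)
        w = np.exp(-0.5 * (d / h) ** 2)
        w /= w.sum()
        j = rng.choice(len(succ), p=w)
        out.append(succ[j] + h * rng.standard_normal())  # kernel perturbation
    return np.array(out)

rng = np.random.default_rng(4)
hist = rng.gamma(2.0, 50.0, size=1000)               # synthetic daily flows
sim = kernel_markov_simulate(hist, p=2, n_steps=365, h=10.0, rng=rng)
```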

7.
Inverse problems involving the characterization of hydraulic properties of groundwater flow systems by conditioning on observations of the state variables are mathematically ill-posed because they have multiple solutions and are sensitive to small changes in the data. In the framework of MCMC methods for nonlinear optimization, and under an iterative spatial resampling transition kernel, we present an algorithm for narrowing the prior and thus producing improved proposal realizations. To achieve this goal, we cosimulate the facies distribution conditionally to facies observations and normal-scores-transformed hydrologic response measurements, assuming a linear coregionalization model. The approach works by creating an importance sampling effect that steers the process to selected areas of the prior. The effectiveness of our approach is demonstrated by an example application on a synthetic underdetermined inverse problem in aquifer characterization.

8.
Hydrologic risk analysis for dam safety relies on a series of probabilistic analyses of rainfall-runoff and flow routing models, and their associated inputs. This is a complex problem in that the probability distributions of multiple independent and derived random variables need to be estimated in order to evaluate the probability of dam overtopping. Typically, parametric density estimation methods have been applied in this setting, and exhaustive Monte Carlo simulation (MCS) of the models is used to derive some of the distributions. Often, the distributions used to model some of the random variables are inappropriate relative to the expected behaviour of these variables, and as a result, simulations of the system can lead to unrealistic values of extreme rainfall or water surface levels and hence of the probability of dam overtopping. In this paper, three major innovations are introduced to address this situation. The first is the use of nonparametric probability density estimation methods for selected variables, the second is the use of Latin Hypercube sampling to improve the efficiency of MCS driven by the multiple random variables, and the third is the use of Bootstrap resampling to determine the initial water surface level. An application to the Soyang Dam in South Korea illustrates how the traditional parametric approach can lead to potentially unrealistic estimates of dam safety, while the proposed approach provides rather reasonable estimates and an assessment of their sensitivity to key parameters.
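A minimal sketch of the Latin Hypercube sampling idea used above to improve MCS efficiency: stratify each input dimension into equal-probability bins, draw once per bin, and shuffle the strata independently per dimension. The Gumbel rainfall marginal mapped onto the first column is an illustrative assumption.

```python
import numpy as np
from scipy import stats

def latin_hypercube(n_samples, n_vars, rng):
    """Latin hypercube sample on [0, 1]^n_vars: one draw per stratum in each
    dimension, with strata shuffled independently across dimensions."""
    u = (rng.random((n_samples, n_vars)) + np.arange(n_samples)[:, None]) / n_samples
    for j in range(n_vars):
        rng.shuffle(u[:, j])
    return u

rng = np.random.default_rng(5)
lhs = latin_hypercube(1000, 3, rng)
# Map uniform strata onto the physical inputs via inverse CDFs (illustrative)
rainfall = stats.gumbel_r(loc=100, scale=30).ppf(lhs[:, 0])
```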

9.
Spatial prediction and variable selection for the study area are both important issues in geostatistics. If spatially varying means exist among different subareas, globally fitting a spatial regression model for observations over the study area may not be suitable. To alleviate deviations from spatial model assumptions, this paper proposes a methodology to locally select variables for each subarea based on a locally empirical conditional Akaike information criterion. In this situation, the global spatial dependence of observations is considered and the local characteristics of each subarea are also identified. The result is a composite spatial predictor which provides a more accurate spatial prediction for the response variables of interest in terms of the mean squared prediction errors. Further, the corresponding prediction variance is also evaluated based on a resampling method. Statistical inferences of the proposed methodology are justified both theoretically and numerically. Finally, a mercury data set for lakes in Maine, USA, is analyzed for illustration.

10.
Critical for an efficient and effective exploitation of a slate mine is to obtain information on its technical quality, in other words, on the exploitability potential of the deposit. We applied support vector machines (SVM) and LS-Boosting to the assessment of the technical quality of a new unexploited area of a mine, and compared the results to those obtained for kriging and neural networks. Firstly we analyzed the relationship between kriging and semi-parametric SVM in a regularization framework and explored the different alternatives for training these networks. Subsequently, in an attempt to combine both radial and projection structures, we formulated a boosting technique for radial basis function (RBF) networks defined over projections in the input space (RBFPP). The application of these techniques to our test drilling data demonstrated a similar level of performance for all the estimators examined, with the main difference occurring in the shape of the respective deposit reconstructions. Therefore, in choosing between the different techniques, an essential aspect will be their ability to reproduce the morphological characteristics of the true process. In this paper we also evaluate the benefits of using the estimated covariogram as the kernel of the SVMs and compare the sparsity of the different solutions. The results obtained show that the selection of a standard kernel that ignores the variability structure of the problem produces poorer results than when the estimated covariogram is used as the kernel. The research of J. Taboada was supported by the European Union, FEDER program, Project 1FD97–0091. The research of W. González-Manteiga was supported by Ministerio de Ciencia y Tecnología of the Spanish Government, Project BFM2002–03213. The authors wish to thank the associate editor and an anonymous referee for stimulating comments.
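A sketch of the covariogram-as-kernel idea using scikit-learn's support for callable SVR kernels. The exponential covariogram form, its sill and range parameters, and the synthetic drill-hole data are illustrative assumptions; in practice the covariogram would be estimated from the data.

```python
import numpy as np
from sklearn.svm import SVR

def covariogram_kernel(X, Y, sill=1.0, corr_range=500.0):
    """Gram matrix from an exponential covariogram C(d) = sill * exp(-d / range).
    sill and corr_range stand in for values fitted from an empirical variogram."""
    d = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)
    return sill * np.exp(-d / corr_range)

rng = np.random.default_rng(6)
coords = rng.uniform(0, 1000, size=(200, 2))     # synthetic drill-hole locations
quality = np.sin(coords[:, 0] / 200) + 0.1 * rng.standard_normal(200)

svr = SVR(kernel=covariogram_kernel, C=10.0)     # covariogram replaces a standard kernel
svr.fit(coords, quality)
pred = svr.predict(rng.uniform(0, 1000, size=(50, 2)))
```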

11.
Streamflow prediction is useful for robust water resources engineering and management. This paper introduces a new methodology to generate more effective features for streamflow prediction based on the concept of “interaction effect”. The new features (input variables) are derived from the original features in a process called feature generation. It is necessary to select the most efficient input variables for the modelling process. Two feature selection methods, the least absolute shrinkage and selection operator (LASSO) and particle swarm optimization-artificial neural networks (PSO-ANN), are used to select the effective features. Principal components analysis (PCA) is used to reduce the dimensions of the selected features. Then, optimized support vector regression (SVR) is used for monthly streamflow prediction at the Karaj River in Iran. The proposed method provided accurate prediction results with a root mean square error (RMSE) of 2.79 m³/s and a determination coefficient (R²) of 0.92.
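A hedged sketch of the feature-generation / selection / reduction / regression chain described above, using scikit-learn. LASSO stands in for the selection step (the PSO-ANN selector is not reproduced), and all data and parameter values are illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(7)
X = rng.standard_normal((240, 6))            # e.g. 20 years of monthly predictors
y = X[:, 0] * X[:, 1] + X[:, 2] + 0.1 * rng.standard_normal(240)

# Feature generation: pairwise interaction terms from the original features
gen = PolynomialFeatures(degree=2, interaction_only=True, include_bias=False)
X_int = gen.fit_transform(X)

# Feature selection: keep features with nonzero LASSO coefficients
lasso = Lasso(alpha=0.05).fit(X_int, y)
selected = X_int[:, lasso.coef_ != 0]

# Dimension reduction (keep 95% variance) followed by SVR prediction
model = make_pipeline(StandardScaler(), PCA(n_components=0.95), SVR(C=10.0))
model.fit(selected, y)
```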

12.
The process of remotely sensed data acquisition is affected by factors such as the rotation of the earth, the finite scan rate of some sensors, the curvature of the earth, non-ideal sensors, and variation in platform altitude, attitude, velocity, etc. [1]. One important procedure which should be done prior to analyzing remotely sensed data is geometric correction (image to map) or registration (image to image) of the remotely sensed data. The purpose of geometric correction or registration is to e…

13.
There are two basic approaches for estimating flood quantiles: a parametric and a nonparametric method. In this study, comparisons of parametric and nonparametric models for annual maximum flood data of the Goan gauging station in Korea were performed based on Monte Carlo simulation. In order to consider uncertainties that can arise from model and data errors, kernel density estimation for fitting the sampling distributions was chosen to determine safety factors (SFs) that depend on the probability model used to fit the real data. The relative biases of the Sheather and Jones plug-in (SJ) are the smallest in most cases among the seven bandwidth selectors applied. The relative root mean square errors (RRMSEs) of the Gumbel (GUM) are smaller than those of any other models regardless of the parent models considered. When the Weibull-2 is assumed as a parent model, the RRMSEs of kernel density estimation are relatively small, while they are much bigger than those of parametric methods for other parent models. However, the RRMSEs of kernel density estimation within the interpolation range are much smaller than those for the extrapolation range in comparison with those of parametric methods. Among the applied distributions, the GUM model has the smallest SFs for all parent models, and the general extreme value model has the largest values for all parent models considered.
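A minimal comparison of the two quantile routes discussed above: a parametric Gumbel (GUM) fit versus a kernel density estimate. SciPy's Gaussian KDE with its default bandwidth stands in for the seven selectors tested in the paper, and the synthetic 40-year record is an illustrative assumption.

```python
import numpy as np
from scipy import stats

annual_max = stats.gumbel_r(loc=500, scale=150).rvs(size=40, random_state=8)

# Parametric: fit a Gumbel distribution and read off the 100-year quantile
loc, scale = stats.gumbel_r.fit(annual_max)
q100_param = stats.gumbel_r(loc, scale).ppf(1 - 1 / 100)

# Nonparametric: quantile of a large sample drawn from the kernel estimate
kde = stats.gaussian_kde(annual_max)          # Scott's rule bandwidth by default
q100_kde = np.quantile(kde.resample(100_000).ravel(), 1 - 1 / 100)
```

Consistent with the abstract's finding, a KDE quantile is most trustworthy within the interpolation range of the sample; far-tail extrapolation tends to favour the parametric fit.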

14.
Stochastic models have been widely used for simulation studies. However, they have had difficulty reproducing the skewness of observed series, which motivated skewness-preserving variants. Whereas skewness preservation has usually been addressed through the skewness of the residuals of the stochastic model, this study uses a random resampling technique on the residuals of the stochastic models for the simulation study and for the investigation of the skewness coefficient. The main advantage of this resampling scheme, called the bootstrap method, is that it does not rely on an assumption about the population distribution; this study uses a combined stochastic and bootstrapped model. The stochastic and bootstrapped stochastic (or combined) models are used to investigate skewness preservation and the reproduction of the probability density function of the simulated series. The models are applied to the annual and monthly streamflows of the Yongdam site in Korea and the Yakima River, Washington, USA, and the statistics and probability density functions of the observed and simulated streamflows are compared. The bootstrapped stochastic model reproduces the skewness and probability density function much better than the stochastic model. This evidence suggests that the bootstrapped stochastic model might be more appropriate than the stochastic model for preserving skewness and for simulation purposes.
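A sketch of the combined (bootstrapped stochastic) idea: fit a simple stochastic model, then drive the simulation by resampling the empirical residuals with replacement instead of drawing from an assumed noise distribution. The AR(1) form and the synthetic skewed record are illustrative assumptions; the abstract does not specify the paper's exact models.

```python
import numpy as np

def ar1_bootstrap_sim(obs, n_steps, rng):
    """Fit an AR(1) model, then simulate by bootstrapping the empirical
    residuals, so residual skewness carries into the synthetic series."""
    x = obs - obs.mean()
    phi = (x[1:] @ x[:-1]) / (x[:-1] @ x[:-1])     # lag-1 AR coefficient
    resid = x[1:] - phi * x[:-1]                   # empirical residuals
    sim = np.empty(n_steps)
    sim[0] = x[0]
    for t in range(1, n_steps):
        sim[t] = phi * sim[t - 1] + rng.choice(resid)
    return sim + obs.mean()

rng = np.random.default_rng(9)
flows = rng.gamma(2.0, 100.0, size=500)            # skewed synthetic streamflow
synthetic = ar1_bootstrap_sim(flows, n_steps=500, rng=rng)
```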

15.
The conventional method of probabilistic seismic hazard analysis (PSHA) using the Cornell–McGuire approach requires identification of homogeneous source zones as the first step. This criterion brings along many issues and, hence, several alternative methods of hazard estimation have emerged in the last few years, such as zoneless or zone-free methods and modelling of the earth's crust using numerical methods with finite element analysis. Delineating a homogeneous source zone in regions of distributed and/or diffused seismicity is a rather difficult task. In this study, the zone-free method using the adaptive kernel technique for hazard estimation is explored for regions having distributed and diffused seismicity. Chennai city lies in such a region of low to moderate seismicity, so it has been used as a case study. The adaptive kernel technique is statistically superior to the fixed kernel technique primarily because the bandwidth of the kernel is varied spatially depending on the clustering or sparseness of the epicentres. Although the fixed kernel technique has proven to work well in general density estimation cases, it fails to perform in the case of multimodal and long-tail distributions. In such situations, the adaptive kernel technique serves the purpose and is more relevant in earthquake engineering, as the activity rate probability density surface is multimodal in nature. The peak ground acceleration (PGA) obtained from all three approaches (i.e., the Cornell–McGuire approach and the fixed and adaptive kernel techniques) for 10% probability of exceedance in 50 years is around 0.087 g. Uniform hazard spectra (UHS) are also provided for different structural periods.
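A one-dimensional sketch of the adaptive (Abramson-type) kernel idea described above, in which local bandwidths shrink where epicentres cluster and widen where they are sparse. The actual epicentre analysis is two-dimensional; the pilot-estimate construction and the synthetic bimodal data are illustrative assumptions.

```python
import numpy as np

def adaptive_kde(data, x, h_global):
    """Abramson-style adaptive KDE: local bandwidth h_i = h * sqrt(g / f_pilot(X_i)),
    where g is the geometric mean of the pilot density over the data."""
    def fixed_kde(pts, bw):
        u = (pts[:, None] - data[None, :]) / bw
        return (np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi) / bw).mean(axis=1)

    pilot = fixed_kde(data, h_global)              # pilot estimate at the data
    g = np.exp(np.mean(np.log(pilot)))             # geometric-mean normaliser
    h_local = h_global * np.sqrt(g / pilot)        # per-point bandwidths
    u = (x[:, None] - data[None, :]) / h_local[None, :]
    k = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi) / h_local[None, :]
    return k.mean(axis=1)

rng = np.random.default_rng(10)
events = np.concatenate([rng.normal(0, 0.5, 300), rng.normal(5, 2.0, 100)])
grid = np.linspace(-3, 12, 200)
density = adaptive_kde(events, grid, h_global=0.5)
```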

16.
General circulation models (GCMs), the climate models often used in assessing the impact of climate change, operate on a coarse scale, and thus the simulation results obtained from GCMs are not particularly useful at the comparatively smaller scale of river basin hydrology. The article presents a methodology for statistical downscaling based on sparse Bayesian learning and the Relevance Vector Machine (RVM) to model streamflow at the river basin scale for the monsoon period (June, July, August, September) using GCM-simulated climatic variables. NCEP/NCAR reanalysis data have been used to train the model to establish a statistical relationship between streamflow and climatic variables. The relationship thus obtained is used to project the future streamflow from GCM simulations. The statistical methodology involves principal component analysis, fuzzy clustering and RVM. Different kernel functions are used for comparison purposes. The model is applied to the Mahanadi river basin in India. The results obtained using RVM are compared with those of the state-of-the-art Support Vector Machine (SVM) to present the advantages of RVMs over SVMs. A decreasing trend is observed for the monsoon streamflow of the Mahanadi due to high surface warming in the future, with the CCSR/NIES GCM and B2 scenario.

17.
18.
Stochastic weather generators have evolved as tools for creating long time series of synthetic meteorological data at a site for risk assessments in hydrologic and agricultural applications. Recently, their use has been extended as downscaling tools for climate change impact assessments. Non‐parametric weather generators, which typically use a K‐nearest neighbour (K‐NN) resampling approach, require no statistical assumptions about probability distributions of variables and can be easily applied for multi‐site use. Two characteristics of traditional K‐NN models result from resampling daily values: (1) the temporal correlation structure of daily temperatures may be lost, and (2) no values outside the range of historical observations can be simulated. Temporal correlation in simulated temperature data is important for hydrologic applications. Temperature is a major driver of many processes within the hydrologic cycle (for example, evaporation, snow melt, etc.) that may affect flood levels. As such, a new methodology for simulation of climate data using the K‐NN approach is presented (named KnnCAD Version 4). A block resampling scheme is introduced along with perturbation of the reshuffled daily temperature data to create 675 years of synthetic historical daily temperatures for the Upper Thames River basin in Ontario, Canada. The updated KnnCAD model is shown to adequately reproduce observed monthly temperature characteristics as well as temporal and spatial correlations while simulating reasonable values which can exceed the range of observations. Copyright © 2012 John Wiley & Sons, Ltd.
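A toy sketch of K-NN block resampling with perturbation in the spirit of the model described above (not the KnnCAD Version 4 algorithm itself): each block is chosen from the k nearest historical analogues with inverse-rank weights, then perturbed so simulated values can fall outside the observed range. Block length, k, the noise scale, and the synthetic temperature series are illustrative assumptions.

```python
import numpy as np

def knn_block_resample(temps, block_len, n_blocks, k, rng):
    """K-NN block resampling with perturbation: pick the next block from the
    k historical blocks whose starting temperature is closest to the current
    value, then add small kernel noise to the resampled block."""
    starts = np.arange(0, temps.size - block_len, block_len)
    out = [temps[:block_len].copy()]
    for _ in range(n_blocks - 1):
        last = out[-1][-1]
        d = np.abs(temps[starts] - last)
        nearest = starts[np.argsort(d)[:k]]          # k closest analogues
        w = 1.0 / np.arange(1, k + 1)                # inverse-rank weights
        s = rng.choice(nearest, p=w / w.sum())
        out.append(temps[s:s + block_len] + 0.3 * rng.standard_normal(block_len))
    return np.concatenate(out)

rng = np.random.default_rng(11)
daily_t = 10 + 12 * np.sin(2 * np.pi * np.arange(3650) / 365) \
          + 2 * rng.standard_normal(3650)            # 10 synthetic years
sim = knn_block_resample(daily_t, block_len=5, n_blocks=146, k=10, rng=rng)
```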

19.
In the introductory part of the paper the importance of the topic for gravity field studies is outlined. Some concepts and tools often used for the representation of the solution of the respective boundary-value problems are mentioned. Subsequently a weak formulation of Neumann's problem is considered with emphasis on a particular choice of function basis generated by the reproducing kernel of the respective Hilbert space of functions. The paper then focuses on the construction of the reproducing kernel for the solution domain given by the exterior of an oblate ellipsoid of revolution. First its exact structure is derived by means of the apparatus of ellipsoidal harmonics. In this case the structure of the kernel, like that of the entries of Galerkin's matrix, becomes rather complex. Therefore, an approximation of ellipsoidal harmonics (limit layer approach) is used, based on an approximation version of Legendre's ordinary differential equation resulting from the method of separation of variables in solving Laplace's equation. The kernel thus obtained shows features similar to those of the reproducing kernel in the spherical case, i.e. for the solution domain represented by the exterior of a sphere. A numerical implementation of the exact structure of the reproducing kernel is mentioned as a driving impulse of running investigations.

20.
Earthquakes are one of the most destructive natural disasters, and the spatial distribution of their epicentres generally shows diverse interaction structures at different spatial scales. In this paper, we use a multi-scale point pattern model to describe the main seismicity in the Hellenic area over the last 10 years. We analyze the interaction between events and the relationship with geological information of the study area, using hybrid models as proposed by Baddeley et al. (2013). In our analysis, we find two competing suitable hybrid models, one with a full parametric structure and the other based on nonparametric kernel estimators for the spatial inhomogeneity.
