Similar Literature
1.
Traditionally, one form of preprocessing in multivariate calibration methods such as principal component regression and partial least squares is mean centering the independent variables (responses) and the dependent variables (concentrations). However, upon examination of the statistical issue of error propagation in multivariate calibration, it was found that mean centering is not advised for some data structures. In this paper it is shown that for response data which (i) vary linearly with concentration, (ii) have no baseline (a baseline being present when a component with a non-zero response does not change in concentration) and (iii) have no closure in the concentrations (closure meaning that for each sample the concentrations of all components add to a constant, e.g. 100%), it is better not to mean center the calibration data. That is, the prediction errors as evaluated by a root mean square error statistic will be smaller for a model made with the raw data than for a model made with mean-centered data. With simulated data, relative improvements ranging from 1% to 13% were observed, depending on the amount of error in the calibration concentrations and responses.
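A minimal numerical sketch of the kind of comparison described above (illustrative only, not the cited study's code): simulate linear, baseline-free responses with no closure in the concentrations, add noise to the calibration responses and concentrations, and compare the root mean square prediction error of inverse least-squares models built on raw versus mean-centered data. All dimensions and noise levels are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n_cal, n_test, n_wl, n_comp = 30, 200, 20, 3
K = rng.uniform(0.2, 1.0, size=(n_comp, n_wl))        # pure-component responses (no baseline)
C_cal = rng.uniform(0.0, 1.0, size=(n_cal, n_comp))   # calibration concentrations (no closure)
C_test = rng.uniform(0.0, 1.0, size=(n_test, n_comp))

# Noisy calibration data; noise-free test responses for a clean comparison
R_cal = C_cal @ K + 0.01 * rng.standard_normal((n_cal, n_wl))
C_cal_noisy = C_cal + 0.01 * rng.standard_normal(C_cal.shape)
R_test = C_test @ K

def rmsep(B, r_offset=None, c_offset=None):
    """Root mean square error of prediction for an inverse model C ~ R @ B."""
    R = R_test if r_offset is None else R_test - r_offset
    C_hat = R @ B
    if c_offset is not None:
        C_hat = C_hat + c_offset
    return np.sqrt(np.mean((C_hat - C_test) ** 2))

# Raw-data model: C ~ R B, solved by least squares
B_raw, *_ = np.linalg.lstsq(R_cal, C_cal_noisy, rcond=None)

# Mean-centered model: (C - c_mean) ~ (R - r_mean) B
r_mean, c_mean = R_cal.mean(axis=0), C_cal_noisy.mean(axis=0)
B_mc, *_ = np.linalg.lstsq(R_cal - r_mean, C_cal_noisy - c_mean, rcond=None)

print("RMSEP raw          :", rmsep(B_raw))
print("RMSEP mean-centered:", rmsep(B_mc, r_offset=r_mean, c_offset=c_mean))
```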

2.
This paper utilizes variable step size generalized simulated annealing (VSGSA) to design multicomponent calibration samples for spectroscopic data. VSGSA is an optimization procedure which is capable of converging to exact positions of global optima located on multidimensional continuous functions. On the basis of analysis sample response vectors, optimally designed calibration concentration matrices are obtained assuming knowledge of components present. The complexity of response surfaces established by the optimization criteria is described.

3.
Care is required for multicomponent analysis if misleading results are to be avoided. The problem of ill-conditioned calibration matrices is of primary concern. This type of numerical instability is represented as spectral overlap of the calibration spectra. Depending on the degree of spectral overlap, the sample concentration estimates can be severely affected. A practical statistical procedure is discussed which tests for the presence of spectral overlap among the pure-component spectra and simultaneously assesses the degree to which concentration estimates may be degraded. Guidelines are developed to ascertain how much departure from spectral orthogonality is acceptable.
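A small diagnostic sketch in the spirit of the checks described above (not the paper's actual procedure): measure how spectral overlap between two made-up pure-component spectra inflates the condition number of the calibration matrix and the variance of classical least-squares concentration estimates.

```python
import numpy as np

wl = np.linspace(0, 1, 200)

def gaussian_band(center, width=0.08):
    return np.exp(-0.5 * ((wl - center) / width) ** 2)

def overlap_diagnostics(separation):
    """Condition number and error amplification for two pure spectra a given distance apart."""
    S = np.column_stack([gaussian_band(0.5 - separation / 2),
                         gaussian_band(0.5 + separation / 2)])   # columns = pure-component spectra
    cond = np.linalg.cond(S)
    # Diagonal of (S^T S)^-1: multiplies the measurement variance in the
    # classical least-squares estimate c_hat = (S^T S)^-1 S^T r
    var_amp = np.diag(np.linalg.inv(S.T @ S))
    return cond, var_amp

for sep in (0.5, 0.2, 0.05, 0.01):
    cond, var_amp = overlap_diagnostics(sep)
    print(f"band separation {sep:5.2f}: cond(S) = {cond:10.1f}, "
          f"variance amplification = {var_amp.round(3)}")
```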

4.
New expressions are derived for the standard errors in the eigenvalues of a cross-product matrix by the method of error propagation. Cross-product matrices frequently arise in multivariate data analysis, especially in principal component analysis (PCA). The derived standard errors account for the variability in the data as a result of measurement noise and are therefore essentially different from the standard errors developed in multivariate statistics. Those standard errors were derived in order to account for the finite number of observations on a fixed number of variables, the so-called sampling error, and can be used for making inferences about the population eigenvalues. Making inferences about the population eigenvalues is often not the purpose of PCA in the physical sciences. This is particularly true if the measurements are performed on an analytical instrument that produces two-dimensional arrays for one chemical sample: the rows and columns of such a data matrix cannot be identified with observations on variables at all. However, PCA can still be used as a general data reduction technique, but now the effect of measurement noise on the standard errors in the eigenvalues has to be considered. The consequences for significance testing of the eigenvalues, as well as the usefulness of error estimates for the scores and loadings of PCA, multiple linear regression (MLR) and the generalized rank annihilation method (GRAM), are discussed. The adequacy of the derived expressions is tested by Monte Carlo simulations.
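A Monte Carlo sketch of the setting described above (the paper's analytical error-propagation expressions are not reproduced here): perturb a fixed low-rank data matrix with i.i.d. measurement noise and observe the resulting spread of the eigenvalues of its cross-product matrix. Matrix sizes, rank and noise level are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# A fixed, noise-free rank-2 "instrument" data matrix (e.g. time x wavelength)
n_rows, n_cols = 40, 60
scores = rng.standard_normal((n_rows, 2))
loadings = rng.standard_normal((2, n_cols))
X0 = scores @ loadings

sigma = 0.05        # standard deviation of the measurement noise (assumed)
n_mc = 2000         # Monte Carlo replicates
n_eig = 4

eigs = np.empty((n_mc, n_eig))
for i in range(n_mc):
    X = X0 + sigma * rng.standard_normal(X0.shape)
    # Eigenvalues of the cross-product matrix X^T X, in descending order
    ev = np.linalg.eigvalsh(X.T @ X)[::-1]
    eigs[i] = ev[:n_eig]

print("mean eigenvalues      :", eigs.mean(axis=0).round(3))
print("Monte Carlo std errors:", eigs.std(axis=0, ddof=1).round(3))
# The first two eigenvalues are "structural"; the rest reflect noise only,
# which is what a significance test on the eigenvalues has to account for.
```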

5.
Historical GIS has the potential to re-invigorate our use of statistics from historical censuses and related sources. In particular, areal interpolation can be used to create long-run time-series of spatially detailed data that will enable us to enhance significantly our understanding of geographical change over periods of a century or more. The difficulty with areal interpolation, however, is that the data that it generates are estimates which will inevitably contain some error. This paper describes a technique that allows the automated identification of possible errors at the level of the individual data values.

6.
Magnetic susceptibility variations in the Chinese loess/palaeosol sequences have been used extensively for palaeoclimatic interpretations. The magnetic signal of these sequences must be divided into lithogenic and pedogenic components because the palaeoclimatic record is primarily reflected in the pedogenic component. In this paper we compare two methods for separating the pedogenic and lithogenic components of the magnetic susceptibility signal: the citrate-bicarbonate-dithionite (CBD) extraction procedure, and a mixing analysis. Both methods yield good estimates of the pedogenic component, especially for the palaeosols. The CBD procedure underestimates the lithogenic component and overestimates the pedogenic component. The magnitude of this effect is moderately high in loess layers but almost negligible in palaeosols. The mixing model overestimates the lithogenic component and underestimates the pedogenic component. Both methods can be adjusted to yield better estimates of both components. The lithogenic susceptibility, as determined by either method, suggests that palaeoclimatic interpretations based only on total susceptibility will be in error and that a single estimate of the average lithogenic susceptibility is not an accurate basis for adjusting the total susceptibility. A long-term decline in lithogenic susceptibility with depth in the section suggests more intense or prolonged periods of weathering associated with the formation of the older palaeosols.
The CBD procedure provides the most comprehensive information on the magnitude of the components and magnetic mineralogy of loess and palaeosols. However, the mixing analysis provides a sensitive, rapid, and easily applied alternative to the CBD procedure. A combination of the two approaches provides the most powerful and perhaps the most accurate way of separating the magnetic susceptibility components.
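A toy linear-unmixing sketch along the lines of the mixing analysis mentioned above (not the authors' model; end-member values and sample readings are invented): each sample's low- and high-frequency susceptibility is treated as a mixture of a lithogenic and a pedogenic end-member, and the two contributions are obtained by solving the resulting 2 x 2 system.

```python
import numpy as np

# Assumed end-member properties (illustrative values, not from the paper):
# columns = [lithogenic, pedogenic], rows = [low-frequency chi, high-frequency chi]
E = np.array([[20.0, 150.0],    # chi_lf per unit contribution (1e-8 m^3/kg)
              [19.8, 135.0]])   # chi_hf (pedogenic grains show frequency dependence)

# Measured (chi_lf, chi_hf) for a few loess/palaeosol samples (invented numbers)
samples = np.array([[35.0, 33.9],     # loess-like
                    [120.0, 110.0],   # palaeosol-like
                    [60.0, 56.5]])

for chi in samples:
    # Solve E @ [m_lith, m_ped] = chi for the two end-member contributions
    m_lith, m_ped = np.linalg.solve(E, chi)
    lith_chi = m_lith * E[0, 0]    # lithogenic part of the low-frequency susceptibility
    ped_chi = m_ped * E[0, 1]      # pedogenic part
    print(f"chi_lf={chi[0]:6.1f}: lithogenic part {lith_chi:6.1f}, "
          f"pedogenic part {ped_chi:6.1f}")
```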

7.
Kriging is an optimal method of spatial interpolation that produces an error for each interpolated value. Block kriging is a form of kriging that computes averaged estimates over blocks (areas or volumes) within the interpolation space. If this space is sampled sparsely, and divided into blocks of a constant size, a variable estimation error is obtained for each block, with blocks near to sample points having smaller errors than blocks farther away. An alternative strategy for sparsely sampled spaces is to vary the sizes of blocks in such a way that a block's interpolated value is just sufficiently different from that of an adjacent block given the errors on both blocks. This has the advantage of increasing spatial resolution in many regions, and conversely reducing it in others where maintaining a constant size of block is unjustified (hence achieving data compression). Such a variable subdivision of space can be achieved by regular recursive decomposition using a hierarchical data structure. An implementation of this alternative strategy employing a split-and-merge algorithm operating on a hierarchical data structure is discussed. The technique is illustrated using an oceanographic example involving the interpolation of satellite sea surface temperature data. Consideration is given to the problem of error propagation when combining variable resolution interpolated fields in GIS modelling operations.
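A minimal sketch of the merge criterion described above (a simplification, not the paper's split-and-merge implementation): two adjacent blocks are merged when their kriged estimates are indistinguishable given their kriging standard errors. The multiplier k is an assumed tuning parameter.

```python
import numpy as np

def can_merge(z1, s1, z2, s2, k=2.0):
    """Merge adjacent blocks when the difference in their kriged estimates
    is small relative to the combined kriging standard errors.
    k is an assumed significance multiplier (e.g. ~2 for ~95% confidence)."""
    return abs(z1 - z2) < k * np.sqrt(s1 ** 2 + s2 ** 2)

# Example: sea surface temperature blocks (invented values)
print(can_merge(14.2, 0.4, 14.5, 0.5))   # True  -> coarsen (merge) here
print(can_merge(14.2, 0.1, 15.3, 0.1))   # False -> keep the finer blocks (split)
```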

8.
The increasing use of Geographical Information System applications has generated a strong interest in the assessment of data quality. As an example of quantitative raster data, we analysed errors in Digital Terrain Models (DTM). Errors might be classified as systematic (strongly dependent on the production methodology) and random. The present work attempts to locate some types of randomly distributed, weakly spatially correlated errors by applying a new methodology based on Principal Components Analysis. The Principal Components approach presented is very different from the typical scheme used in image processing. A prototype implementation has been conducted using MATLAB, and the overall procedure has been numerically tested using a Monte Carlo approach. A DTM of Stockholm, with integer-valued heights varying from 0 to 59 m, has been used as a testbed. The model was contaminated by adding randomly located errors, distributed uniformly between -4 m and +4 m. The procedure has been applied using both spike-shaped (isolated) errors and pyramid-like errors. The preliminary results show that for the former, roughly half of the errors have been located with a Type I error probability of 4.6 per cent on average, checking up to 1 per cent of the dataset. The associated Type II error for the larger errors (of exactly -4 m or +4 m) drops from an initial value of 1.21 per cent down to 0.63 per cent. By checking another 1 per cent of the dataset, this error drops to 0.34 per cent, implying that about 71 per cent of the larger errors have been located; the Type I error was below 11.27 per cent. The results for pyramid-like errors are slightly worse, with a Type I error of 25.80 per cent on average for the first 1 per cent effort, and a Type II error drop from an initial value of 0.81 per cent down to 0.65 per cent. The procedure can be applied both for error detection during DTM generation and by end users. It might also be used for other types of quantitative raster data.
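A generic PCA-screening sketch loosely inspired by the idea above (explicitly not the paper's algorithm, which the abstract says differs from standard image-processing schemes): a synthetic smooth DTM is contaminated with uniform spike errors, reconstructed from a few leading principal components, and the cells with the largest reconstruction residuals are flagged for checking.

```python
import numpy as np

rng = np.random.default_rng(8)

# Synthetic smooth "DTM" (rows x cols), heights in metres
rows, cols = 200, 200
y, x = np.mgrid[0:rows, 0:cols]
dtm = 30 + 10 * np.sin(x / 25) + 8 * np.cos(y / 40)

# Contaminate with randomly located spike errors, uniform in [-4, 4] m
n_err = 200
ri = rng.integers(0, rows, n_err)
ci = rng.integers(0, cols, n_err)
dtm_err = dtm.copy()
dtm_err[ri, ci] += rng.uniform(-4, 4, n_err)

# Generic PCA screening: treat columns as variables, reconstruct the surface
# from a few leading components and flag the largest residuals
X = dtm_err - dtm_err.mean(axis=0)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 5                                   # assumed number of "structural" components
recon = (U[:, :k] * s[:k]) @ Vt[:k] + dtm_err.mean(axis=0)
resid = np.abs(dtm_err - recon)

# Check the 1 per cent of cells with the largest residuals
n_check = int(0.01 * dtm_err.size)
flagged = np.argsort(resid.ravel())[-n_check:]
true_err = np.zeros(dtm_err.size, dtype=bool)
true_err[np.ravel_multi_index((ri, ci), dtm_err.shape)] = True
print("flagged cells that are real errors:", true_err[flagged].sum(), "of", n_check)
```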

9.
A modification of a technique proposed by Lorber and Kowalski for the estimation of prediction errors is presented. The method is applied to five data sets. The results show that for some data sets the estimated prediction errors are close to the actual prediction errors for samples within the calibration range, while samples outside the calibration range must be background corrected before quantification of the prediction error.

10.
RECENT DEVELOPMENTS IN MULTIVARIATE CALIBRATION
With the goal of understanding global chemical processes, environmental chemists have some of the most complex sample analysis problems. Multivariate calibration is a tool that can be applied successfully in many situations where traditional univariate analyses cannot. The purpose of this paper is to review multivariate calibration, with an emphasis being placed on the developments in recent years. The inverse and classical models are discussed briefly, with the main emphasis on the biased calibration methods. Principal component regression (PCR) and partial least squares (PLS) are discussed, along with methods for quantitative and qualitative validation of the calibration models. Non-linear PCR, non-linear PLS and locally weighted regression are presented as calibration methods for non-linear data. Finally, calibration techniques using a matrix of data per sample (second-order calibration) are discussed briefly.
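A compact principal component regression (PCR) sketch as a reminder of the inverse-calibration workflow reviewed above (simulated data; not code from the review). The number of components and noise level are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated mixture spectra: R = C @ K + noise
n_samples, n_wl, n_comp = 40, 100, 3
K = np.abs(rng.standard_normal((n_comp, n_wl)))
C = rng.uniform(0, 1, (n_samples, n_comp))
R = C @ K + 0.02 * rng.standard_normal((n_samples, n_wl))

def pcr_fit(R, c, n_pc):
    """Fit an inverse PCR model c ~ (R - r_mean) @ b + c_mean using n_pc components."""
    r_mean, c_mean = R.mean(axis=0), c.mean()
    Rc = R - r_mean
    U, s, Vt = np.linalg.svd(Rc, full_matrices=False)
    T = U[:, :n_pc] * s[:n_pc]                 # scores
    q, *_ = np.linalg.lstsq(T, c - c_mean, rcond=None)
    b = Vt[:n_pc].T @ q                        # regression vector in wavelength space
    return b, r_mean, c_mean

def pcr_predict(R_new, model):
    b, r_mean, c_mean = model
    return (R_new - r_mean) @ b + c_mean

# Calibrate for component 0 and check on the calibration set (RMSEC)
model = pcr_fit(R, C[:, 0], n_pc=3)
rmsec = np.sqrt(np.mean((pcr_predict(R, model) - C[:, 0]) ** 2))
print("RMSEC with 3 PCs:", round(float(rmsec), 4))
```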

11.
Mineral deposit grades are usually estimated using data from samples of rock cores extracted from drill holes. Commonly, mineral deposit grade estimates are required for each block to be mined. Every estimated grade always has a corresponding error when compared against the real grades of blocks. The error depends on various factors, among which the most important is the number of correlated samples used for estimation. Samples may be collected on a regular sampling grid and, as the spacing between samples decreases, the error of the grade estimated from the data generally decreases. Sampling can be expensive. The maximum distance between samples that provides an acceptable error of grade estimate is useful for deciding how many samples are adequate. The error also depends on the geometry of a block, as lower errors would be expected when estimating the grade of large-volume blocks, and on the variability of the data within the region of the blocks. Local variability is measured in this study using the coefficient of variation (CV). We show charts analyzing the error in block grade estimates as a function of the sampling grid (obtained by geostatistical simulation), for various block dimensions (volumes) and for a given CV interval. These charts show results for two different attributes (Au and Ni) of two different deposits. The results show that similar errors were found for the two deposits when they share similar features: sampling grid, block volume, CV, and continuity model. Consequently, the error for other attributes with similar features could be obtained from a single chart.

12.
…calibrated. The discussion in this paper focuses on near-infrared (NIR) spectroscopy as the example instrument. However, the procedures presented are applicable to most methods of instrumental analysis. Essentially, calibration consists of assembling a series of samples containing the analyte or analytes at …

13.
Terrain Elevation Interpolation Based on Kriging
Treating terrain elevation as a regionalized variable, ordinary Kriging is used to interpolate terrain elevation from scattered elevation points, and a dedicated program developed in Matlab carries out the interpolation for the study area and the visual analysis of the results. Taking 200 elevation points within a 10 km² area of Nansha District, Guangzhou, as an example, grid interpolation on a 10 m × 10 m mesh is performed with spherical, exponential and Gaussian theoretical variogram models, and the interpolation results and their accuracy are analysed visually with Matlab, showing that the exponential model performs best.
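A condensed ordinary-kriging sketch in the spirit of the workflow above (the study used Matlab; this Python version with an exponential variogram and invented elevation points is only illustrative, and the variogram parameters are assumed rather than fitted).

```python
import numpy as np

rng = np.random.default_rng(3)

# Scattered elevation points (x, y in metres) with invented elevations z (metres)
pts = rng.uniform(0, 1000, size=(200, 2))
z = 50 + 0.02 * pts[:, 0] + 5 * np.sin(pts[:, 1] / 150) + rng.normal(0, 0.5, 200)

def exp_variogram(h, nugget=0.1, sill=30.0, a=300.0):
    """Exponential theoretical variogram model (parameters assumed, not fitted here)."""
    return nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * h / a))

def ordinary_kriging(pts, z, x0):
    """Ordinary kriging estimate and kriging variance at point x0, using all data."""
    n = len(pts)
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    A = np.empty((n + 1, n + 1))
    A[:n, :n] = exp_variogram(d)
    np.fill_diagonal(A[:n, :n], 0.0)        # gamma(0) = 0 on the diagonal
    A[n, :n] = 1.0                          # unbiasedness constraint
    A[:n, n] = 1.0
    A[n, n] = 0.0
    b = np.empty(n + 1)
    b[:n] = exp_variogram(np.linalg.norm(pts - x0, axis=1))
    b[n] = 1.0
    w = np.linalg.solve(A, b)               # kriging weights plus Lagrange multiplier
    return w[:n] @ z, w @ b                 # estimate, kriging variance

est, var = ordinary_kriging(pts, z, np.array([500.0, 500.0]))
print(f"kriged elevation at (500, 500): {est:.2f} m  (kriging variance {var:.2f} m^2)")
```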

14.
Thin-Plate Smoothing Spline Interpolation and Spatial Simulation of China's Climate
阎洪 (Yan Hong), 《地理科学》 (Scientia Geographica Sinica), 2004, 24(2): 163-169
Long-term mean meteorological data from a network of 720 weather stations were used to fit climate surfaces over three-dimensional geographic space, and these were combined with a 1 km resolution digital elevation model to estimate climate variables on a regular grid by interpolation. The interpolated monthly mean minimum temperature, mean maximum temperature and precipitation make up a basic digital climate space that meets the data-analysis needs of geographic information systems. The error statistics produced during interpolation show that temperature errors are generally below 0.6 °C and precipitation errors lie in the range 8% to 15%, clearly better than other interpolation methods. The spline method uses a linear model to reflect the influence of topography on climate and provides a simple error-diagnosis procedure, making it highly practical.
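A brief sketch of the general idea (not the spline-fitting procedure used in the paper): fit a thin-plate smoothing spline to station values with longitude, latitude and elevation as the three independent variables, then predict at grid points whose elevations would come from a DEM. Station data, the smoothing parameter and the lapse rate are synthetic assumptions.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(4)

# Synthetic "stations": lon (deg), lat (deg), elevation (km) and a January mean temperature
n_sta = 720
lon = rng.uniform(75, 130, n_sta)
lat = rng.uniform(20, 50, n_sta)
elev = rng.uniform(0, 4, n_sta)
t_jan = 25 - 0.7 * (lat - 20) - 6.0 * elev + rng.normal(0, 0.3, n_sta)  # ~6 deg/km lapse rate

X = np.column_stack([lon, lat, elev])

# Thin-plate smoothing spline over the 3-D (lon, lat, elev) space; the smoothing
# parameter is an assumed value, not one chosen by generalized cross-validation
spline = RBFInterpolator(X, t_jan, kernel="thin_plate_spline", smoothing=1.0)

# Predict at grid points whose elevations come from a DEM (here two invented cells)
grid = np.array([[100.0, 30.0, 0.5],
                 [100.0, 30.0, 3.0]])
print("predicted January mean temperature:", spline(grid).round(2))
```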

15.
Results of a simulation study of map-image rectification accuracy are reported. Sample size, spatial distribution pattern and measurement errors in a set of ground control points, and the computational algorithm employed to derive the estimate of the parameters of a least-squares bivariate map-image transformation function, are varied in order to assess the sensitivity of the procedure. Standard errors and confidence limits are derived for each of 72 cases, and it is shown that the effects of all four factors are significant. Standard errors fall rapidly as sample size increases, and rise as the control point pattern becomes more linear. Measurement error is shown to have a significant effect on both accuracy and precision. The Gram-Schmidt orthogonal polynomial algorithm performs consistently better than the Gauss-Jordan matrix inversion procedure in all circumstances.
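A small sketch of the core computation being studied (not the paper's simulation code): fit a first-order least-squares map-to-image transformation from ground control points and report the residual RMS error. The affine model, the "true" transform and the invented control points are assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

# Ground control points: map coordinates (x, y) and corresponding image coordinates (col, row)
n_gcp = 12
xy_map = rng.uniform(0, 10_000, size=(n_gcp, 2))
true_A = np.array([[0.05, 0.001, 40.0],      # "true" affine transform (assumed)
                   [-0.002, 0.05, 120.0]])
xy_img = (np.column_stack([xy_map, np.ones(n_gcp)]) @ true_A.T
          + rng.normal(0, 0.5, size=(n_gcp, 2)))   # measurement error on the GCPs

# First-order (affine) least-squares fit: [col, row] ~ [x, y, 1] @ A^T
D = np.column_stack([xy_map, np.ones(n_gcp)])
A_hat, *_ = np.linalg.lstsq(D, xy_img, rcond=None)

residuals = D @ A_hat - xy_img
rmse = np.sqrt(np.mean(np.sum(residuals ** 2, axis=1)))
print("estimated transform:\n", A_hat.T.round(4))
print("GCP RMSE (pixels):", round(float(rmse), 3))
```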

16.
When using hyphenated methods in analytical chemistry, the data obtained for each sample are given as a matrix. When a regression equation is set up between an unknown sample (a matrix) and a calibration set (a stack of matrices), the residual is a matrix R. The regression equation is usually solved by minimizing the sum of squares of R. If the sample contains some constituent not calibrated for, this approach is not valid. In this paper an algorithm is presented which partitions R into one matrix of low rank corresponding to the unknown constituents, and one random noise matrix to which the least squares restrictions are applied. Properties and possible applications of the algorithm are also discussed. In Part 2 of this work an example from HPLC with diode array detection is presented and the results are compared with generalized rank annihilation factor analysis (GRAFA).
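An illustrative partition of a residual matrix into a low-rank part plus noise using a truncated SVD (a generic stand-in; the paper's constrained least-squares algorithm is not reproduced here). The rank of the unknown-constituent part is assumed to be one.

```python
import numpy as np

rng = np.random.default_rng(6)

# Residual matrix R = rank-1 contribution of an uncalibrated constituent + random noise
n_t, n_wl = 30, 50
profile = np.abs(rng.standard_normal(n_t))       # elution-like profile
spectrum = np.abs(rng.standard_normal(n_wl))     # spectrum of the unknown constituent
R = np.outer(profile, spectrum) + 0.05 * rng.standard_normal((n_t, n_wl))

# Truncated SVD: keep the dominant singular vector as the "unknown constituent" part
U, s, Vt = np.linalg.svd(R, full_matrices=False)
rank = 1                                          # assumed number of uncalibrated constituents
R_unknown = (U[:, :rank] * s[:rank]) @ Vt[:rank]
R_noise = R - R_unknown

print("singular values:", s[:4].round(2))
print("noise std recovered:", R_noise.std().round(3))   # should be close to the injected 0.05
```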

17.
The usefulness of the Kalman filter as an algorithm for calibration in a real system is shown. Results are compared with classical least squares and pure-component calibration. The prediction of four priority pollutant chlorophenols in binary, ternary and quaternary mixtures was also carried out by Kalman filtering. The condition number, standard deviation and prediction error have been employed to choose the most suitable wavelength range. Comparison of the standard error of prediction in the validation set shows significant differences between the evaluated chlorophenols, the best results being obtained with Kalman multivariate calibration.
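A bare-bones sketch of wavelength-by-wavelength Kalman filtering for multicomponent calibration (an illustration of the general approach, not the paper's implementation): the state vector holds the analyte concentrations, and each wavelength's absorbance is processed as a scalar measurement whose design row contains the pure-component absorptivities at that wavelength. Spectra and noise levels are invented.

```python
import numpy as np

rng = np.random.default_rng(7)

n_wl, n_comp = 120, 2
wl = np.linspace(0, 1, n_wl)
# Pure-component "spectra" (rows = wavelengths, columns = components), invented
S = np.column_stack([np.exp(-0.5 * ((wl - 0.4) / 0.1) ** 2),
                     np.exp(-0.5 * ((wl - 0.6) / 0.1) ** 2)])

c_true = np.array([0.7, 0.3])
meas_var = 0.001 ** 2
r = S @ c_true + rng.normal(0, np.sqrt(meas_var), n_wl)   # measured mixture spectrum

# Kalman filter: static state (the concentrations), one scalar measurement per wavelength
x = np.zeros(n_comp)                 # initial concentration estimate
P = np.eye(n_comp) * 10.0            # large initial uncertainty

for k in range(n_wl):
    h = S[k]                         # measurement (design) row at this wavelength
    innov = r[k] - h @ x             # innovation
    s_var = h @ P @ h + meas_var     # innovation variance
    K = P @ h / s_var                # Kalman gain
    x = x + K * innov
    P = P - np.outer(K, h) @ P       # covariance update

print("estimated concentrations:", x.round(4), " true:", c_true)
```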

18.
Areal interpolation is the process by which data collected from one set of zonal units can be estimated for another zonal division of the same space that shares few or no boundaries with the first. In previous research, we outlined the use of dasymetric mapping for areal interpolation and showed it to be the most accurate method tested. There we used control information derived from classified satellite imagery to parameterize the dasymetric method, but because such data are rife with errors, here we extend the work to examine the sensitivity of the population estimates to error in the classified imagery. Results show the population estimates by dasymetric mapping to be largely insensitive to the errors of classification in the Landsat image when compared with the other methods tested. The dasymetric method deteriorates to the accuracy of the next worst estimate only when 40% error occurs in the classified image, a level of error that may easily be bettered within most remote sensing projects.
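A toy dasymetric areal-interpolation sketch (illustrating the method whose sensitivity is tested above, not the authors' code): a source zone's population is redistributed to target zones in proportion to area times an assumed per-class population density taken from classified-image land cover. All numbers are invented.

```python
import numpy as np

# Source zone population to be re-allocated to two target zones
source_pop = 10_000

# Area (km^2) of each land-cover class inside each target zone, from a classified image
# rows = target zones, columns = classes [urban, rural, water]
areas = np.array([[4.0, 1.0, 0.5],
                  [1.0, 6.0, 2.0]])

# Assumed relative population densities per class (the dasymetric weights)
weights = np.array([20.0, 2.0, 0.0])

# Dasymetric allocation: each zone's share is proportional to area x class weight
scores = areas @ weights
estimates = source_pop * scores / scores.sum()
print("estimated target-zone populations:", estimates.round(0))
```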

19.
A locational error model for spatial features in vector-based geographical information systems (GIS) is proposed in this paper. Using error in points as the fundamental building block, a stochastic model is constructed to analyse point, line, and polygon errors within a unified framework, a departure from current practices which treat errors in points and lines separately. As a special case, the proposed model gives the epsilon band model a true probabilistic meaning. Moreover, the model can also be employed to derive accuracy standards and cartographic estimates in GIS.

20.
Artificial Intelligence (AI) models such as Artificial Neural Networks (ANNs), Decision Trees and Dempster–Shafer's Theory of Evidence have long been claimed to be more error-tolerant than conventional statistical models, but the way error is propagated through these models is unclear. Two sources of error have been identified in this study: sampling error and attribute error. The results show that these errors propagate differently through the three AI models. The Decision Tree was the most affected by error, the Artificial Neural Network was less affected, and the Theory of Evidence model was not affected by the errors at all. The study indicates that AI models have very different modes of handling errors. In this case, the machine-learning models, including ANNs and Decision Trees, are more sensitive to input errors. Dempster–Shafer's Theory of Evidence has demonstrated better potential in dealing with input errors when multisource data sets are involved. The study suggests a strategy of combining AI models to improve classification accuracy. Several combination approaches have been applied, based on a 'majority voting system', a simple average, Dempster–Shafer's Theory of Evidence, and fuzzy-set theory. These approaches all increased classification accuracy to some extent. Two of them also demonstrated good performance in handling input errors. Second-stage combination approaches which use statistical evaluation of the initial combinations are able to further improve classification results. One of these second-stage approaches increased the overall classification accuracy on forest types to 54% from the original 46.5% of the Decision Tree model, and its visual appearance is also much closer to the ground data. By combining models, it becomes possible to calculate quantitative confidence measurements for the classification results, which can then serve as a better error representation. Final classification products include not only the predicted hard classes for individual cells, but also estimates of the probability and the confidence measurements of the prediction.
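A minimal majority-voting combination sketch (one of the combination strategies named above, with made-up class predictions; not the study's implementation), together with a simple per-cell agreement fraction that can serve as a crude confidence measurement.

```python
import numpy as np

# Hard class predictions for 6 cells from three models (e.g. ANN, Decision Tree, evidence-based)
preds = np.array([[1, 1, 2, 0, 3, 2],   # model A
                  [1, 2, 2, 0, 3, 1],   # model B
                  [1, 1, 2, 1, 0, 2]])  # model C

def majority_vote(votes):
    """Most frequent class label and its count; ties resolved toward the smallest label."""
    counts = np.bincount(votes)
    return counts.argmax(), counts.max()

combined, agree = np.array([majority_vote(preds[:, j]) for j in range(preds.shape[1])]).T

# Fraction of models agreeing with the vote: a simple per-cell confidence measurement
confidence = agree / preds.shape[0]
print("combined classes:", combined)
print("confidence      :", confidence.round(2))
```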
