Similar literature
 20 similar documents found (search time: 46 ms)
1.
Regularized discriminant analysis has proven to be a most effective classifier for problems where traditional classifiers fail because of a lack of sufficient training samples, as is often the case in high-dimensional settings. However, it has been shown that the model selection procedure of regularized discriminant analysis, determining the degree of regularization, has some deficiencies associated with it. We propose a modified model selection procedure based on a new appreciation function. By means of an extensive simulation it was shown that the new model selection procedure performs better than the original one. We also propose that one of the control parameters of regularized discriminant analysis be allowed to take on negative values. This extension leads to an improved performance in certain situations. The results are confirmed using two chemical data sets.
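
The abstract above does not spell out how the covariance estimates are regularized. As a minimal sketch, assuming the standard two-parameter formulation of regularized discriminant analysis (pooling toward a common covariance with `lam`, then shrinking toward a scaled identity with `gamma`), the class covariances could be computed as below. The function name `rda_covariances`, the parameter names and the sample data are illustrative assumptions, not taken from the paper, and the modified model selection procedure it proposes is not shown.

```python
import numpy as np

def rda_covariances(class_samples, lam, gamma):
    """Regularized class covariance matrices (illustrative sketch).

    class_samples: list of (n_k, p) arrays, one per class.
    lam:   0 keeps each class covariance, 1 uses the pooled covariance.
    gamma: 0 applies no shrinkage, 1 gives a multiple of the identity.
    """
    scatters, counts = [], []
    for X in class_samples:
        X = np.asarray(X, dtype=float)
        centered = X - X.mean(axis=0)
        scatters.append(centered.T @ centered)  # class scatter matrix
        counts.append(X.shape[0])
    pooled_scatter = sum(scatters)
    total = sum(counts)
    p = pooled_scatter.shape[0]

    result = []
    for S_k, n_k in zip(scatters, counts):
        # Step 1: blend class scatter with pooled scatter.
        sigma = ((1 - lam) * S_k + lam * pooled_scatter) / (
            (1 - lam) * n_k + lam * total
        )
        # Step 2: shrink toward the identity scaled by the average eigenvalue.
        sigma = (1 - gamma) * sigma + gamma * np.trace(sigma) / p * np.eye(p)
        result.append(sigma)
    return result

# Hypothetical data: two classes, few samples, three variables.
rng = np.random.default_rng(0)
classes = [rng.normal(size=(5, 3)), rng.normal(loc=1.0, size=(6, 3))]
covs = rda_covariances(classes, lam=0.5, gamma=0.1)
```

In this sketch both parameters are simply passed through, so any search over their values is left to the caller.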

2.
An application of polynomial curve fitting to sediment cumulative frequency distributions is presented to delineate the foreshore depositional patterns along the barrier beaches of the Rhode Island south shore. The analysis is based on 92 sampled stations where data for beach geometry, tidal stage, and sediment size were collected. Using the size-frequency classes obtained from sieving the foreshore sediment samples at 0.25 φ intervals and fitting third-degree polynomial equations to these data, over 94% of the variation within the sediment cumulative frequency distributions is explained. The four curve coefficients (a, b, c, d) derived from the predicted third-degree equation are used in a discriminant function analysis to test the relationship between the curve shape and sediment source. Comparison of the discriminant scores with the respective station locations suggests that a series of Pleistocene headlands, which occur as discrete points along the beach, are serving as independent sources of sediment for the system.
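
The curve-fitting step described above can be sketched as follows. This is a minimal illustration, assuming the coefficients (a, b, c, d) multiply increasing powers of grain size, which the abstract does not state. The function name `fit_cumulative_cubic` and the sample data are hypothetical.

```python
import numpy as np

def fit_cumulative_cubic(size, cum_percent):
    """Fit a third-degree polynomial to a cumulative frequency curve.

    Returns the coefficients (a, b, c, d) of a + b*x + c*x**2 + d*x**3
    and the fraction of variance explained (R squared).
    """
    size = np.asarray(size, dtype=float)
    cum = np.asarray(cum_percent, dtype=float)
    # np.polyfit returns the highest power first, so unpack in reverse.
    d, c, b, a = np.polyfit(size, cum, deg=3)
    fitted = a + b * size + c * size**2 + d * size**3
    ss_res = np.sum((cum - fitted) ** 2)
    ss_tot = np.sum((cum - cum.mean()) ** 2)
    return (a, b, c, d), 1.0 - ss_res / ss_tot

# Hypothetical sieve data at 0.25-unit size intervals (not from the study).
size = np.arange(0.0, 3.25, 0.25)
cum_percent = 100.0 / (1.0 + np.exp(-3.0 * (size - 1.5)))
coeffs, r_squared = fit_cumulative_cubic(size, cum_percent)
```

Each station would then contribute one four-element coefficient vector to the discriminant function analysis.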

3.
WHICH PRINCIPAL COMPONENTS TO UTILIZE FOR PRINCIPAL COMPONENT REGRESSION   (Cited by: 1 in total; self-citations: 0; citations by others: 1)
Principal components (PCs) for principal component regression (PCR) have historically been selected from the top down for a reliable predictive model. That is, the PCs are arranged in a list starting with the most informative (PC associated with the largest singular value) and proceeding to the least informative (PC associated with the smallest singular value). PCs are then chosen starting at the top of this list. This paper discusses an alternative procedure of treating PC selection as an optimization problem. Specifically, without any regard to the ordering, the optimal subset of PCs for an acceptable predictive model is desired. Five data sets are analyzed using the conventional and alternative approaches. Two data sets are spectroscopic in nature, two data sets deal with quantitative structure-activity relationships (QSARs) and one data set is concerned with modeling. All five data sets confirm that selection of a subset without consideration to order secures the best results with PCR. One data set is also compared using partial least squares 1.
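
The contrast between top-down and order-free selection can be sketched as below. This is a minimal illustration, assuming an exhaustive search over subsets scored by validation-set RMSE. The abstract does not say which optimizer or criterion the paper uses, and the names `pcr_fit` and `best_pc_subset` are hypothetical.

```python
import itertools
import numpy as np

def pcr_fit(X, y, components):
    """Least-squares PCR using only the listed principal components."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    U, s, Vt = np.linalg.svd(X - x_mean, full_matrices=False)
    idx = list(components)
    # Regress centered y on the selected scores, then map back to X space.
    coef = Vt[idx].T @ ((U[:, idx].T @ (y - y_mean)) / s[idx])
    intercept = y_mean - x_mean @ coef
    return coef, intercept

def best_pc_subset(X_cal, y_cal, X_val, y_val, max_pcs):
    """Try every non-empty subset of the first max_pcs components."""
    best = (np.inf, ())
    for size in range(1, max_pcs + 1):
        for subset in itertools.combinations(range(max_pcs), size):
            coef, intercept = pcr_fit(X_cal, y_cal, subset)
            residual = X_val @ coef + intercept - y_val
            rmse = np.sqrt(np.mean(residual**2))
            if rmse < best[0]:
                best = (rmse, subset)
    return best

# Hypothetical data in which a low-variance direction carries the signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4)) * np.array([10.0, 5.0, 1.0, 0.5])
y = 3.0 * X[:, 2] + 0.05 * rng.normal(size=60)
rmse, subset = best_pc_subset(X[:40], y[:40], X[40:], y[40:], max_pcs=4)
```

Because the top-down choices (0,), (0, 1), and so on are among the candidates, the subset found this way can never score worse on the validation set than top-down selection. The number of candidates grows as 2^max_pcs, so an exhaustive search is only practical for a handful of components.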

4.
A modification of a technique proposed by Lorber and Kowalski for the estimation of prediction errors is presented. The method is applied to five data sets. The results show that for some data sets the estimated prediction errors are close to the actual prediction errors for samples within the calibration range, while samples outside the calibration range must be background corrected before quantification of the prediction error.

5.
Cross-validatory estimation of the bilinear model based on principal components is reviewed and Krzanowski's modification of Wold's procedure is described. Two different types of residuals useful for checking model adequacy are defined and indices measuring the influence of each observed unit on the estimates of the parameters are discussed. A method for the selection of variables derived from Procrustes analysis is described. Results arising from the study of two sets of enological data are given.

6.
Calibrations to predict crude protein (CP) and in vitro dry matter digestibility (IVDMD) in dried grass silage from reflectance data collected at 19 wavelengths on an InfraAlyzer 400R have been developed using stepwise multiple linear (SML) and principal component (PC) regression techniques. A direct comparison of the efficacy of each multivariate technique in this application has been possible by using identical calibration development and evaluation sample sets. The effect of two data transformation steps prior to PC regression was also investigated. PC regression of raw reflectance data yielded no significant improvement in the standard errors of prediction (SEP) for CP and IVDMD over those obtained by SMLR, viz. 0.61 vs 0.63 and 2.9 vs 3.0 respectively. Computation time for development and evaluation of the PC regression equation was less than for selection of the best SMLR equation, and PCR equations may be more robust. Data transformation to reduce granularity effects prior to PCR did not produce any improvement in predictive accuracy for either IVDMD or CP.

7.
Evaluation of the results of factor analysis of sets of spectroscopically detected chromatograms is carried out by examining the shapes of the abstract factors. This is done either by visual inspection or by analysis of the power density spectra produced from them. Owing to constraints imposed by the column function and the spectroscopic instrument function, the information content of the chromatograms necessarily occurs at low spatial frequencies. As a consequence, it appears as relatively broad features in the abstract chromatograms and as a peak in the low-frequency region of the corresponding power density plot. On the basis of examination of the power density distribution, a well-defined distinction is made between primary and secondary abstract factors. The major uncertainty encountered in determining the number of chemical components appears to arise from effects of contaminants in reagents.

8.
9.
Many of the data sets analyzed by physical geographers are compositional in nature: they have row vectors that add to one (or 100%). These unit-sum constrained data sets should not be analyzed by standard multivariate statistical methods. Significant differences were found in the log-ratio mean vectors of the hydraulic exponents (which are unit-sum constrained) for two classes of streams: those with cohesive, non-vertical banks, and those with one firm and one loose bank. Compositional discriminant function analysis of bank stability on the basis of hydraulic geometry had a success rate of 88%, making routinely archived measurements of stream width, cross-sectional area, mean velocity, and discharge a readily available data base for predicting the stability of stream reaches. [Key words: geomorphology, hydraulic geometry, discriminant function, statistics.]
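
The log-ratio idea mentioned above can be sketched as follows. This is a minimal illustration, assuming the additive log-ratio transform with the last part as the divisor, since the abstract does not say which log-ratio transform was used. The function names and the three example exponents are hypothetical.

```python
import math

def close(parts):
    """Rescale positive parts so that they sum to one."""
    total = sum(parts)
    return [p / total for p in parts]

def alr(composition):
    """Additive log-ratio: log of each part relative to the last part."""
    *head, last = composition
    return [math.log(p / last) for p in head]

def alr_inverse(coords):
    """Map log-ratio coordinates back to a unit-sum composition."""
    return close([math.exp(c) for c in coords] + [1.0])

# Hypothetical hydraulic exponents for one stream; they sum to one.
exponents = [0.26, 0.40, 0.34]
coords = alr(exponents)          # two unconstrained coordinates
restored = alr_inverse(coords)   # recovers the original composition
```

A composition with D parts maps to D - 1 unconstrained coordinates, which is what allows mean vectors and discriminant functions to be computed with standard methods. All parts must be strictly positive for the logarithms to exist.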

10.
A spherical harmonic degree 60 global internal field model is described (called BGS/G/L/0706). This model includes a degree 15 core and piecewise-linear secular variation model and is derived from quiet-time Ørsted and CHAMP satellite data sampled between 2001.0 and 2005.0. For the satellite data selection, a wide range of geomagnetic index and other data selection filters have been used to best isolate suitably quiet magnetospheric and ionospheric conditions. Only a relatively simple, degree one spherical harmonic, external field model is then required. It is found that a new 'Vector Magnetic Disturbance' index (VMD), the existing longitude sector A indices, the auroral zone index IE, and the polar cap index PC are better than Kp and Dst at rejecting rapidly varying external field signals at low, middle, auroral and polar latitudes. The model quality is further enhanced by filling spatial and temporal gaps in the quiet data selection with a second selection containing slightly more disturbed data. It is shown that VMD provides a better parametrization than Dst of the large-scale, rapidly changing, external field. The lithospheric field model between degrees 16 and 50 is robust and displays good coherence with other recently published models for this epoch. BGS/G/L/0706 also shows crustal anomalies consistent with other studies, although agreement is poorer in the southern polar cap. Intermodel coherency reduces above about degree 40, most likely due to incompletely filtered signals from polar ionospheric currents and auroral field-aligned currents. The absence of the PC index for the southern hemisphere for 2003 onwards is a particular concern.

11.
calibrated. The discussion in this paper focuses on near-infrared (NIR) spectroscopy as the example instrument. However, the procedures presented are applicable to most methods of instrumental analysis. Essentially, calibration consists of assembling a series of samples containing the analyte or analytes at

12.
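Neural-network-based cellular automata (CA) for realistic and optimized urban simulation   (Cited by: 78 in total; self-citations: 8; citations by others: 78)
Li Xia (黎夏), Ye Jia'an (叶嘉安). Acta Geographica Sinica (《地理学报》), 2002, 57(2): 159-166
A cellular automaton (CA) based on a neural network is proposed. CA are increasingly being used to simulate cities and other geographical phenomena. The biggest problem encountered in CA simulation is how to determine the structure and parameters of the model. Simulating a real city involves many spatial variables and parameters, and when the model is complex it is difficult to determine its parameter values. The proposed model has a simple structure, and its parameters are obtained automatically by training the neural network. The analysis shows that the proposed method achieves higher simulation accuracy and greatly shortens the time needed to find the parameters. By screening the training data, the model can also carry out optimized urban simulation, providing a reference for urban planning.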

13.
In this paper we redefine the term detection limit to embrace the inherent multivariate nature of samples, instrumental measurements and chemometrics resolution procedures. The so-called zero-component regions, i.e. parts with no chemical components eluting, are used as repeated analytical blanks to estimate a statistical multivariate detection limit for determining the number of chemical species in local regions of a single two-way chromatogram or a collection of synchronized one-way chromatograms. For two-way chromatography the detection limit is determined from the distribution of the first eigenvalues obtained from all possible combinations of spectra in the zero-component regions. The number of spectra in each calculation should correspond to the number included in the later examination of the local retention time regions. For one-way chromatography on a collection of samples with similar chemical components at varying concentrations the same procedure is used, with the samples taking the role of the spectra in two-way chromatography. The detection limit can be chosen at various confidence levels depending on whether false positive or negative detection of minor components is most critical. The results obtained from the zero-eigenvalue distribution are more robust than those obtained by a previously developed F-test.

14.
Qualitative knowledge representation of spatial locations and relations is popular in many text-based media, for example, postings on social networks, news reports, and encyclopedias, as representing qualitative spatial locations is indispensable to infer spatial knowledge from them. However, an integrative model capable of handling direction-based locations of various spatial objects is missing. This study presents an integrative representation and inference framework about direction-based qualitative locations for points, lines, and polygons. In the framework, direction partitions of different types of reference objects are first unified to create a partition consisting of cells, segments, and corners. They serve as a frame of reference to locate spatial objects (e.g., points, lines, and polygons). Qualitative relations are then defined to relate spatial objects to the elements in a cell partition, and to form the model of qualitative locations. Last, based on the integrative representation, a location-based reasoning mechanism is presented to derive topological relations between objects from their locations, such as point–point, line–line, point–line, point–polygon, line–polygon, and polygon–polygon relations. The presented model can locate any type of spatial object in a frame of reference composed of points, lines, and polygons, and derive topological relations between any pair of objects from the locations in a unified method.

15.
Classification and regression techniques are among the most used tools by chemometricians. With classification, the two classic methods are discriminant analysis and SIMCA. In this paper we discuss the connection between these two methods and introduce two new ones of the same family: DASCO (discriminant analysis with shrunken covariances) and RDA (regularized discriminant analysis). We demonstrate on both simulated and real data sets that their performance is superior to the old favorites. This is especially true in small-sample/high-dimension settings typical in chemistry.

16.
Abstract

This paper describes an inductive modelling procedure integrated with a geographical information system for analysis of pattern within spatial data. The aim of the modelling procedure is to predict the distribution within one data set by combining a number of other data sets. Data set combination is carried out using Bayes’ theorem. Inputs to the theorem, in the form of conditional probabilities, are derived from an inductive learning process in which attributes of the data set to be modelled are compared with attributes of a variety of predictor data sets. This process is carried out on random subsets of the data to generate error bounds on inputs for analysis of error propagation associated with the use of Bayes’ theorem to combine data sets in the GIS. The statistical significance of model inputs is calculated as part of the inductive learning process. Use of the modelling procedure is illustrated through the analysis of the winter habitat relationships of red deer in Grampian Region, north-east Scotland. The distribution of red deer in Deer Management Group areas in Gordon and in Kincardine and Deeside Districts is used to develop a model which predicts the distribution throughout Grampian Region; this is tested against red deer distribution in Moray District. Habitat data sets used for constructing the model are accumulated frost and altitude, obtained from maps, and land cover, derived from satellite imagery. Errors resulting from the use of Bayes’ theorem to combine data sets within the GIS and introduced in generalizing output from 50 m pixel to 1 km grid squares resolution are analysed and presented in a series of maps. This analysis of error trains is an integral part of the implemented analytical procedure and provides support to the interpretation of the results of modelling. Potential applications of the modelling procedure are discussed.
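
Combining predictor data sets with Bayes' theorem can be sketched for a single grid cell as below. This is a minimal illustration, assuming the predictor layers are conditionally independent given presence or absence, an assumption the abstract does not state. The function name `combine_layers` and all probabilities are hypothetical, and the error-bound analysis described above is not shown.

```python
def combine_layers(prior, likelihoods):
    """Posterior probability of presence given several evidence layers.

    prior:       P(present) before looking at any layer.
    likelihoods: one (P(evidence | present), P(evidence | absent)) pair
                 per predictor layer, for the class observed in this cell.
    """
    p_present = prior
    p_absent = 1.0 - prior
    for given_present, given_absent in likelihoods:
        # Multiply in each layer, treating layers as independent.
        p_present *= given_present
        p_absent *= given_absent
    # Normalize so the two outcomes sum to one.
    return p_present / (p_present + p_absent)

# Hypothetical cell: three layers (for example frost, altitude, land cover).
posterior = combine_layers(0.5, [(0.9, 0.3), (0.6, 0.2), (0.5, 0.5)])
```

A layer whose two conditional probabilities are equal, like the third one here, leaves the posterior unchanged, which is one simple way to see which inputs carry information.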

17.
A new statistical approach to the alignment of time series   (Cited by: 1 in total; self-citations: 0; citations by others: 1)
Summary. Much research in the Earth Sciences is centred on the search for similarities in waveforms or amongst sets of observations. For example, in seismology and palaeomagnetism, this matching of records is used to align several series of observations against one another or to compare one set of observations against a master series. This paper gives a general mathematical and statistical formulation of the problem of transforming, linearly or otherwise, the time-scale or depth-scale of one series of data relative to another. Existing approaches to this problem, involving visual matching or the use of correlation coefficients, are shown to have several serious deficiencies, and a new statistical procedure, using least-squares cubic splines, is presented. The new method provides not only a best estimate of the 'stretching function' defining the relative alignment of the two series of observations, but also a statement, by means of confidence regions, of the precision of this transformation. The new procedure is illustrated by analyses of artificially generated data and of palaeomagnetic observations from two cores from Lake Vuokonjarvi, Finland. It may be applied in a wide variety of situations, wherever the observations satisfy the general underlying mathematical model.

18.
SEXIA is an expert system that uses a new methodological approach to identify foods, particularly olive oils according to varieties, olive zones and denominations of origin. The methodological approach provides identification tools, associating a confidence degree or a belief interval with the final hypotheses. The certainty factor and the Dempster-Shafer theory, with some modifications, have been implemented in SEXIA. The computer can work with 50 chemical parameters whose data have previously been acquired by the food analyst via a dialogue in the Spanish language. The system has been verified with 144 olive oil samples. In this paper some results obtained for distinguishing the Arbequina variety from other varieties using SEXIA and the BMDP stepwise discriminant analysis program are presented. Finally, promising directions for future research are suggested.

19.
Geographically weighted spatial statistical methods are a family of spatial statistical methods developed to address the presence of non-stationarity in geographical processes, the so-called spatial heterogeneity. While these methods have recently become popular for analysis of spatial data, one of their characteristics is that they produce outputs that in themselves form complex multi-dimensional spatial data sets. Interpretation of these outputs is therefore not easy, but is of high importance, since spatial and non-spatial patterns in the results of these methods contain clues to causes of underlying non-stationarity. In this article, we focus on one of the geographically weighted methods, the geographically weighted discriminant analysis (GWDA), which is a method for prediction and analysis of categorical spatial data. It is an extension of linear discriminant analysis (LDA) that allows the relationship between the predictor variables and the categories to vary spatially. This produces a very complex data set of GWDA results, which include, on top of the already complex discriminant analysis outputs (e.g. classifications and posterior probabilities), spatially varying outputs (e.g. classification function parameters). In this article, we suggest using geovisual analytics to visualise results from LDA and GWDA to facilitate comparison between the global and local method results. For this, we develop a bespoke visual methodology that allows us to examine the performance of the global and local classification methods in terms of quality of classification. Furthermore, we are also interested in identifying the presence (or absence) of non-stationarity through comparison of the outputs of both methods. We do this in two ways. First, we visually explore spatial autocorrelation in both LDA and GWDA misclassifications. Second, we focus on relationships between the classification result and the independent variables and how they vary over space. We describe our visual analytic system for exploration of LDA and GWDA outputs and demonstrate our approach on a case study using a data set linking election results with a selection of socio-economic variables.

20.
Optically stimulated luminescence (OSL) dating of polymineral fine-grained loess samples, collected from the Laoguantai (LGT) section in the south of the Chinese Loess Plateau, was carried out using the single-aliquot regenerative-dose (SAR) protocol. A ‘Double-SAR’ procedure, in which aliquots are subjected to both infrared (IR) and blue stimulation, was used, and two sets of equivalent dose (De) determinations were produced and assumed to relate predominantly to feldspathic and quartz fine-grain populations respectively. The OSL ages estimated from IRSL signals are smaller than those estimated from [post-IR] OSL signals because of the anomalous fading of feldspar IR signals, as indicated by a fading experiment. The young ages of the samples near the ground surface may result from post-depositional disturbance by intensifying human cultivation since 3.0 ka BP in the Guanzhong Basin, south of the Chinese Loess Plateau. Based on OSL dating, as well as field observations and stratigraphic correlation, we determine the chronology of the LGT loess-paleosol sequence. In combination with climate proxy records, the results indicate that aeolian loess deposition and pedogenesis underwent polyphase changes during the Holocene, likely driven by shifts in the East Asian monsoon. This suggests that aeolian loess deposition is episodic and highly variable, with contributions from non-aeolian processes such as alluvial deposition found in the area.
