Similar literature
 20 similar articles found (search time: 31 ms)
1.
Compositional data arise naturally in several branches of science, including chemistry, geology, biology, medicine, ecology and manufacturing design. In chemistry, these constrained data seem to occur typically when raw data are normalized or when output is obtained from a constrained estimation procedure, such as might be used in a source apportionment problem. It is important not only for chemists to be aware that the usual multivariate statistical techniques are not applicable to constrained data, but also to have access to appropriate techniques as they become available. The currently available methodology is due principally to Aitchison and is based on log-normal models. This paper suggests new parametric and non-parametric approaches to significantly improve the existing methodology. In the parametric setting, some recent work of Rayens and Srinivasan is extended and a practical regression model is proposed. In the development of the non-parametric approach, minimum distance methods coupled with multivariate bootstrap techniques are used to obtain point and region estimators.

2.
Predictive pH models developed using scaled chrysophytes (Synurophyceae, Chrysophyceae) have thus far been based on the relative abundance of scales and not whole cells. This paper examines the effects of transforming scale to cell numbers on the predictive abilities of pH inference models, and the effects of logarithmic and square-root transformations of the species data on the predictive abilities of pH inference models. Very similar pH inference models were developed based on either the relative abundance of scales or cells. Thus, in this data set, there appears to be no statistical advantage in transforming raw scale counts to cell counts prior to calculating the relative abundances. However, if one wishes to compare paleochrysophyte populations to actual long-term limnological chrysophyte collections, a scale-to-cell transformation would be desirable. Logarithmic and square-root transformations of the species data improve the pH inference models. These transformations increase the effective number of occurrences of chrysophyte taxa when compared to the untransformed scale and cell pH models. The logarithmic and square-root transformations improve the pH inference models because the dominant taxa, which are often pH generalists, are down-weighted in comparison to the more pH-specialist, sub-dominant taxa. We suggest researchers use either a logarithmic or square-root transformation on chrysophyte scale data to improve quantitative reconstructions of lakewater pH and possibly other variables.
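The down-weighting effect described in this abstract can be illustrated with a small sketch. It is not the authors' procedure: the counts are hypothetical, the form `log(1 + percentage)` is only one common choice of logarithmic transformation, and the renormalization to weights summing to 1 is added here purely to make the shares comparable.

```python
import math

def relative_abundance(counts):
    """Convert raw counts to proportions that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]

def transformed_weights(props, func):
    """Apply a transformation, then rescale so the weights sum to 1.

    The rescaling is an illustrative assumption, used only to compare
    how much weight each taxon carries before and after transformation.
    """
    transformed = [func(p) for p in props]
    total = sum(transformed)
    return [t / total for t in transformed]

# Hypothetical scale counts: one dominant taxon and two sub-dominant taxa.
counts = [900, 80, 20]
props = relative_abundance(counts)

sqrt_w = transformed_weights(props, math.sqrt)
# log(1 + percentage) is an assumed form of the logarithmic transformation.
log_w = transformed_weights(props, lambda p: math.log(1.0 + 100.0 * p))

for name, weights in (("raw", props), ("sqrt", sqrt_w), ("log", log_w)):
    print(name, [round(w, 3) for w in weights])
```

In this toy example the dominant taxon's share of the total weight falls after either transformation, while the shares of the two rarer taxa rise.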

3.

Prediction of true classes of surficial and deep earth materials using multivariate spatial data is a common challenge for geoscience modelers. Most geological processes leave a footprint that can be explored by geochemical data analysis. These footprints are normally complex statistical and spatial patterns buried deep in the high-dimensional compositional space. This paper proposes a spatial predictive model for classification of surficial and deep earth materials derived from the geochemical composition of surface regolith. The model is based on a combination of geostatistical simulation and machine learning approaches. A random forest predictive model is trained, and features are ranked based on their contribution to the predictive model. To generate potential and uncertainty maps, compositional data are simulated at unsampled locations via a chain of transformations (isometric log-ratio transformation followed by the flow anamorphosis) and geostatistical simulation. The simulated results are subsequently back-transformed to the original compositional space. The trained predictive model is used to estimate the probability of classes for simulated compositions. The proposed approach is illustrated through two case studies. In the first case study, the major crustal blocks of the Australian continent are predicted from the surface regolith geochemistry of the National Geochemical Survey of Australia project. The aim of the second case study is to discover the superficial deposits (peat) from the regional-scale soil geochemical data of the Tellus Project. The accuracy of the results in these two case studies confirms the usefulness of the proposed method for geological class prediction and geological process discovery.
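The transformation chain in this abstract starts with an isometric log-ratio (ilr) transformation and ends with a back-transformation to compositional space. The sketch below shows only that pair of steps, using one standard orthonormal basis (pivot coordinates). The basis choice, the function names and the sample composition are assumptions; the flow anamorphosis and the geostatistical simulation are not reproduced.

```python
import math

def closure(x):
    """Rescale positive parts so they sum to 1."""
    total = sum(x)
    return [v / total for v in x]

def ilr(x):
    """Isometric log-ratio coordinates of a D-part composition (D-1 values).

    Coordinate i contrasts the geometric mean of the first i parts
    with part i+1 (pivot-coordinate basis, an assumed choice).
    """
    logs = [math.log(v) for v in x]
    coords = []
    for i in range(1, len(x)):
        mean_log = sum(logs[:i]) / i
        coords.append(math.sqrt(i / (i + 1)) * (mean_log - logs[i]))
    return coords

def ilr_inverse(coords):
    """Map ilr coordinates back to a composition that sums to 1."""
    d = len(coords) + 1
    clr = [0.0] * d
    for i, z in enumerate(coords, start=1):
        for j in range(i):
            clr[j] += z / math.sqrt(i * (i + 1))
        clr[i] -= z * math.sqrt(i / (i + 1))
    return closure([math.exp(v) for v in clr])

# Hypothetical three-part geochemical composition (percentages).
composition = closure([55.0, 30.0, 15.0])
coords = ilr(composition)
recovered = ilr_inverse(coords)
print(coords, recovered)
```

Simulation would operate on the unconstrained `coords`; the round trip through `ilr_inverse` returns values that again satisfy the constant-sum constraint.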


4.
《Urban geography》2013,34(6):515-529
The present paper demonstrates that Kelly's (1955) method of hand factor analyzing the data matrices derived from repertory grids can be employed as a general method of multivariate analysis in geography. This brings the advantages of a non-parametric and non-computer-dependent approach to areas such as factorial ecology, classifications of towns and cities, and urban behavioral analyses, where multivariate techniques have customarily been employed. The method is initially explained by recourse to a simple hypothetical urban retailing data set. Subsequently, more complex real-world examples involving multivariate analyses of housing data for Barbados, West Indies, and urban consumers' cognitions of a single store are presented. It is shown that the non-parametric method gives results that are virtually identical to those obtained from traditional computer-based factor analyses. Throughout the paper, the pedagogic and practical virtues of the non-parametric method are considered.

5.
The geometric properties of three common object-preprocessing transformations (constant sum, or closure; constant length, or normalization; and maximum value, or ratioing) are investigated. An argument is made for using absolute values in the constant sum and maximum value transformations. In general, each transformation distorts the shape and dimensionality of patterns in the data: transformed data lie on (C-1)-dimensional surfaces in the original C-dimensional space. A data set that has been closed by one of these transformations can be reopened if a vector containing the constant sums, constant lengths or maximum values of the original objects was retained. Transformed data sets may be freely interconverted among these three transformations without the loss of information.
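The three transformations named in this abstract, and the reopening and interconversion it mentions, can be sketched for a single object as follows. The function names and the sample vector are hypothetical. Absolute values are used in the constant-sum and maximum-value transformations, as the abstract argues for.

```python
import math

def close_to_sum(x):
    """Constant sum (closure): divide by the sum of absolute values."""
    total = sum(abs(v) for v in x)
    return [v / total for v in x], total

def normalize_length(x):
    """Constant length (normalization): divide by the Euclidean length."""
    length = math.sqrt(sum(v * v for v in x))
    return [v / length for v in x], length

def ratio_to_max(x):
    """Maximum value (ratioing): divide by the largest absolute value."""
    peak = max(abs(v) for v in x)
    return [v / peak for v in x], peak

def reopen(transformed, retained):
    """Recover the original object from a transformed one and its retained constant."""
    return [v * retained for v in transformed]

# Hypothetical object with C = 4 variables.
spectrum = [2.0, -4.0, 6.0, 8.0]

closed, total = close_to_sum(spectrum)
unit, length = normalize_length(spectrum)
ratioed, peak = ratio_to_max(spectrum)

# Reopening needs the retained constant (here the original sum).
restored = reopen(closed, total)

# Interconversion needs no retained constant: normalizing the closed
# object gives the same result as normalizing the original.
unit_from_closed, _ = normalize_length(closed)
print(restored, unit_from_closed)
```

Each transformation divides the object by one positive scalar, so the direction of the vector is preserved. This is why the three forms can be converted into one another, while the original scale is lost unless the divisor is kept.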

6.
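Retrieval of gravel grain size on Gobi surfaces from hyperspectral data   Total citations: 1 (self-citations: 1; citations by others: 0)
The grain-size composition of surface gravel on the Gobi reflects how the Gobi formed and largely determines how difficult the Gobi is to reclaim and use, so it is the basis and prerequisite for Gobi research. Using derivative transformations of hyperspectral data, the bands sensitive to gravel grain size and the corresponding inversion equations were selected in order to retrieve gravel grain size on the Gobi surface. The results show that gravel spectral reflectance after derivative transformation correlates well with grain size, with the best-correlated bands at 908 nm, 983 nm and 985 nm. Reflectance after the log-reciprocal derivative transformation is positively correlated with grain size (R² = 0.61), whereas reflectance after the first derivative, square-root derivative and logarithmic derivative transformations is negatively correlated, with correlation coefficients of -0.633, -0.646 and -0.649, respectively. Regression of the first-derivative-transformed spectral data against grain size showed that a univariate cubic regression model fits well, with the logarithmic derivative performing best in the regression analysis (R² = 0.851). Validation showed that the prediction accuracy of the logarithmic derivative (75.27%) is higher than that of the other four derivative forms, indicating that the 908 nm band of the logarithmic-derivative-transformed gravel spectra can be used to retrieve gravel grain size on the Gobi surface.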
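The derivative transformations listed in this abstract can be approximated on sampled spectra by finite differences. The sketch below rests on several assumptions: the reflectance values and the 4 nm sampling interval are invented, a forward difference is used, and the log-reciprocal form is read as log10(1/R), which the abstract does not specify.

```python
import math

def finite_difference(wavelengths, values):
    """Forward-difference approximation of d(values)/d(wavelength)."""
    return [
        (values[i + 1] - values[i]) / (wavelengths[i + 1] - wavelengths[i])
        for i in range(len(values) - 1)
    ]

# Hypothetical reflectance samples around the 908 nm band.
wavelengths = [900.0, 904.0, 908.0, 912.0]
reflectance = [0.30, 0.32, 0.35, 0.36]

first_derivative = finite_difference(wavelengths, reflectance)
sqrt_derivative = finite_difference(
    wavelengths, [math.sqrt(r) for r in reflectance]
)
log_derivative = finite_difference(
    wavelengths, [math.log10(r) for r in reflectance]
)
# Assumed reading of the log-reciprocal transformation: log10(1 / R).
log_reciprocal_derivative = finite_difference(
    wavelengths, [math.log10(1.0 / r) for r in reflectance]
)

print(first_derivative)
print(log_derivative)
print(log_reciprocal_derivative)
```

Under this reading, the log-reciprocal derivative is the logarithmic derivative with the sign reversed.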

7.
A set of coordinate transformations is used to linearize a general geophysical inverse problem. Statistical and analytic techniques are employed to estimate the parameters of such linearization transformations. In the transformed space, techniques from linear inverse theory may be utilized. Consequently, important concepts, such as model parameter covariance, model parameter resolution and averaging kernels, may be carried over to non-linear inverse problems. I apply the approach to a set of seismic cross-borehole traveltimes gathered at the Conoco Borehole Test Facility. The seismic survey was conducted within the Fort Riley formation, a limestone with thin interbedded shales. Between the boreholes, the velocity structure of the Fort Riley formation consists of a high-velocity region overlying a section of lower velocity. It is found that model parameter resolution is poorest and spatial averaging lengths are greatest in the underlying low-velocity region.

8.
A survey of members of the U.K. QSAR Discussion Group has been made to determine the extent of use and development of chemometric and artificial intelligence (AI) methods in the analysis of multivariate quantitative structure-activity relationship (QSAR) data in the U.K. Chemometric methods were found to be well established in both industrial and educational establishments and there was significant method development occurring. AI methods were not employed to any great extent and the general level of interest in these techniques was low compared to chemometric methods. A requirement for more education in multivariate statistical methods and regression methods was indicated. A need for a user-friendly, comprehensive, commercially available multivariate statistical package containing multivariate stability testing and regression diagnostic methods was identified.

9.
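Shi Wenjiao, Zhang Mo. 《地理学报》 (Acta Geographica Sinica), 2022, 77(11): 2890-2901
Soil particle-size fractions (sand, silt and clay) are key parameters for models of various land-surface processes and ecosystem service assessment. As a type of soil compositional data, soil particle-size fractions impose special requirements on spatial prediction methods, such as summing to 1 (or 100%), and the accuracy of their predicted spatial distribution depends strongly on the prediction method. Considering how soil particle-size fractions differ from other soil properties, this paper proposes a framework of spatial prediction methods for soil particle-size fractions, reviews the methods for data transformation, spatial interpolation and accuracy validation, and summarizes ways to improve prediction accuracy. These include improving the data distribution through effective data transformation, choosing prediction methods suited to the data distribution, using auxiliary variables to improve mapping accuracy and the plausibility of the predicted distribution, using hybrid models to improve interpolation accuracy, and using multi-component joint simulation models to make predictions more systematic. Finally, future directions for research on spatial prediction of soil particle-size fractions are proposed: improving the data distribution from the perspective of the principles and mechanisms of data transformation, developing multi-component joint simulation models and high-accuracy surface modelling methods, and introducing soil particle-size function curves in combination with stochastic simulation.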

10.
In this study, stream sediment geochemical data have been subjected to robust principal components analysis (RPCA) and singularity mapping (SM) to enhance and map significant multivariate geochemical anomalies (i.e., mineralization-related) in Ahar area, NW Iran. The RPCA was applied to (a) account for the compositional nature of stream sediment geochemical data using suitable log-ratio transformation, (b) modulate the effect of outliers in component estimation and (c) derive a multivariate geochemical footprint of mineralization. The SM was applied to extract anomalous patterns of the multivariate geochemical footprint of mineralization. The exploration targets were then delineated using Student’s t-statistics analysis. The correlations of mapped exploration targets with the known mineral occurrences and mineralization-related patterns were further evaluated using normalized density index and overall accuracy analyses.

11.
Triggered by urbanization and changing land use, coastal transformation is a rapidly increasing phenomenon in the global south, with dramatic impacts on livelihoods. However, the existing literature on small-scale fisheries (SSF) has paid little attention to the way coastal transformations shape conditions for livelihoods in SSF communities. This study proposes a new orientation in SSF studies by exploring the assemblage of entangled sociomaterial processes that account for coastal transformations by investigating waterfront transformation in a fishing community in Karnataka, India. Drawing on ethnographic fieldwork, we conclude that an entanglement of sociomaterial processes produces unequal outcomes among stakeholders that subsequently reinforce the political and economic marginalization of certain groups of waterfront users. Moreover, the investigated context-specific waterfront assemblage intimately connects to the broader context of national fishery policy, urbanization, and tourism, directing the way coastal space can and should be transformed. Such an analysis contributes to the understanding of changing livelihoods in SSF communities.

12.
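Spatial information analysis techniques   Total citations: 31 (self-citations: 5; citations by others: 26)
Now that GIS technology is maturing and spatial data are abundant, exploring the mechanisms of spatial processes by analysing spatial data is becoming increasingly urgent. Spatial information analysis techniques cover at least six main areas: (1) spatial data acquisition and preprocessing; (2) spatialization of attribute data and spatial scale conversion; (3) exploratory spatial information analysis; (4) geostatistics; (5) lattice data analysis; (6) inversion and forecasting of complex information. This paper proposes a general procedure of spatial data analysis computation, result interpretation and feedback for solving specific application problems. It argues that, given the general commonalities of spatial processes and their role as a shared object of study, the various methods and techniques may eventually give rise to spatial mathematics, and that developing robust spatial analysis software packages is necessary for popularizing spatial mathematics.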

13.
A statistical analysis of two consecutive sequences of observations on radiolarian abundances in the western North Pacific, by methods appropriate to data on the simplex (i.e., compositional data), shows that although the overall graphical presentations of the frequencies appear similar, there are substantial differences in the earlier part of each of the series. The results of the multivariate analyses are used for identifying those species that contribute most to the analysis. A brief guide to the mathematical properties of compositional data is given.

14.
Two modern machine learning techniques, Linear Programming Boosting (LPBoost) and Support Vector Machines (SVMs), are introduced and applied to a geochemical dataset of niobium–tantalum (“coltan”) ores from Central Africa to demonstrate how such information may be used to distinguish ore provenance, i.e., place of origin. The compositional data used include uni- and multivariate outliers and elemental distributions are not described by parametric frequency distribution functions. The “soft margin” techniques of LPBoost and SVMs can be applied to such data. Optimization of their learning parameters results in an average accuracy of up to c. 92%, if spot measurements are assessed to estimate the provenance of ore samples originating from two geographically defined source areas. A parameterized performance measure, together with common methods for its optimization, was evaluated to account for the presence of uneven datasets. Optimization of the classification function threshold improves the performance, as class importance is shifted towards one of those classes. For this dataset, the average performance of the SVMs is significantly better compared to that of LPBoost.

15.
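The importance of spatial data and geographic information systems in urban planning and decision-making is increasingly evident. The main reason is that important demographic data and social changes often show spatial characteristics that can be recognized and explained through spatial analysis and statistical methods. A spatial classification method based on multivariate analysis is applied to compile a map of social differentiation in the São Paulo metropolitan area and to carry out related analyses. The main data come from the 2000 Brazilian national census and include 21,774 census tracts across all districts of the São Paulo metropolis and 39 municipalities. To separate the conurbation from the full data set, a hybrid technique was used for complementary analysis: polygons were drawn on a Landsat 7 image of 30 April 2000, and these identified polygons are the census tracts. The set of false-colour polygons was then obtained by visual interpretation. A spatial classification scoring procedure was used to divide these polygons into five classes and to establish the relationships among the number of census tracts, the area covered and the conurbation. This multivariate analysis method is based on equalizing the variables to produce mean values that are easy to depict on choropleth maps, which facilitates visualization and subsequent analysis of spatial distribution. The study of spatial classification based on multivariate analysis clearly reveals the most important social characteristics of the São Paulo metropolis, and shows that urban social mapping and spatial classification based on multivariate analysis have important application value in the management, public policy planning and complex decision-making of metropolitan areas.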

16.
Different calibration methods and data manipulations are being employed for quantitative paleoenvironmental reconstructions, but are rarely compared using the same data. Here, we compare several diatom-based models [weighted averaging (WA), weighted averaging with tolerance-downweighting (WAT), weighted averaging partial least squares, artificial neural networks (ANN) and Gaussian logit regression (GLR)] in different situations of data manipulation. We tested whether log-transformation of environmental gradients and square-root transformation of species data improved the predictive abilities and the reconstruction capabilities of the different calibration methods and discussed them in regard to species response models along environmental gradients. Using a calibration data set from New England, we showed that all methods adequately modelled the variables pH, alkalinity and total phosphorus (TP), as indicated by similar root mean square errors of prediction. However, WAT had lower performance statistics than simple WA and showed some unusual values in reconstruction, but setting a minimum tolerance for the modern species, such as available in the new computer program C2 version 1.4, resolved these problems. Validation with the instrumental record from Walden Pond (Massachusetts, USA) showed that WA and WAT reconstructed pH most closely and that GLR reconstructions showed the best agreement with measured alkalinity, whereas ANN and GLR models were superior in reconstructing the secondary gradient variable TP. Log-transformation of environmental gradients improved model performance for alkalinity, but not much for TP. While square-root transformation of species data improved the performance of the ANN models, it did not affect the WA models. Untransformed species data resulted in better accordance of the TP inferences with the instrumental record using WA, indicating that, in some cases, ecological information encoded in the modern and fossil species data might be lost by square-root transformation. Thus it may be useful to consider different species data transformations for different environmental reconstructions. This study showed that the tested methods are equally suitable for the reconstruction of parameters that mainly control the diatom assemblages, but that ANN and GLR may be superior in modelling a secondary gradient variable. For example, ANN and GLR may be advantageous for modelling lake nutrient levels in North America, where TP gradients are relatively short.
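Simple weighted averaging (WA), the baseline method in this comparison, can be sketched in a few lines. This is the textbook form only: each taxon's optimum is the abundance-weighted mean of the environmental variable over the training lakes, and a sample's inferred value is the abundance-weighted mean of those optima. The training data are hypothetical, and the deshrinking step normally applied in WA calibration is omitted.

```python
def wa_optima(abundances, env):
    """Abundance-weighted mean of the environmental variable for each taxon.

    abundances: one row per training site, one column per taxon.
    env: measured environmental value (e.g. pH) at each site.
    """
    n_taxa = len(abundances[0])
    optima = []
    for k in range(n_taxa):
        weight = sum(row[k] for row in abundances)
        weighted = sum(row[k] * x for row, x in zip(abundances, env))
        optima.append(weighted / weight)
    return optima

def wa_infer(sample, optima):
    """Infer the environmental value of a sample from its taxon abundances."""
    weighted = sum(y * u for y, u in zip(sample, optima))
    return weighted / sum(sample)

# Hypothetical calibration set: three lakes, three diatom taxa.
training = [
    [0.8, 0.2, 0.0],
    [0.3, 0.5, 0.2],
    [0.0, 0.3, 0.7],
]
ph = [5.0, 6.5, 8.0]

optima = wa_optima(training, ph)
inferred = wa_infer([0.1, 0.6, 0.3], optima)
print(optima, inferred)
```

A square-root transformation of the species data, as tested in the abstract, would be applied to `training` and to the fossil sample before calling these functions.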

17.
The direct trilinear decomposition method (DTDM) is an algorithm for performing quantitative curve resolution of three-dimensional data that follow the so-called trilinear model, e.g. chromatography-spectroscopy or emission-excitation fluorescence. Under certain conditions complex eigenvalues and eigenvectors emerge when the generalized eigenproblem is solved in DTDM. Previous publications never treated those cases. In this paper we show how similarity transformations can be used to eliminate the imaginary part of the complex eigenvalues and eigenvectors, thereby increasing the usefulness of DTDM in practical applications. The similarity transformation technique was first used by our laboratory to solve the similar problem in the generalized rank annihilation method (GRAM). Because unique elution profiles and spectra can be derived by using data matrices from three or more samples simultaneously, DTDM with similarity transformations is more efficient than GRAM in the case where there are many samples to be investigated.

18.
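Monitoring soil salinity in arid-zone wetlands based on apparent electrical conductivity and measured spectra   Total citations: 2 (self-citations: 0; citations by others: 2)
Taking the salinized soils along the shore of Ebinur Lake in Xinjiang as the study object, and using apparent electrical conductivity and visible/near-infrared spectral data of saline soils measured with an electromagnetic induction conductivity meter and a spectrometer, the spectral transformation and characteristic wavelengths best correlated with the soil salinity interpreted from EM38 data were selected, and soil salinity monitoring models were built using multiple stepwise regression, partial least squares regression and support vector regression. The results show that: (1) the salinity interpretation model built by combining the two apparent-conductivity measurement modes reaches a goodness of fit of 0.91, i.e. electromagnetic induction can be used for indirect monitoring of soil salinity in this area. (2) First-derivative processing outperforms second-derivative processing, and first-derivative-transformed spectra predict soil salinity well. (3) Of the three modelling methods, support vector regression has the highest modelling accuracy, followed by partial least squares regression and multiple stepwise regression. For estimating soil salinity in lakeshore wetlands in arid zones, a support vector regression model built on first-derivative spectra of the smoothed original data is recommended.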

19.
Research on processing geochemical data and identifying geochemical anomalies has made important progress in recent decades. Fractal/multi-fractal models, compositional data analysis, and machine learning (ML) are three widely used techniques in the field of geochemical data processing. In recent years, ML has been applied to model the complex and unknown multivariate geochemical distribution and extract meaningful elemental associations related to mineralization or environmental pollution. It is expected that ML will have a more significant role in geochemical mapping with the development of big data science and artificial intelligence in the near future. In this study, state-of-the-art applications of ML in identifying geochemical anomalies were reviewed, and the advantages and disadvantages of ML for geochemical prospecting were investigated. More applications are needed to demonstrate the advantage of ML in solving complex problems in the geosciences.

20.
Images can contain chemical information and many chemical methods can generate image data. For an efficient extraction of chemical data from images, data analysis techniques are necessary. It is a great advantage to be able to work on multivariate images. Many imaging techniques allow the extraction of chemical information. Inorganic analytical chemistry seems to have the longest tradition here, but organic chemistry and biochemistry may soon be catching up. Also large data arrays from non-imaging techniques can be combined with image analysis in a useful way, provided certain conditions are fulfilled.

