Similar literature
 20 similar documents found (search time: 578 ms)
1.
On criteria for measures of compositional difference   Total citations: 4 (self-citations: 0; by others: 4)
Simple perceptions about the nature of compositions lead through logical necessity to certain forms of analysis of compositional data. In this paper the consequences of essential requirements of scale, perturbation and permutation invariance, together with that of subcompositional dominance, are applied to the problem of characterizing change and measures of difference between two compositions. It will be shown that one strongly advocated scalar measure of difference fails these tests of logical necessity, and that one particular form of scalar measure of difference (the sum of the squares of all possible logratio differences in the components of the two compositions), although not unique, emerges as the simplest and most tractable satisfying the criteria.
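The scalar measure named in this abstract (the sum of the squares of all possible logratio differences) can be sketched as follows. This is a minimal illustration rather than the paper's own code: the function name `logratio_sq_difference` and the sample compositions are assumptions made here, and no normalizing constant is applied because the abstract does not state one.

```python
import math
from itertools import combinations


def logratio_sq_difference(x, y):
    """Sum over all pairs i < j of (ln(x_i/x_j) - ln(y_i/y_j))**2.

    Hypothetical helper; x and y are compositions with strictly positive parts.
    """
    if len(x) != len(y):
        raise ValueError("compositions must have the same number of parts")
    if any(v <= 0 for v in list(x) + list(y)):
        raise ValueError("all parts must be strictly positive")
    total = 0.0
    for i, j in combinations(range(len(x)), 2):
        diff = math.log(x[i] / x[j]) - math.log(y[i] / y[j])
        total += diff * diff
    return total


# Illustrative three-part compositions (made-up values).
x = [0.2, 0.3, 0.5]
y = [0.1, 0.6, 0.3]
d = logratio_sq_difference(x, y)

# Scale invariance: rescaling one composition leaves the measure unchanged.
x_scaled = [10.0 * v for v in x]

# Perturbation invariance: multiplying both compositions part by part
# by the same positive vector leaves the measure unchanged.
p = [2.0, 0.5, 3.0]
x_pert = [a * b for a, b in zip(x, p)]
y_pert = [a * b for a, b in zip(y, p)]
```

Because only ratios of parts enter the sum, the measure does not depend on whether the compositions are closed to 1, 100 or any other constant.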

2.
In large multi-element regional surveys, statistically derived threshold levels of the form that define, for example, the top 2% of the data for each element as worthy of further investigation have led to the generation of inordinately large lists of geochemical samples for detailed study. This problem is compounded when a number of geological and secondary environments exist of sufficiently different character that separate thresholds should be estimated for each. Additionally, single-element thresholds for multi-element surveys can, in certain circumstances, lead to obviously out-of-character individuals not being recognized. Numerical approaches to the problem of anomaly recognition have commonly used a principal-component or regression analysis procedure as their basis. These, as indeed do all such approaches, have a common drawback in that the outliers being sought can distort the analysis being used to detect them. In addition, regression models have the further problem that there may be outliers in both the response and explanatory variables. A relatively simple approach would be to prepare a multivariate cumulative probability plot where each multi-element geochemical sample is represented as a single value. The resulting diagram would be interpreted much as a univariate probability plot where the presence of more than one straight-line segment is taken as evidence of multiple populations, and outliers as individuals or small groups are separated from the remaining data by gaps on the plot. Such a diagram may be prepared by plotting the rank-ordered values of the generalized or Mahalanobis distance, a multivariate distance measure, versus values of the chi-square statistic. This procedure is based on the covariance matrix of the data, a measure that underlies both principal-component and regression model approaches. 
In order to work effectively, a statistically robust starting covariance matrix is essential. The procedure is described in detail with two examples, one a synthetic bivariate data set containing known outliers, and the other a small, well studied stream sediment data set from Norway extensively used in methodological comparison studies. The result of the procedure is to identify statistical outliers, which are candidates for interpretation as true geochemical anomalies, and to isolate a multi-element subset that is representative of the geochemical background.
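The plot described in this abstract pairs rank-ordered squared Mahalanobis distances with chi-square quantiles. A minimal sketch for the bivariate case follows. It rests on several assumptions made here: the data points are invented, the plotting position (k - 0.5)/n is one common convention, and the covariance matrix is the ordinary sample estimate, whereas the abstract stresses that a robust starting covariance matrix is essential. For two variables the chi-square quantile has the closed form -2 ln(1 - p), so no statistics library is needed.

```python
import math


def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean.

    Uses the classical (non-robust) sample covariance; this is an assumption
    for illustration, not the robust estimate the abstract calls for.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    out = []
    for px, py in points:
        dx, dy = px - mx, py - my
        # Quadratic form with the explicit inverse of the 2x2 covariance matrix.
        out.append((syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det)
    return out


def chi_square_plot_pairs(points):
    """(chi-square quantile with 2 d.f., ordered squared distance) pairs."""
    d2 = sorted(mahalanobis_sq_2d(points))
    n = len(d2)
    return [(-2.0 * math.log(1.0 - (k - 0.5) / n), d)
            for k, d in enumerate(d2, start=1)]


# Invented bivariate data: eight background samples and one obvious outlier.
samples = [(1.0, 2.1), (1.2, 2.3), (0.9, 1.8), (1.1, 2.0), (1.3, 2.6),
           (1.0, 1.9), (0.8, 1.7), (1.2, 2.5), (4.0, 0.5)]
distances = mahalanobis_sq_2d(samples)
pairs = chi_square_plot_pairs(samples)
```

Plotting the second element of each pair against the first gives the diagram; points from a single bivariate normal population fall near a straight line, and outliers appear separated by a gap at the upper end.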

3.
Data selected from an extensive major element database of Cenozoic volcanic rocks (including calc-alkaline andesites, dacites, rhyolites, and alkali basalts) of Hungary are used to illustrate the detection and modeling of subcompositional patterns using a statistical analysis based on the assumption that relative differences between the observed values are more meaningful than absolute ones. In particular, two roughly linear compositional patterns (one associated with the alkaline basalts, the other with the calc-alkaline series) are revealed and evaluated, and it is shown how principal component analysis can be used to estimate the subcomposition at their incidental intersection point.

4.
Logratios and Natural Laws in Compositional Data Analysis   Total citations: 1 (self-citations: 0; by others: 1)
The impossibility of interpreting correlations of raw compositional components and associated statistical methods has been clearly demonstrated over the last four decades and alternative statistical methodology developed. Despite this, a return to the traditional use of raw components has been advocated recently and alternative methodology such as logratio analysis strongly criticized. This paper exposes the fallacies in this recent advocacy and demonstrates the constructive role that logratio analysis can play in geological compositional problems, in particular in the investigation of natural laws and in subcompositional investigations.

5.
It is mathematically possible to extract both R-mode and Q-mode factors simultaneously (RQ-mode factor analysis) by invoking the Eckart-Young theorem. The resulting factors will be expressed in measures determined by the form of the scalings that have been applied to the original data matrix. Unless the measures for both solutions are meaningful for the problem at hand, the factor results may be misleading or uninterpretable. Correspondence analysis uses a symmetrical scaling of both rows and columns to achieve measures of proportional similarity between objects and variables. In the literature, the resulting similarity is a χ² distance appropriate for analysis of enumerated data, the original application of correspondence analysis. Justification for the use of this measure with interval or ratio data is unconvincing, but a minor modification of the scaling procedure yields the profile similarity, which is an appropriate measure. Symmetrical scaling of rows and columns is unnecessary for RQ-mode factor analysis. If the data are scaled so the minor product W'W is the correlation matrix, the major product WW' is expressed in the Euclidean distances between objects. Therefore, RQ-mode factor analysis can be performed so that the R-mode is a principal components solution and the Q-mode is a principal coordinates solution. For applications where the magnitudes of differences are important, this approach will yield more interpretable results than will correspondence analysis.

6.
Groups of Parts and Their Balances in Compositional Data Analysis   Total citations: 7 (self-citations: 0; by others: 7)
Amalgamation of parts of a composition has been extensively used as a technique of analysis to achieve reduced dimension, as was discussed during the CoDaWork'03 meeting (Girona, Spain, 2003). It was shown to be a non-linear operation in the simplex that does not preserve distances under perturbation. The discussion motivated the introduction in the present paper of concepts such as group of parts, balance between groups, and sequential binary partition, which are intended to provide tools of compositional data analysis for dimension reduction. Key concepts underlying this development are the established tools of subcomposition, coordinates in an orthogonal basis of the simplex, balancing element and, in general, the Aitchison geometry in the simplex. Main new results are: a method to analyze grouped parts of a compositional vector through the adequate coordinates in an ad hoc orthonormal basis; and the study of balances of groups of parts (inter-group analysis) as an orthogonal projection similar to that used in standard subcompositional analysis (intra-group analysis). A simulated example compares results when testing equal centers of two populations using amalgamated parts and balances; it shows that, in certain circumstances, results from both analyses can disagree.
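A balance between two groups of parts is commonly written as sqrt(rs/(r+s)) · ln(g(x+)/g(x-)), where g is the geometric mean and r and s are the group sizes. The sketch below assumes that standard form; the abstract itself does not print the formula, and the helper names and the example composition are invented for illustration.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def balance(composition, group_plus, group_minus):
    """Balance between two disjoint groups of parts, given as index lists.

    Assumes the usual normalized log-contrast of geometric means; the
    function and argument names are hypothetical.
    """
    r, s = len(group_plus), len(group_minus)
    if r == 0 or s == 0:
        raise ValueError("both groups must be non-empty")
    if set(group_plus) & set(group_minus):
        raise ValueError("groups must be disjoint")
    g_plus = geometric_mean([composition[i] for i in group_plus])
    g_minus = geometric_mean([composition[i] for i in group_minus])
    return math.sqrt(r * s / (r + s)) * math.log(g_plus / g_minus)


# Invented four-part composition, chosen so the result is easy to check by
# hand: coefficient sqrt(2*2/4) = 1 and ln(1/e) = -1.
x = [1.0, 1.0, math.e, math.e]
b = balance(x, [0, 1], [2, 3])

# The balance is scale invariant, so closure of the composition is not needed.
b_scaled = balance([5.0 * v for v in x], [0, 1], [2, 3])
```

Unlike an amalgamation (a sum of parts), the balance is a log-contrast, which is why it behaves linearly in the Aitchison geometry the abstract refers to.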

7.
This paper is part of a larger research program which employs a mixed-methods approach to study the determinants of health at the local level using specific neighborhoods in Hamilton, Ontario, Canada. In this paper, multivariate, spatial statistical techniques and geographic information systems are used to address questions about the characterization of neighbourhoods, based on socioeconomic determinants of health and risk factors such as smoking. While neighbourhood characterization has been a component of public health surveillance for some time, geostatistical techniques can now be used to derive more accurate representation of neighbourhoods for use in subsequent analysis. We utilize principal components analysis to reduce the data and extract the components that represent the underlying local processes. Principal components are also overlaid on comparative mortality figures to visualize where the socio-demographic determinants of health correspond spatially with mortality patterns. Predicted values from the components are then analysed for spatial clustering using local indicators of spatial association. The findings reveal a pattern of distinct neighbourhoods that will be used in subsequent quantitative and qualitative stages in the larger research programme. The results can also be used to inform public health policy and to target public health interventions.

8.

Compositional data carry their relevant information in the relationships (logratios) between the compositional parts. It is shown how this source of information can be used in regression modeling, where the composition could form the response, the explanatory part, or both. An essential step in setting up a regression model is deciding how the composition(s) enter the model. Here, balance coordinates will be constructed that support an interpretation of the regression coefficients and allow for testing hypotheses of subcompositional independence. Both classical least-squares regression and robust MM regression are treated, and they are compared within different regression models on a real data set from a geochemical mapping project.

9.
Stephen Morse. Geoforum, 2005, 36(5): 625-640
Pressing global environmental problems highlight the need to develop tools to measure progress towards “sustainability.” However, some argue that any such attempt inevitably reflects the views of those creating such tools and only produces highly contested notions of “reality.” To explore this tension, we critically assess the Environmental Sustainability Index (ESI), a well-publicized product of the World Economic Forum that is designed to measure ‘sustainability’ by ranking nations on league tables based on extensive databases of environmental indicators. By recreating this index, and then using statistical tools (principal components analysis) to test relations between various components of the index, we challenge ways in which countries are ranked in the ESI. Based on this analysis, we suggest (1) that the approach taken to aggregate, interpret and present the ESI creates a misleading impression that Western countries are more sustainable than the developing world; (2) that unaccounted methodological biases allowed the authors of the ESI to over-generalize the relative ‘sustainability’ of different countries; and (3) that this has resulted in simplistic conclusions on the relation between economic growth and environmental sustainability. This criticism should not be interpreted as a call for the abandonment of efforts to create standardized comparable data. Instead, this paper proposes that indicator selection and data collection should draw on a range of voices, including local stakeholders as well as international experts. We also propose that aggregating data into final league ranking tables is too prone to error and creates the illusion of absolute and categorical interpretations.

10.
A discrete entropy-based approach is used to assess the groundwater monitoring network that exists in the Kodaganar River basin of Southern India. Since any monitoring system is essentially an information collection system, its technical design and evaluation require a quantifiable measure of information, and this measure can be derived using entropy. The use of information-based measures of the groundwater table shows that the existing monitoring network contains a sufficient number of wells but is not well designed for the measurement of groundwater level. Entropy-based results show that 15 wells are vital to measure regional groundwater level, rather than the 28 wells currently being monitored in this basin.
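The abstract does not say which entropy measures were computed, so the following is only a sketch of the two quantities most often used in this kind of network assessment: the Shannon entropy of a discretized water-level record and the information shared between two wells (mutual information, often called transinformation). It assumes the levels have already been binned into class labels; the function names and the toy records are hypothetical.

```python
import math
from collections import Counter


def entropy_bits(labels):
    """Shannon entropy, in bits, of a sequence of discrete class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def shared_information_bits(labels_a, labels_b):
    """Mutual information H(A) + H(B) - H(A, B) between two wells' records."""
    if len(labels_a) != len(labels_b):
        raise ValueError("records must cover the same observation times")
    joint = list(zip(labels_a, labels_b))
    return entropy_bits(labels_a) + entropy_bits(labels_b) - entropy_bits(joint)


# Toy records: water-level class observed at eight times in three wells.
well_a = [0, 0, 1, 1, 2, 2, 3, 3]
well_b = [0, 1, 0, 1, 0, 1, 0, 1]   # varies independently of well_a
well_c = list(well_a)               # duplicates well_a exactly

h_a = entropy_bits(well_a)
redundancy_ac = shared_information_bits(well_a, well_c)
redundancy_ab = shared_information_bits(well_a, well_b)
```

A well whose record shares almost all of its information with another well adds little to the network, which is the kind of reasoning that can lead to a recommendation to retain only a subset of wells.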

11.
It is common practice in compositional data analysis to perform the log-ratio transformation in order to preserve sub-compositional coherence in the analysis. Correspondence analysis is an alternative approach to analyzing ratio-scale data and is often contrasted with log-ratio analysis. It turns out that if one introduces a power transformation into the correspondence analysis algorithm, then the limit of the power-transformed correspondence analysis, as the power parameter tends to zero, is exactly the log-ratio analysis. Depending on how the power transformation is applied, we can obtain as limiting cases either Aitchison’s unweighted log-ratio analysis or the weighted form called “spectral mapping”. The upshot of this is that one can come as close as one likes to the log-ratio analysis, weighted or unweighted, using correspondence analysis.
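The limiting argument rests on a basic fact: a suitably scaled power transformation tends to the logarithm as the power tends to zero. The sketch below shows that fact alone, using the Box-Cox form (x^a - 1)/a, which is an assumption here since the abstract does not state the exact transformation used. It does not implement correspondence analysis.

```python
import math


def box_cox(x, alpha):
    """Box-Cox power transform (x**alpha - 1) / alpha, with ln(x) at alpha = 0.

    Computed via expm1 to stay accurate when alpha is very small.
    """
    if x <= 0:
        raise ValueError("x must be strictly positive")
    if alpha == 0:
        return math.log(x)
    return math.expm1(alpha * math.log(x)) / alpha


def power_contrast(x_i, x_j, alpha):
    """Difference of transformed parts; tends to ln(x_i / x_j) as alpha -> 0."""
    return box_cox(x_i, alpha) - box_cox(x_j, alpha)


# Gap between the power contrast and the log-ratio for shrinking alpha,
# using two invented parts 0.6 and 0.1.
target = math.log(0.6 / 0.1)
gaps = [abs(power_contrast(0.6, 0.1, a) - target) for a in (1.0, 0.1, 0.01, 0.001)]
```

The gaps shrink steadily as the power decreases, which mirrors the statement that one can come as close as one likes to log-ratio analysis.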

12.
Quantifying long-term rates of chemical weathering and physical erosion is important for understanding the long-term evolution of soils, landscapes, and Earth's climate. Here we describe how long-term chemical weathering rates can be measured for actively eroding landscapes using cosmogenic nuclides together with a geochemical mass balance of weathered soil and parent rock. We tested this approach in the Rio Icacos watershed, Puerto Rico, where independent studies have estimated weathering rates over both short and long timescales. Results from the cosmogenic/mass balance method are consistent with three independent sets of weathering rate estimates, thus confirming that this approach yields realistic measurements of long-term weathering rates. This approach can separately quantify weathering rates from saprolite and from overlying soil as components of the total. At Rio Icacos, nearly 50% of Si weathering occurs as rock is converted to saprolite; in contrast, nearly 100% of Al weathering occurs in the soil. Physical erosion rates are measured as part of our mass balance approach, making it particularly useful for studying interrelationships between chemical weathering and physical erosion. Our data show that chemical weathering rates are tightly coupled with physical erosion rates, such that the relationship between climate and chemical weathering rates may be obscured by site-to-site differences in the rate that minerals are supplied to soil by physical erosion of rock. One can normalize for variations in physical erosion rates using the “chemical depletion fraction,” which measures the fraction of total denudation that is accounted for by chemical weathering. This measure of chemical weathering intensity increases with increasing average temperature and precipitation in data from climatically diverse granitic sites, including tropical Rio Icacos and six temperate sites in the Sierra Nevada, California. 
Hence, across a wide range of climate regimes, analysis of chemical depletion fractions appears to effectively account for site-to-site differences in physical erosion rates, which would otherwise obscure climatic effects on chemical weathering rates. Our results show that by quantifying rates of physical erosion and chemical weathering together, our mass balance approach can be used to determine the relative importance of climatic and nonclimatic factors in regulating long-term chemical weathering rates.

13.
This paper examines some aspects of the power and robustness of the test for complete subcompositional independence proposed by Aitchison (1982). Although the computed test statistics commonly do not approach being χ² distributed throughout their range, the upper tail of their distribution does mimic the χ² distribution sufficiently to yield a quite robust test when variates are drawn from identical distributions with different distribution parameters or even when variates are drawn from different distributions. But the magnitude of correlations among the variables and the proportion of correlated to independent variables that compose the closed data vectors affect the power of the test.

14.
Liu Haiyan, Liu Cai, Liu Yang, Zhang Ying, Gao Fengxia. Global Geology (世界地质), 2013, 32(1): 144-152
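Seismic coherence cube technology effectively suppresses continuity and highlights discontinuity; it is more intuitive for geological interpretation than seismic slices and allows faults to be interpreted in finer detail. A coherence cube technique based on the Manhattan distance is compared with the traditional C1 coherence algorithm. The technique not only identifies faults better than the C1 algorithm, but, when the two algorithms extract fault information equally well, it also needs a shorter correlation time window than C1, which speeds up the program and improves the efficiency of coherence cube computation. In addition, the technique can approximately compute the apparent time dip of each trace in the inline and crossline directions.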

15.
Estimation of regionalized compositions: A comparison of three methods   Total citations: 1 (self-citations: 0; by others: 1)
A regionalized composition is a random vector function whose components are positive and sum to a constant at every point of the sampling region. Consequently, the components of a regionalized composition are necessarily spatially correlated. This spatial dependence, induced by the constant sum constraint, is a spurious spatial correlation and may lead to misinterpretations of statistical analyses. Furthermore, the cross-covariance matrices of the regionalized composition are singular, as is the coefficient matrix of the cokriging system of equations. Three methods of performing estimation or prediction of a regionalized composition at unsampled points are discussed: (1) the direct approach of estimating each variable separately; (2) the basis method, which is applicable only when a random function is available that can be regarded as the size of the regionalized composition under study; (3) the logratio approach, using the additive-log-ratio transformation proposed by J. Aitchison, which allows statistical analysis of compositional data. We present a brief theoretical review of these three methods and compare them using compositional data from the Lyons West Oil Field in Kansas (USA). It is shown that, although there are no important numerical differences, the direct approach leads to invalid results, whereas the basis method and the additive-log-ratio approach are comparable.
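The additive-log-ratio (alr) transformation used in the third method maps a D-part composition to D - 1 unconstrained values by taking logs of ratios to a reference part. A minimal sketch of the transform and its inverse follows; the choice of the last part as the reference and the sample composition are assumptions made here, and the spatial estimation step (kriging or cokriging of the transformed values) is not shown.

```python
import math


def alr(composition):
    """Additive log-ratio transform, using the last part as the reference."""
    if any(v <= 0 for v in composition):
        raise ValueError("all parts must be strictly positive")
    reference = composition[-1]
    return [math.log(v / reference) for v in composition[:-1]]


def alr_inverse(coordinates):
    """Map alr coordinates back to a composition that sums to 1."""
    exps = [math.exp(v) for v in coordinates]
    total = 1.0 + sum(exps)
    return [e / total for e in exps] + [1.0 / total]


# Invented three-part composition.
x = [0.2, 0.3, 0.5]
y = alr(x)                 # two unconstrained real values
x_back = alr_inverse(y)    # recovers the original composition
```

Estimates formed from the transformed values and then back-transformed always have positive parts that sum to the closure constant. Separate estimation of each raw part gives no such guarantee, which is one way to read the abstract's remark that the direct approach leads to invalid results.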

16.
The accurate delineation of area plays a key role in land change detection surveys and the classification of land cover. In a hydrologic system, watershed delineation and the detection of the boundaries among watersheds is a basic method for performing spatial analyses. Following recent advances in image processing and raster-based spatial analysis in geographic information systems, and with data easily accessible from various sources, especially remote sensing, the reliable determination of topographical boundaries is possible. Therefore, an integrated approach of data analysis and modeling can accomplish the task of delineation. The main aim of this research is to evaluate watershed boundary delineation using four different digital elevation models (DEM) including advanced spaceborne thermal emission and reflection radiometer (ASTER), Shuttle Radar Topographic Mission (SRTM), digital topography, and topographic maps. In order to determine a true reference watershed boundary, sample data were also obtained by field survey using the global positioning system (GPS). Comparison of the reference points with the results from these data showed that the average distance between the reference boundary and the ASTER result was 43 m. However, the average distance between the GPS reference and the other data was high; the difference between the reference data and SRTM was 307 m, and for the digital topographic map it was 269 m. The average distance between the topographic map and the GPS points was 304 m. For the statistical comparison, the coordinates of 230 points were determined; paired comparisons were also performed to measure the coefficient of determination, R², as well as analysis of variance in SPSS software. As a result, the R² values for the ASTER data with the digital topography and topographic map were 0.0157 and 0.171, respectively. 
The results showed that there were statistically significant differences in distances among the four means of the selected models. Therefore, compared with the other three methods, the ASTER DEM is the most suitable data source for delineating watershed boundaries, especially in rugged terrain. In addition, the stream flow directions calculated from ASTER are close to the natural tributaries and to the real positions of streams.

17.
In bench blast design, not only the technical and economical aspects, such as block size, uniformity and cost, but also the elimination of environmental problems resulting from ground vibration and air blast should be taken into consideration. Prediction of ground vibration components is of great importance when responding to and avoiding environmentally-related complaints. This paper presents the results of ground vibration measurements carried out in a celestite open-pit mine during blast optimisation studies. The particle velocity components (longitudinal, transversal, vertical and peak) and the airblast measurement results were evaluated considering the scaled distance relationship. The statistical analysis of 47 data sets yielded an empirical relationship between peak particle velocity and scaled distance. This approach, which is suggested for the present site, gives the 50% line and the upper bound 95% prediction limit with reasonable correlation.
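The abstract does not give the fitted equation, so the sketch below assumes the widely used square-root scaling, SD = R / sqrt(W) and PPV = K · SD^(-β), fitted by least squares on logarithms to obtain the 50% (best-fit) line. The function names, units and the synthetic records are all hypothetical, and the 95% upper prediction limit mentioned in the abstract is not computed.

```python
import math


def scaled_distance(distance_m, charge_kg):
    """Square-root scaled distance R / sqrt(W) (assumed form)."""
    return distance_m / math.sqrt(charge_kg)


def fit_ppv_law(records):
    """Fit PPV = K * SD**(-beta) by least squares on log-log axes.

    records: iterable of (distance_m, charge_kg, ppv) tuples.
    Returns (K, beta) for the best-fit (50%) line.
    """
    xs = [math.log(scaled_distance(r, w)) for r, w, _ in records]
    ys = [math.log(ppv) for _, _, ppv in records]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct scaled distances")
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return math.exp(y_bar - slope * x_bar), -slope


def synthetic_records(k=500.0, beta=1.5):
    """Noise-free illustrative shots generated from a known law (not site data)."""
    shots = [(50.0, 100.0), (100.0, 100.0), (200.0, 150.0),
             (400.0, 150.0), (800.0, 200.0)]
    return [(r, w, k * scaled_distance(r, w) ** (-beta)) for r, w in shots]


records = synthetic_records()
k_fit, beta_fit = fit_ppv_law(records)
```

With real measurements the points scatter about the fitted line, and a site-specific upper bound is then drawn above it so that most observations fall below the prediction.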

18.
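After preprocessing, time-domain airborne electromagnetic data still contain residual noise, which impairs the ability of electromagnetic surveying to identify subsurface anomalies. The authors propose a denoising method based on the minimum noise fraction (MNF): a set of noisy electromagnetic data is linearly transformed by a rotation matrix into minimum noise fraction components ordered by signal-to-noise ratio, and the electromagnetic data are then reconstructed from the components with the higher signal-to-noise ratios in order to separate out the noise. Denoising results on simulated data show that the minimum noise fraction not only effectively suppresses noise in late-time channel profiles but also accurately resolves anomaly information; the late-channel signal-to-noise ratio improved by 11.28 dB relative to survey-line filtering, and the noise level of the field data was reduced from ±50 nT/s to ±10 nT/s.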

19.
Epistemic uncertainties arise during the estimation of hydraulic gradients in unconfined aquifers due to planar approximation of the water table as well as data gaps arising from factors such as instrument failures and site inaccessibility. A multidimensional fuzzy least-squares regression approach is proposed here to estimate hydraulic gradients in situations where epistemic uncertainty is present in the observed water table measurements. The hydraulic head at a well is treated as a normal (Gaussian) fuzzy variable characterized by a most likely value and a spread. This treatment results in hydraulic gradients being characterized as normal fuzzy numbers as well. The multidimensional fuzzy least-squares regression has an exact analytical form and as such can be implemented easily using matrix algebra methods. However, the method was noted to be sensitive to round-off and truncation errors when the epistemic uncertainties are small. A closeness index based on the cardinality of a fuzzy number is used to evaluate how well the regression model fits the fuzzy hydraulic head observations. A fuzzy Euclidean distance measure is used to compare two fuzzy numbers and to evaluate how fuzziness in the observed hydraulic heads affects the fuzziness in the estimated hydraulic gradients. The Euclidean distance measure is also used to ascertain the influence of each well on the fuzzy hydraulic gradient estimation. The fuzzy regression framework is illustrated by applying it to evaluate hydraulic gradients in the unconfined portion of the Gulf Coast aquifer in Goliad County, TX. The results from the case-study indicate that there is greater uncertainty associated with the estimation of the hydraulic gradients in the vertical (Z-axis) direction. The epistemic uncertainties in the hydraulic head data at the wells have a significant impact on the gradient estimates when they are of the same order of magnitude as the most likely values of the observed heads. 
The influence analysis indicated that 5 of the 13 wells in the network had a critical influence on at least one of the hydraulic gradients. Three wells along the northeastern section of the study area, bordering Victoria County, were noted to have the least influence on the regression estimates. The fuzzy regression framework, along with the associated goodness-of-fit and influence measures, provides a useful set of tools to characterize the uncertainties in the hydraulic heads and gradients arising from data gaps and planar water table approximation.

20.
A Parametric Approach for Dealing with Compositional Rounded Zeros   Total citations: 2 (self-citations: 0; by others: 2)
In this work, a parametric approach for replacing data below the detection limit, also known as rounded zeros, in compositional data sets is proposed. Compositional rounded zeros correspond to small proportions of some whole that cannot be reliably detected by the analytical instruments under given operating conditions. Zeros of this kind appear frequently in the data collection process in geosciences. They must be treated in an adequate way before any multivariate analysis can be applied. Our procedure results from a modification of the Expectation-Maximization (EM) algorithm and is based on the additive log-ratio transformation. Its coherence with the nature of compositional data and with basic operations in the simplex sample space is checked. Using real data sets, we find that this approach improves on other parametric and non-parametric techniques for compositional rounded zeros.
