Similar literature
 20 similar documents found (search time: 641 ms)
1.

Compositional data carry their relevant information in the relationships (logratios) between the compositional parts. It is shown how this source of information can be used in regression modeling, where the composition could form the response, the explanatory part, or both. An essential step in setting up a regression model is deciding how the composition(s) enter the model. Here, balance coordinates are constructed that support an interpretation of the regression coefficients and allow for testing hypotheses of subcompositional independence. Both classical least-squares regression and robust MM regression are treated, and they are compared within different regression models on a real data set from a geochemical mapping project.


2.
On the Interpretation of Orthonormal Coordinates for Compositional Data (total citations: 1, self-citations: 0, citations by others: 1)
The simplex with the Aitchison geometry is a natural sample space for compositional data, that is, observations carrying only relative information (especially proportions, percentages, etc., often occurring in the geosciences). For this reason, standard statistical methods that rely on the Euclidean structure of real space cannot be used directly for statistical analysis. First, compositional data need to be expressed in coordinates of an orthonormal basis on the simplex (with respect to the Aitchison geometry). The mathematical interpretation of the orthonormal coordinates is derived from the procedure by which they are constructed (called sequential binary partition), and they act as balances between groups of compositional parts. The goal of this paper is to describe the covariance structure of coordinates and, consequently, to provide a complementary interpretation based on log-ratios of parts of the original composition. It must be noted that, in a composition, the ratios themselves contain all the relevant information. The possibilities as well as the limitations of this approach are demonstrated through illustrative examples.

3.
Mathematical Geosciences - In the approach to compositional data analysis originated by John Aitchison, a set of linearly independent logratios (i.e., ratios of compositional parts, logarithmically...

4.
Groups of Parts and Their Balances in Compositional Data Analysis (total citations: 7, self-citations: 0, citations by others: 7)
Amalgamation of parts of a composition has been extensively used as a technique of analysis to achieve reduced dimension, as was discussed during the CoDaWork'03 meeting (Girona, Spain, 2003). It was shown to be a non-linear operation in the simplex that does not preserve distances under perturbation. The discussion motivated the introduction in the present paper of concepts such as group of parts, balance between groups, and sequential binary partition, which are intended to provide tools of compositional data analysis for dimension reduction. Key concepts underlying this development are the established tools of subcomposition, coordinates in an orthogonal basis of the simplex, balancing element and, in general, the Aitchison geometry in the simplex. The main new results are: a method to analyze grouped parts of a compositional vector through the adequate coordinates in an ad hoc orthonormal basis; and the study of balances of groups of parts (inter-group analysis) as an orthogonal projection similar to that used in standard subcompositional analysis (intra-group analysis). A simulated example compares results when testing equal centers of two populations using amalgamated parts and balances; it shows that, in certain circumstances, results from the two analyses can disagree.
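
The balance between two groups of parts mentioned above is commonly written as sqrt(r*s/(r+s)) * ln(g(numerator group)/g(denominator group)), where r and s are the group sizes and g is the geometric mean. The following is a minimal Python sketch of that textbook formula, not code from the paper. The function names and the +1/-1/0 sign-matrix encoding of the sequential binary partition are assumptions made for illustration.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def balance(composition, numerator_idx, denominator_idx):
    """Balance between two non-overlapping groups of parts."""
    r, s = len(numerator_idx), len(denominator_idx)
    g_num = geometric_mean([composition[i] for i in numerator_idx])
    g_den = geometric_mean([composition[i] for i in denominator_idx])
    return math.sqrt(r * s / (r + s)) * math.log(g_num / g_den)


def sbp_balances(composition, sbp):
    """One balance per row of a sequential binary partition.

    Each row of `sbp` (hypothetical encoding) marks a part with
    +1 (numerator group), -1 (denominator group) or 0 (not involved).
    """
    coords = []
    for row in sbp:
        num = [i for i, sign in enumerate(row) if sign > 0]
        den = [i for i, sign in enumerate(row) if sign < 0]
        coords.append(balance(composition, num, den))
    return coords


# Example: a 4-part composition and one possible partition.
x = [0.1, 0.2, 0.3, 0.4]
partition = [
    [+1, +1, -1, -1],  # parts {0, 1} against parts {2, 3}
    [+1, -1, 0, 0],    # part 0 against part 1
    [0, 0, +1, -1],    # part 2 against part 3
]
print(sbp_balances(x, partition))
```

Because each balance depends only on ratios of parts, rescaling the whole composition (for example from proportions to percentages) leaves the coordinates unchanged.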

5.
6.
Chemical reactions in aqueous geochemical systems are driven by nonequilibrium conditions, and their dynamics can be deduced through the distributional analysis (identification of probability laws) of complex compositional indices. In this perspective, compositional data analysis offers the possibility to investigate the behavior of the composition as a whole instead of isolated chemical species, with the awareness that multispecies systems are characterized by the simultaneous interactions among all their parts. We addressed this problem using D−1 isometric log-ratio coordinates describing the D-part compositional dataset of the river chemistry of the Alpine region (D being the number of variables), thus working in the \(\mathbb{R}^{D-1}\) statistical sample space. The D−1 coordinates were chosen using the decreasing variance criterion so that each one could provide information about different space–time properties for the investigated geochemical system. Coordinates dominated by heterogeneity appear to be able to capture regime shifts only over a long time period and to monitor processes on a very wide scale. On the other hand, coordinates characterized by lower variability present multimodality, thus capturing the presence of alternative states in the analyzed spatial domain also for the current time. Further developments are needed to determine the ranges of conditions for which variability and other statistics can be useful indicators of regime shifts on different time–space scales in geochemical systems.

7.
Correlation Analysis for Compositional Data (total citations: 1, self-citations: 0, citations by others: 1)
Compositional data need special treatment prior to correlation analysis. In this paper we argue why standard transformations for compositional data are not suitable for computing correlations, and why the use of raw or log-transformed data is not meaningful either. As a solution, a procedure based on balances is outlined, leading to sensible correlation measures. The construction of the balances is demonstrated using a real data example from geochemistry. It is shown that the considered correlation measures are invariant with respect to the choice of the binary partitions forming the balances. Robust counterparts to the classical, non-robust correlation measures are introduced and applied. By using appropriate graphical representations, it is shown how the resulting correlation coefficients can be interpreted.

8.
The statistical analysis of compositional data based on logratios of parts is not suitable when zeros are present in a data set. Nevertheless, if there is interest in using this modeling approach, several strategies have been published in the specialized literature which can be used. In particular, substitution or imputation strategies are available for rounded zeros. In this paper, existing nonparametric imputation methods, both for the additive and the multiplicative approach, are revised and essential properties of the last method are given. For missing values a generalization of the multiplicative approach is proposed.
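
As an illustration of the multiplicative approach to rounded zeros, here is a minimal Python sketch of the commonly cited form of the rule: each zero is replaced by a small value, and the nonzero parts are shrunk by a common factor so that the total is preserved. The function name, the single shared replacement value `delta`, and the example numbers are assumptions for illustration; the abstract does not give them.

```python
def multiplicative_replacement(composition, delta, total=1.0):
    """Replace zeros by `delta` and rescale nonzero parts multiplicatively.

    Assumes the composition sums to `total` and that the same small value
    `delta` (hypothetical choice, e.g. a fraction of the detection limit)
    is used for every zero.
    """
    n_zeros = sum(1 for v in composition if v == 0)
    # Common shrink factor applied to every nonzero part.
    shrink = 1.0 - n_zeros * delta / total
    return [delta if v == 0 else v * shrink for v in composition]


x = [0.0, 0.2, 0.3, 0.5]
replaced = multiplicative_replacement(x, delta=0.01)
print(replaced)
print(sum(replaced))
```

Since every nonzero part is multiplied by the same factor, ratios between nonzero parts are left unchanged, which is the usual argument for preferring this rule over an additive adjustment.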

9.
Compositional data analysis requires selecting an orthonormal basis with which to work on coordinates. In most cases this selection is based on a data-driven criterion. Principal component analysis provides bases that are, in general, functions of all the original parts, each with a different weight, hindering their interpretation. For interpretative purposes, it would be better to have each basis component as a ratio or balance of the geometric means of two groups of parts, leaving irrelevant parts with a zero weight. This is the role of principal balances, defined as a sequence of orthonormal balances which successively maximize the explained variance in a data set. The new algorithm to compute principal balances requires an exhaustive search along all the possible sets of orthonormal balances. To reduce computational time, the sets of possible partitions for up to 15 parts are stored. Two other suboptimal, but feasible, algorithms are also introduced: (i) a new search for balances following a constrained principal component approach and (ii) the hierarchical cluster analysis of variables. The latter is a new approach based on the relation between the variation matrix and the Aitchison distance. The properties and performance of these three algorithms are illustrated using a typical data set of geochemical compositions and a simulation exercise.

10.
Isometric Logratio Transformations for Compositional Data Analysis (total citations: 37, self-citations: 0, citations by others: 37)
Geometry in the simplex has been developed in the last 15 years mainly based on the contributions due to J. Aitchison. The main goal was to develop analytical tools for the statistical analysis of compositional data. Our present aim is to get a further insight into some aspects of this geometry in order to clarify the way for more complex statistical approaches. This is done by way of orthonormal bases, which allow for a straightforward handling of geometric elements in the simplex. The transformation into real coordinates preserves all metric properties and is thus called isometric logratio transformation (ilr). An important result is the decomposition of the simplex, as a vector space, into orthogonal subspaces associated with nonoverlapping subcompositions. This gives the key to join compositions with different parts into a single composition by using a balancing element. The relationship between ilr transformations and the centered-logratio (clr) and additive-logratio (alr) transformations is also studied. Exponential growth or decay of mass is used to illustrate compositional linear processes, parallelism and orthogonality in the simplex.
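
The isometry claim can be checked numerically. The sketch below, in Python, computes clr coefficients and ilr coordinates for one particular orthonormal basis and compares Euclidean distances in both with the Aitchison distance computed directly from pairwise logratios. The specific basis (each coordinate contrasts the geometric mean of the first i parts with part i+1) is an assumption chosen for simplicity, as are the function names and the example compositions; the paper treats general orthonormal bases.

```python
import math


def clr(x):
    """Centered logratio coefficients: log parts minus their mean."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def ilr(x):
    """ilr coordinates in one assumed orthonormal basis.

    Coordinate i contrasts the geometric mean of parts 1..i with part i+1.
    """
    logs = [math.log(v) for v in x]
    coords = []
    for i in range(1, len(x)):
        mean_first = sum(logs[:i]) / i
        coords.append(math.sqrt(i / (i + 1)) * (mean_first - logs[i]))
    return coords


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def aitchison_distance(x, y):
    """Aitchison distance computed from all pairwise logratios."""
    d = len(x)
    total = 0.0
    for i in range(d):
        for j in range(d):
            diff = math.log(x[i] / x[j]) - math.log(y[i] / y[j])
            total += diff ** 2
    return math.sqrt(total / (2 * d))


x = [0.1, 0.2, 0.7]
y = [0.3, 0.3, 0.4]
print(aitchison_distance(x, y))
print(euclidean(clr(x), clr(y)))
print(euclidean(ilr(x), ilr(y)))
```

The three printed distances should agree up to floating-point error, with the ilr vectors having D−1 components against D for clr.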

11.
The current theoretical development of the analysis of compositional data in the article by Aitchison and Egozcue dismisses Harker’s variation diagrams and other similar plots as “meaningless” or “useless” for compositional data. In this work, it is shown that variation diagrams are essentially not a correlation tool but a graphical representation of the mass action and mass balance principles in the context of a given geological system, and that, when used correctly, they provide vital information for the igneous petrologist. The qualitative validity of the “spurious trends” in these diagrams is also shown, when they are interpreted in their proper geological framework. The example previously used by Rollinson to test the usefulness of the log-ratio transformation in the Aitchison and Egozcue article is revisited here in order to fully illustrate the proper use of this tool.

12.
In recognizing that a composition, such as a major oxide or sediment composition, provides information only about the relative, not the absolute, magnitudes of its components, this paper exposes the compositional variation array as the simplest and minimum way of summarizing the pattern of variability within a compositional data set. Such summaries are free of the notorious hazards of the constant-sum constraint and when depicted in relative variation diagrams can often provide substantial insights into the nature of the compositional variability. Concepts and practice are illustrated by reference to a number of real data sets.
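
A variation array in this sense collects, for every pair of parts, the mean and the variance of the log-ratio ln(x_i/x_j) across samples. A minimal Python sketch follows; the function name, the use of the sample variance, and the toy data are assumptions for illustration and are not taken from the paper.

```python
import math
from statistics import mean, variance


def variation_array(data):
    """Pairwise log-ratio means and variances for rows of positive parts.

    Returns (means, variances), two D x D nested lists where entry [i][j]
    summarizes ln(x_i / x_j) over all samples. Needs at least two samples.
    """
    d = len(data[0])
    means = [[0.0] * d for _ in range(d)]
    variances = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(d):
            logratios = [math.log(row[i] / row[j]) for row in data]
            means[i][j] = mean(logratios)
            variances[i][j] = variance(logratios)
    return means, variances


# Toy data: three samples of a 3-part composition (made-up numbers).
samples = [
    [0.2, 0.3, 0.5],
    [0.1, 0.4, 0.5],
    [0.3, 0.3, 0.4],
]
log_means, log_variances = variation_array(samples)
for row in log_variances:
    print(row)
```

Because only ratios enter, the array is the same whether each sample is expressed in proportions, percentages, or ppm, which is why such a summary avoids the constant-sum problem.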

13.
This paper addresses three intractable difficulties associated with the statistical analysis of compositional data, such as percentages or ppm. These are: (1) that such data do not follow multivariate normal distributions, thus rendering standard parametric statistical tests and estimation procedures inappropriate, (2) that the covariance/correlation coefficients between specific pairs of components are determined in whole or in part by the presence or absence of other components, and (3) the negative bias property. That is, at least one covariance, and therefore at least one correlation, must be negative, hence the remaining correlations are prevented from ranging freely between −1 and +1. It follows that correlation coefficients formed from compositional data are not only not absolute, but also frequently spurious. Standard multivariate procedures based on them are unreliable, and intrinsic associations between components, inferred from strong positive correlations in particular, are potentially false. In a recent 2009 paper, it was reported that 59 surface sediment samples from 7 regions in the Polish exclusive economic zone had been chemically analyzed for 16 elements. Enrichment factors together with crude correlation coefficients between selected elements were presented. All these quantities were computed from the initial raw compositional data resulting from the chemical analyses. In this paper, a statistical procedure is presented which is distinctly different from the enrichment factor computations based on the same raw compositional data. The procedure generates a log-ratio measure of the abundance of each element in each of the seven regions, thus enabling comparisons of relative levels of pollution between the regions.
Although the two techniques are quite unrelated, it is shown that, in general, extremely high or low measures of the relative abundances in the regions are associated with correspondingly high or low values of the enrichment factors in the same regions that were reported in the 2009 paper. That is, the statistical analysis confirms the results of the enrichment factor data in the identification of the most to the least polluted regions. In an additional analysis, the residue term was excluded from each sediment sample by rescaling the 16 element concentrations to sum to 100%, thus forming 59 residue-free sub-compositions. Crude correlation coefficients were computed for pairs of elements of this sub-compositional data. These revealed that certain correlations based on the initial raw data that were reported in the 2009 paper for the same pairs of elements were not only inconsistent, but sometimes also contradictory. Such contradictions imply that intrinsic geochemical element associations inferred in that paper from such correlations were false.
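
The negative bias property in point (3) follows directly from the constant-sum constraint. A short derivation is sketched below, assuming a D-part composition with parts x_1, ..., x_D, a constant sum c, and a part x_i with nonzero variance. These symbols are notation chosen here and do not appear in the abstract.

```latex
\begin{align*}
\sum_{j=1}^{D} x_j = c
  &\;\Longrightarrow\;
  \operatorname{cov}\Bigl(x_i,\; \sum_{j=1}^{D} x_j\Bigr) = 0 \\
  &\;\Longrightarrow\;
  \operatorname{var}(x_i) + \sum_{j \neq i} \operatorname{cov}(x_i, x_j) = 0 \\
  &\;\Longrightarrow\;
  \sum_{j \neq i} \operatorname{cov}(x_i, x_j) = -\operatorname{var}(x_i) < 0 .
\end{align*}
```

The off-diagonal covariances in row i therefore sum to a negative number, so at least one of them must be negative regardless of any real association between the parts.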

14.
The log ratio methodology converts compositional data, such as concentrations of chemical elements in a rock, from their original Aitchison geometry to interpretable real orthonormal coordinates, thereby allowing meaningful statistical processing and visualization. However, it must be taken into account that the original concentrations can be flawed by detection limit or imprecision problems that can severely affect the resulting coordinates. This paper aims to construct such orthonormal log ratio coordinates, called weighted pivot coordinates, that capture the relevant relative information about an original component and treat the redundant information in a controlled manner. Theoretical developments are supported by a thorough simulation study. Weighted pivot coordinates are then applied to the geochemical mapping of catchment outlet sediments from the National Geochemical Survey of Australia, illustrating their advantage over possible alternatives.

15.
Endogenic events in the form of intrusive activity and regional metamorphism developed asynchronously in various parts of the Svecofennian Orogen of Fennoscandia. The Early and Late Svecofennian stages of regional high-temperature metamorphism and related plutonism are distinguished from isotopic evidence. The composition, structural features, and asynchronous peaks of endogenic activity within the orogen indicate that at least two zones (inner and outer) should be distinguished in the Svecofennides. The lateral heterogeneity of the orogen in present-day coordinates is traced southward from the margin of the Archean craton. The conjugation zone of the Svecofennian Orogen and the Archean Karelian Craton is characterized by transition from negative to positive εNd (1.9 Ga) values as evidence for a decreased contribution of Archean crustal material to the source of Proterozoic granitoids from the north toward the Proterozoic domain in the south. With allowance for lateral compositional and isotopic heterogeneity of the Svecofennian Orogen and asynchronous culmination of endogenic events in different parts of this orogen, a new scheme of tectonic regionalization has been proposed.

16.
Compositional Geometry and Mass Conservation (total citations: 1, self-citations: 0, citations by others: 1)
A geometrical structure is imposed on compositional data by physical and chemical laws, principally mass conservation. Therefore, statistical or mathematical investigation of possible relations between data values and such laws must be consistent with this structure. This demands that geometrical concepts, such as points that specify both mass and composition in linear space, and lines in projective space that specify composition only, be clearly defined and consistent with mass conservation. Mass thus becomes the norm in composition space in place of the Euclidean norm of ordinary space. Coordinate transformations inconsistent with this geometry are accordingly unnatural and misleading. They are also unnecessary because correlation arising from the constant mass presents no unusual difficulty in the analysis of the underlying quadratic form.

17.
The aim of this contribution is to explore the relationship among some concepts often considered to be unrelated, such as weathering reactions, compositional data and fractals, by means of distribution analysis. Weathering reactions represent the necessary transfer of heat and entropy to the environment in geochemical cycles. Compositional data express the relative abundance of chemical elements/species in a given total (i.e. volume or weight). Fractals are temporal or spatial objects with self-similarity and scale-invariance, so that internal structures repeat themselves over multiple levels of magnification or scales of measurement. Gibbs free energy and the application of the Law of Mass Action can be used to model weathering reactions, under the hypothesis of chemical equilibrium. Compositional data are obtained in the analytical phase after the determination of the concentrations of chemicals in sampled solid, liquid or gaseous materials. Fractals can be measured by using their fractal dimensions. In this paper, the presence of fractal structures is observed when the frequency distribution of isometric log-ratio coordinates is investigated, showing the logarithm of the cumulative number of samples exceeding a certain coordinate value plotted against the coordinate value itself. Isometric log-ratio coordinates (or balances) were constructed by using the sequential binary partition (SBP) method. The balances were identified to maintain, as far as possible, the similarity with a corresponding weathering reaction affecting the Arno river catchment (Tuscany, central Italy) as described by the Law of Mass Action. The emergence of fractal structures indicates the presence of dissipative systems, which require complexity, large numbers of inter-connected elements and stochasticity.

18.
In this paper we show that thermodynamic forward modelling, using Gibbs energy minimisation with consideration of element fractionation into refractory phases and/or liberated fluids, is able to extract information about the complex physical and chemical evolution of a deeply subducted rock volume. By comparing complex compositional growth zonations in garnets from high- and ultra-high pressure samples with those derived from thermodynamic forward modelling, we gain insight into the effects of element fractionation on the composition and modes of the co-genetic metamorphic phase assemblage. Our results demonstrate that fractionation effects cause discontinuous growth and re-crystallisation of metamorphic minerals in high pressure rocks. Reduced or hindered mineral growth at UHP conditions can control the inclusion and preservation of minerals indicative of UHP metamorphism, such as coesite, thus masking peak pressure conditions reached in subducted rocks. Further, our results demonstrate that fractional garnet crystallisation leads to strong compositional gradients and step-like zonation patterns in garnet, a feature often observed in high- and ultra-high pressure rocks. Thermodynamic forward modelling allows the interpretation of commonly observed garnet growth zonation patterns in terms of garnet forming reactions and the relative timing of garnet growth with respect to the rock's pressure–temperature path. Such a correlation is essential for the determination of tectonic and metamorphic rates in subduction zones as well as for the understanding of trace element signatures in subduction related rocks. It therefore should be commonplace in the investigation of metamorphic processes in subduction zones.

19.
Differential Models for Evolutionary Compositions (total citations: 1, self-citations: 1, citations by others: 0)
General systems are frequently decomposable into parts and these parts can evolve in time or space, a frequent occurrence in the field of Geosciences. In most cases, fitting models to forecast future states of the system is a goal of the analysis. Modelling interactions between parts may also be of common interest. The system can be analysed from different points of view; the traditional one consists in modelling each part of the system in time. Alternatively, modelling the evolution of the parts as proportions is proposed herein and attention is centred on the compositional evolution. The compositions are expressed in orthogonal coordinates (ilr) and then modelled using first-order differential equations with constant coefficients. Simple models are shown to be very flexible, including many of the standard growth curve models. The models are fitted using regression techniques on the integrated coordinates. The use and interpretation of these differential models is illustrated with several examples: a simulated example; urban waste in Catalonia (Spain); oil production and reserves; and growth of a luzonite crystal.

20.
This work focuses on the characterization of the central tendency of a sample of compositional data. It provides new results about theoretical properties of means and covariance functions for compositional data, with an axiomatic perspective. Original results that shed new light on geostatistical modeling of compositional data are presented. As a first result, it is shown that the weighted arithmetic mean is the only central tendency characteristic satisfying a small set of axioms, namely continuity, reflexivity, and marginal stability. Moreover, this set of axioms also implies that the weights must be identical for all parts of the composition. This result has deep consequences for spatial multivariate covariance modeling of compositional data. In a geostatistical setting, it is shown as a second result that the proportional model of covariance functions (i.e., the product of a covariance matrix and a single correlation function) is the only model that provides identical kriging weights for all components of the compositional data. As a consequence of these two results, the proportional model of covariance function is the only covariance model compatible with reflexivity and marginal stability.
