Mixture model based multivariate statistical analysis of multiply censored environmental data |
| |
Institution: | 1. Department of Mining Engineering, Amirkabir University of Technology, Tehran, Iran; 2. Faculty of Engineering, Malayer University, Malayer, Iran; 3. International Atomic Energy Agency, Vienna International Center, PO Box 100, 1400, Vienna, Austria |
| |
Abstract: | Environmental data are commonly constrained by a detection limit (DL) because of the limitations of experimental apparatus. In particular, because of changes in experimental units or assay methods, the observed data are often censored by more than one DL. For convenience of analysis, measurements below the DLs are typically replaced by an arbitrary value such as zero, half of the DL, or the DL itself. However, this substitution approach is widely considered unreliable and prone to bias. In contrast, maximum likelihood estimation (MLE) methods for censored data have been developed for better performance and statistical justification. However, the existing MLE methods seldom address the multivariate context of censored environmental data, especially for water quality. This paper proposes using a mixture model to flexibly approximate the underlying distribution of the observed data, owing to its good approximation capability and generation mechanism. In particular, this study focuses mainly on the Gaussian mixture model (GMM). To cope with censored data with multiple DLs, an expectation–maximization (EM) algorithm in a multivariate setting is developed. The proposed statistical analysis approach is verified on both simulated data and real water quality data. |
| |
Keywords: | Water quality; Gaussian mixture model; Maximum likelihood estimation; Censored data; Detection limit |
This article is indexed in ScienceDirect and other databases. |
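
To make the contrast in the abstract between DL substitution and likelihood-based estimation concrete, the following is a minimal sketch of an EM iteration for left-censored data with more than one detection limit. It is a deliberately simplified illustration and not the paper's method: it assumes a single univariate Gaussian rather than a multivariate Gaussian mixture, and the function names (`censored_normal_em`, `simulate_two_limits`) and the simulated detection limits are hypothetical choices made for this example only.

```python
import math
import random


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal CDF, written with erfc for better lower-tail accuracy."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def censored_normal_em(values, limits, censored, n_iter=200, tol=1e-8):
    """EM estimate of (mu, sigma) for a single Gaussian with left-censoring.

    values[i] is used when censored[i] is False; otherwise only the fact
    that the measurement lies below limits[i] is used. Each observation may
    have its own detection limit.
    """
    n = len(values)
    # Crude starting point: substitute the DL for each censored value.
    start = [limits[i] if censored[i] else values[i] for i in range(n)]
    mu = sum(start) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in start) / n) or 1.0

    for _ in range(n_iter):
        s1 = 0.0  # sum of E[x_i]
        s2 = 0.0  # sum of E[x_i^2]
        for i in range(n):
            if censored[i]:
                # E-step: moments of a normal truncated above at the DL.
                z = (limits[i] - mu) / sigma
                lam = norm_pdf(z) / max(norm_cdf(z), 1e-300)
                m = mu - sigma * lam
                v = sigma * sigma * (1.0 - z * lam - lam * lam)
                s1 += m
                s2 += v + m * m
            else:
                s1 += values[i]
                s2 += values[i] ** 2
        # M-step: update the mean and standard deviation.
        new_mu = s1 / n
        new_sigma = math.sqrt(max(s2 / n - new_mu ** 2, 1e-12))
        converged = abs(new_mu - mu) + abs(new_sigma - sigma) < tol
        mu, sigma = new_mu, new_sigma
        if converged:
            break
    return mu, sigma


def simulate_two_limits(n, mu, sigma, dl_a, dl_b, seed=0):
    """Hypothetical data: first half censored at dl_a, second half at dl_b."""
    rng = random.Random(seed)
    values, limits, censored = [], [], []
    for i in range(n):
        x = rng.gauss(mu, sigma)
        dl = dl_a if i < n // 2 else dl_b
        is_cens = x < dl
        values.append(None if is_cens else x)
        limits.append(dl)
        censored.append(is_cens)
    return values, limits, censored
```

A typical use would be to call `simulate_two_limits(2000, 2.0, 1.0, 1.5, 2.0)` and pass the result to `censored_normal_em`, then compare the estimates with the mean obtained after replacing each censored value by half of its DL. Extending the E-step to mixture components and to partially censored multivariate observations, as the abstract describes, requires truncated multivariate normal moments and is beyond this sketch.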
|