Testing for Multivariate Outliers in the Presence of Missing Data
Authors: W. A. Woodward, S. R. Sain, H. L. Gray, B. Zhao, M. D. Fisk
Institution:(1) Lake Erie Biological Station, US Geological Survey, Great Lakes Science Center, 6100 Columbus Avenue, Sandusky, OH 44870, USA;(2) Fertility Center of Las Vegas, Las Vegas, NV 89119, USA
Abstract: We consider the problem of multivariate outlier testing for the purpose of distinguishing seismic signals of underground nuclear events from training samples based on non-nuclear seismic events when certain data are missing. We consider the case in which the training data follow a multivariate normal distribution. Assume a potential outlier is observed on which k features of interest are measured. Assume further that a training set of n observations on these k features is available, but that some of the observations in the training data have missing features. The approach currently used in practice is to perform the outlier test with a generalized likelihood ratio test procedure based only on the training vectors with complete data. When a substantial amount of data is missing within the training set, this strategy may lead to a loss of valuable information. An alternative procedure is to incorporate all n of the training vectors, using the EM algorithm to handle the missing data in the training set appropriately. Resampling methods are used to find appropriate critical regions. We use simulation results and analysis of models fit to Pg/Lg ratios for the WMQ station in China to compare these two strategies for dealing with missing data.
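The alternative procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes missing entries are coded as NaN, uses the squared Mahalanobis distance of the candidate observation as a stand-in test statistic (the abstract does not give the exact form of the generalized likelihood ratio statistic), and uses a parametric bootstrap as one possible resampling scheme. The function names `em_mvn`, `mahalanobis_sq`, and `bootstrap_critical_value` are hypothetical.

```python
import numpy as np

def em_mvn(X, n_iter=100, tol=1e-6):
    """EM estimates of the mean and covariance of multivariate normal
    data in which missing entries are coded as NaN (an assumption)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    miss = np.isnan(X)
    mu = np.nanmean(X, axis=0)                  # start from observed-data moments
    sigma = np.diag(np.nanvar(X, axis=0))
    for _ in range(n_iter):
        X_hat = X.copy()
        C = np.zeros((k, k))                    # summed conditional covariances
        for i in range(n):
            m = miss[i]
            if not m.any():
                continue                        # complete row: nothing to impute
            o = ~m
            if not o.any():                     # row with no observed features
                X_hat[i] = mu
                C += sigma
                continue
            S_oo = sigma[np.ix_(o, o)]
            S_mo = sigma[np.ix_(m, o)]
            B = np.linalg.solve(S_oo, S_mo.T).T  # S_mo @ inv(S_oo)
            # E-step: conditional mean and covariance of the missing part
            X_hat[i, m] = mu[m] + B @ (X[i, o] - mu[o])
            C[np.ix_(m, m)] += sigma[np.ix_(m, m)] - B @ S_mo.T
        # M-step: update the mean and the (maximum likelihood) covariance
        mu_new = X_hat.mean(axis=0)
        D = X_hat - mu_new
        sigma_new = (D.T @ D + C) / n
        done = (np.max(np.abs(mu_new - mu)) < tol
                and np.max(np.abs(sigma_new - sigma)) < tol)
        mu, sigma = mu_new, sigma_new
        if done:
            break
    return mu, sigma

def mahalanobis_sq(x, mu, sigma):
    """Squared Mahalanobis distance of x from (mu, sigma); stand-in statistic."""
    d = np.asarray(x, dtype=float) - mu
    return float(d @ np.linalg.solve(sigma, d))

def bootstrap_critical_value(X, alpha=0.05, n_boot=200, seed=0):
    """Parametric-bootstrap critical value for the statistic under the
    no-outlier hypothesis, reusing the observed missingness pattern."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    miss = np.isnan(X)
    n = X.shape[0]
    mu, sigma = em_mvn(X)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        Z = rng.multivariate_normal(mu, sigma, size=n + 1)
        train = Z[:n].copy()
        train[miss] = np.nan                    # impose the same missing entries
        mu_b, sigma_b = em_mvn(train)
        stats[b] = mahalanobis_sq(Z[n], mu_b, sigma_b)  # held-out "new" event
    return float(np.quantile(stats, 1 - alpha))
```

A candidate event `x` would be flagged as an outlier when `mahalanobis_sq(x, *em_mvn(X))` exceeds `bootstrap_critical_value(X)`. The complete-case strategy mentioned in the abstract corresponds to dropping every training row that contains a NaN before estimating the mean and covariance, which discards the partially observed rows that the EM fit retains.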
This article is indexed in SpringerLink and other databases.