Representing outliers for improved multi-spectral data reduction

Farnaz Agahian1, Brian Funt1, Seyed Hossein Amirshahi2
1Simon Fraser University (Canada), 2Amirkabir University of Technology (Iran)

Large multi-spectral datasets, such as those produced by multi-spectral imaging, require substantial storage, so compressing them is an important problem. A common approach is to use principal component analysis (PCA) to reduce the data requirements as part of a lossy compression strategy. In this paper, we employ the fast MCD (Minimum Covariance Determinant) algorithm, a highly robust estimator of multivariate mean and covariance, to detect outlier spectra in a multi-spectral image. We then show that removing the outliers from the main dataset significantly improves the performance of PCA in spectral compression. However, since the outlier spectra are part of the image, they cannot simply be ignored. Our strategy is to cluster the outliers into a small number of groups and then compress each group separately using its own cluster-specific PCA-derived basis. Overall, we show that this approach achieves significantly better compression.
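
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes scikit-learn's `MinCovDet` (which uses the FastMCD algorithm) for the robust estimate, a chi-square cutoff on squared robust Mahalanobis distances as the outlier rule, and k-means for grouping the outliers. The abstract specifies none of these choices. The function names, the number of components, the number of clusters, and the synthetic 8-band data are all hypothetical.

```python
import numpy as np
from scipy.stats import chi2
from sklearn.cluster import KMeans
from sklearn.covariance import MinCovDet
from sklearn.decomposition import PCA


def pca_roundtrip(X, n_components):
    """Compress X with its own PCA basis and return the reconstruction."""
    if X.shape[0] < 2:  # too few spectra to fit a basis; keep them unchanged
        return X.copy()
    k = min(n_components, X.shape[0], X.shape[1])
    pca = PCA(n_components=k).fit(X)
    return pca.inverse_transform(pca.transform(X))


def compress_with_outlier_clusters(X, n_components=3, n_clusters=2,
                                   quantile=0.975, seed=0):
    """Hypothetical helper: split off MCD outliers, then compress each part."""
    # Robust mean and covariance (scikit-learn's MinCovDet uses FastMCD).
    mcd = MinCovDet(random_state=seed).fit(X)
    d2 = mcd.mahalanobis(X)  # squared robust Mahalanobis distances
    # Assumed outlier rule: chi-square quantile with one degree per band.
    is_outlier = d2 > chi2.ppf(quantile, df=X.shape[1])

    X_hat = np.empty_like(X)
    # Main dataset: a single PCA basis fitted without the outliers.
    X_hat[~is_outlier] = pca_roundtrip(X[~is_outlier], n_components)

    # Outliers: cluster them (k-means is an assumption), one basis per cluster.
    out = X[is_outlier]
    if len(out) > 0:
        k = min(n_clusters, len(out))
        labels = KMeans(n_clusters=k, n_init=10,
                        random_state=seed).fit_predict(out)
        out_hat = np.empty_like(out)
        for c in range(k):
            out_hat[labels == c] = pca_roundtrip(out[labels == c], n_components)
        X_hat[is_outlier] = out_hat
    return X_hat, is_outlier


# Synthetic stand-in for image spectra: 500 typical spectra near a
# 3-dimensional subspace, plus two small groups of atypical spectra.
rng = np.random.default_rng(0)
n_bands = 8
basis = 0.2 * rng.normal(size=(3, n_bands))
inliers = (0.5 + rng.normal(size=(500, 3)) @ basis
           + 0.02 * rng.normal(size=(500, n_bands)))
group_a = 1.5 + 0.02 * rng.normal(size=(20, n_bands))
group_b = (0.5 + np.tile([1.0, -1.0], n_bands // 2)
           + 0.02 * rng.normal(size=(20, n_bands)))
X = np.vstack([inliers, group_a, group_b])

X_hat, is_outlier = compress_with_outlier_clusters(X)
rmse_split = np.sqrt(np.mean((X - X_hat) ** 2))
rmse_global = np.sqrt(np.mean((X - pca_roundtrip(X, 3)) ** 2))
print(f"flagged outliers: {int(is_outlier.sum())}")
print(f"RMSE, single global PCA basis: {rmse_global:.4f}")
print(f"RMSE, outliers handled separately: {rmse_split:.4f}")
```

The two printed errors compare reconstruction quality only. A fair compression comparison would also count the storage cost of the extra cluster-specific bases and of the per-pixel outlier and cluster labels, which this sketch ignores.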