In Search of Non-Gaussian Components of a High-Dimensional Distribution
In high-dimensional data analysis, finding non-Gaussian components is an important preprocessing step for efficient information processing. This article proposes a new linear method to identify the non-Gaussian subspace within a very general semi-parametric framework. Our proposed method, NGCA (Non-Gaussian Component Analysis), is based on the theoretical fact that, via an arbitrary nonlinear function, one can construct a vector that approximately belongs to the low-dimensional non-Gaussian subspace. Since different nonlinear functions yield different vectors, a set of such functions yields a collection of vectors that approximately spans the subspace. PCA is then applied to this collection to identify the non-Gaussian subspace. A numerical study demonstrates the usefulness of our method.
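The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the data are whitened first, the vector built from a nonlinear function h takes the Stein-identity form beta(h) = E[x h(x) - grad h(x)] (which vanishes along Gaussian directions independent of the rest), and the function family is h(x) = tanh(w . x) with random unit vectors w. The names `ngca_sketch`, `n_functions`, and `whitener` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def ngca_sketch(X, n_components, n_functions=200, seed=0):
    """Illustrative sketch: estimate a non-Gaussian subspace basis (whitened coordinates)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape

    # Whiten the data so that the sample covariance becomes the identity.
    Xc = X - X.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    whitener = evecs @ np.diag(evals ** -0.5) @ evecs.T  # symmetric inverse square root
    Y = Xc @ whitener

    # One vector per nonlinear function h_k(y) = tanh(w_k . y) (assumed family).
    betas = np.empty((n_functions, d))
    for k in range(n_functions):
        w = rng.standard_normal(d)
        w /= np.linalg.norm(w)
        t = np.tanh(Y @ w)
        # Empirical E[y h(y)] minus E[grad h(y)], where grad h(y) = w * (1 - tanh^2(w . y)).
        betas[k] = (Y * t[:, None]).mean(axis=0) - w * (1.0 - t ** 2).mean()

    # PCA on the collected vectors: leading eigenvectors of their uncentered second moment.
    _, vecs = np.linalg.eigh(betas.T @ betas)  # eigenvalues in ascending order
    basis = vecs[:, -n_components:]
    return basis, whitener


# Toy data: coordinate 0 is bimodal (non-Gaussian), the other four are standard Gaussian.
rng = np.random.default_rng(1)
n, d = 20000, 5
X = rng.standard_normal((n, d))
signs = rng.choice([-1.0, 1.0], size=n)
X[:, 0] = 0.95 * signs + 0.312 * rng.standard_normal(n)

basis, whitener = ngca_sketch(X, n_components=1)
```

In this toy setup the single column of `basis` is expected to align closely with the first coordinate axis, up to sign. Random directions with a fixed nonlinearity keep the sketch short; a more careful choice of the function family would likely give a sharper estimate.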