Dissertation

Extensions of principal components analysis

Abstract

Principal Components Analysis (PCA) is a standard tool in data analysis, widely used in data-rich fields such as computer vision, data mining, bioinformatics, and econometrics. For a set of vectors in R^n and a number k < n, the method returns a subspace of dimension k whose average squared distance to that set is as small as possible. Besides saving computation by reducing the dimension, projecting to this subspace can often reveal structure that was hidden in high dimensions. This thesis considers several novel extensions of PCA, which provably reveal hidden structure where standard PCA fails to do so. First, we consider Robust PCA, which prevents a few points, possibly corrupted by an adversary, from having a large effect on the analysis. The key idea is to alternate noise removal with projection to a constant fraction of the dimensions. When applied to noisy mixture models, the algorithm finds a subspace that is close to the pair of means that are furthest apart. By choosing and testing rando
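As an illustration of the standard PCA objective described in the abstract (not of the thesis's robust or extended algorithms), the following is a minimal sketch that assumes NumPy and uses the singular value decomposition to find a k-dimensional subspace minimizing the average squared distance to the points. The function names `best_fit_subspace` and `mean_squared_distance` are hypothetical, and the data is left uncentered so that the result is a linear subspace through the origin; many practical PCA implementations subtract the mean first.

```python
import numpy as np


def best_fit_subspace(points, k):
    """Return a (k, n) array whose orthonormal rows span the k-dimensional
    linear subspace minimizing the average squared distance to `points`.

    `points` is an (m, n) array with one vector per row. The data is not
    centered here (an assumption made for this sketch).
    """
    # Rows of vt are the right singular vectors, ordered by decreasing
    # singular value; the first k of them span the best-fit subspace.
    _, _, vt = np.linalg.svd(points, full_matrices=False)
    return vt[:k]


def mean_squared_distance(points, basis):
    """Average squared Euclidean distance from each point to span(basis)."""
    # Orthogonal projection onto the subspace spanned by the rows of basis.
    projected = points @ basis.T @ basis
    return float(np.mean(np.sum((points - projected) ** 2, axis=1)))
```

For example, given an `(m, n)` array `X`, calling `mean_squared_distance(X, best_fit_subspace(X, k))` gives the smallest average squared distance attainable by any k-dimensional linear subspace, which equals the sum of the squared trailing singular values of `X` divided by `m`.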

Keywords

Subspace topology, Principal component analysis, Algorithm, Dimension (graph theory), Hyperplane, Mathematics, Affine transformation, Computation
