Geodesic PCA in the Wasserstein space by convex PCA

Abstract: We introduce the method of Geodesic Principal Component Analysis (GPCA) on the space of probability measures on the line with finite second moment, endowed with the Wasserstein metric. We discuss the advantages of this approach over a standard functional PCA of probability densities in the Hilbert space of square-integrable functions. We establish the consistency of the method by showing that the empirical GPCA converges to its population counterpart as the sample size tends to infinity. A key property in the study of GPCA is the isometry between the Wasserstein space and a closed convex subset of the space of square-integrable functions with respect to an appropriate measure. We therefore consider the general problem of PCA in a closed convex subset of a separable Hilbert space, which serves as a basis for the analysis of GPCA and is also of interest in its own right. We provide illustrative examples on simple statistical models to show the benefits of this approach for data analysis. The method is also applied to a real dataset of population pyramids.
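
As an informal illustration of the isometry mentioned in the abstract, the sketch below uses one standard realization on the line: a measure is represented by its quantile function, and the 2-Wasserstein distance becomes an L2 distance between quantile functions. This is a minimal sketch and not the paper's own construction, which may use a different reference measure. The function names `quantile_embedding` and `wasserstein2` are hypothetical, and the code assumes empirical measures with equal weights and equal sample sizes.

```python
import numpy as np


def quantile_embedding(sample):
    """Represent an equal-weight empirical measure by its quantile function
    on a uniform grid, which is simply the sorted sample (assumption of this sketch)."""
    return np.sort(np.asarray(sample, dtype=float))


def wasserstein2(sample_a, sample_b):
    """2-Wasserstein distance between two empirical measures on the line,
    computed as the L2 distance between their quantile functions."""
    qa = quantile_embedding(sample_a)
    qb = quantile_embedding(sample_b)
    if qa.shape != qb.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.sqrt(np.mean((qa - qb) ** 2)))


rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = rng.exponential(size=500)

# Translating a measure by c moves it a Wasserstein distance of |c|.
w2_shift = wasserstein2(x, x + 2.0)

# Embedded measures form a convex set: the average of two nondecreasing
# quantile vectors is again nondecreasing, hence again a quantile function.
midpoint = 0.5 * (quantile_embedding(x) + quantile_embedding(y))
midpoint_is_quantile = bool(np.all(np.diff(midpoint) >= 0))
```

In this representation, ordinary Hilbert-space tools such as averaging can be applied to the embedded data, provided the result stays inside the convex set of nondecreasing functions. That constraint is what separates a convex PCA from an unconstrained functional PCA.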

https://hal.archives-ouvertes.fr/hal-01978864
Contributor : Thierry Klein
Submitted on : Friday, January 11, 2019 - 7:04:50 PM
Last modification on : Monday, April 29, 2019 - 3:26:54 PM

File

AIHP706.pdf
Publisher files allowed on an open archive

Citation

Jérémie Bigot, Raul Gouet, Thierry Klein, Alfredo Lopez. Geodesic PCA in the Wasserstein space by convex PCA. Annales de l'Institut Henri Poincaré (B) Probabilités et Statistiques, Institut Henri Poincaré, 2017, 53 (1), pp.1-26. ⟨https://projecteuclid.org/euclid.aihp/1486544882#info⟩. ⟨10.1214/15-aihp706⟩. ⟨hal-01978864⟩
