Geodesic PCA in the Wasserstein space

Abstract: We introduce the method of Geodesic Principal Component Analysis (GPCA) on the space of probability measures on the line with finite second moment, endowed with the Wasserstein metric. We discuss the advantages of this approach over a standard functional PCA of probability densities in the Hilbert space of square-integrable functions. We establish the consistency of the method by showing that the empirical GPCA converges to its population counterpart as the sample size tends to infinity. A key property in the study of GPCA is the isometry between the Wasserstein space and a closed convex subset of the space of square-integrable functions, with respect to an appropriate measure. Therefore, we consider the general problem of PCA in a closed convex subset of a separable Hilbert space, which serves as a basis for the analysis of GPCA and is also of interest in its own right. We provide illustrative examples on simple statistical models to show the benefits of this approach for data analysis. The method is also applied to a real dataset of population pyramids.
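
Illustration (not taken from the paper): the isometry mentioned in the abstract can be checked numerically in one standard form. For probability measures on the line, the 2-Wasserstein distance equals the L2 distance on (0, 1) between their quantile functions, and quantile functions, being nondecreasing, form a convex set. The Python sketch below assumes NumPy and approximates the integral on a midpoint grid; the function names `quantile_embedding` and `wasserstein2` and the grid size are illustrative choices, and the construction used in the article itself may differ.

```python
import numpy as np


def quantile_embedding(sample, grid):
    """Map a 1-D sample to its (interpolated) empirical quantile function on `grid`.

    Assumption: np.quantile's default linear interpolation is used as a
    smooth stand-in for the step-shaped empirical quantile function.
    """
    return np.quantile(np.asarray(sample, dtype=float), grid)


def wasserstein2(sample_a, sample_b, n_grid=1000):
    """Approximate W2 between two 1-D empirical measures.

    Uses the identity W2^2 = integral over (0, 1) of (Qa(t) - Qb(t))^2 dt,
    discretised on a midpoint grid of n_grid points.
    """
    grid = (np.arange(n_grid) + 0.5) / n_grid  # midpoints of (0, 1)
    qa = quantile_embedding(sample_a, grid)
    qb = quantile_embedding(sample_b, grid)
    return float(np.sqrt(np.mean((qa - qb) ** 2)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.normal(size=200)
    # Translating a measure by c moves it a W2 distance of |c|.
    print(wasserstein2(a, a + 3.0))
```

Translating a sample by 3 shifts its quantile function by 3 everywhere, so the printed value should be numerically equal to 3. Once measures are represented this way, ordinary Hilbert-space tools apply to the embedded curves, with the caveat noted in the abstract that the embedded set is only a closed convex subset, not the whole space.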
Document type:
Journal article
Annales de l'Institut Henri Poincaré (B) Probabilités et Statistiques, Institute Henri Poincaré, 2017, 53 (1), pp.1-26. 〈https://projecteuclid.org/euclid.aihp/1486544882#info〉

https://hal.archives-ouvertes.fr/hal-01973272
Contributor: Thierry Klein
Submitted on: Wednesday, 16 January 2019 - 15:01:24
Last modified on: Thursday, 21 February 2019 - 10:52:48

Identifiers

  • HAL Id: hal-01973272, version 1

Citation

Jérémie Bigot, Raul Gouet, Thierry Klein, Alfredo Lopez. Geodesic PCA in the Wasserstein space. Annales de l'Institut Henri Poincaré (B) Probabilités et Statistiques, Institute Henri Poincaré, 2017, 53 (1), pp.1-26. 〈https://projecteuclid.org/euclid.aihp/1486544882#info〉. 〈hal-01973272〉
