Multidimensional scaling with very large datasets

Abstract: Multidimensional scaling has a wide range of applications when observations are not continuous but it is possible to define a distance (or dissimilarity) among them. However, standard implementations are limited when analyzing very large data sets because they rely on eigendecomposition of the full distance matrix and require very long computing times and large quantities of memory. Here, a new approach is developed based on projection of the observations in a space defined by a subset of the full data set. The method is easily implemented. A simulation study showed that its performance is satisfactory in different situations and that it can be run in a short time when the standard method takes a very long time or cannot be run because of memory requirements.
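The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: run classical MDS on a small subset, then place every other observation from its distances to that subset. It uses Gower's add-a-point formula (as in landmark MDS), which is an assumption and may differ from the procedure in the paper. The function names `pairwise_dist`, `fit_subset_mds`, and `project_points`, the Euclidean toy data, and the subset size are all hypothetical choices made for illustration.

```python
import numpy as np


def pairwise_dist(A, B):
    """Euclidean distances between the rows of A and the rows of B."""
    diff = A[:, None, :] - B[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def fit_subset_mds(D_sub, k):
    """Classical (Torgerson) MDS on the m x m distance matrix of the subset.

    Assumes the k largest eigenvalues are positive.
    """
    m = D_sub.shape[0]
    D2 = D_sub ** 2
    J = np.eye(m) - np.ones((m, m)) / m      # centering matrix
    B = -0.5 * J @ D2 @ J                    # double-centered Gram matrix
    w, V = np.linalg.eigh(B)                 # eigenvalues in ascending order
    w, V = w[::-1][:k], V[:, ::-1][:, :k]    # keep the k largest
    return {
        "coords": V * np.sqrt(w),            # subset coordinates
        "eigvals": w,
        "eigvecs": V,
        "mean_sq": D2.mean(axis=0),          # mean squared distance per subset point
    }


def project_points(d_to_sub, fit):
    """Place points from their distances to the subset (rows: points, cols: subset)."""
    centered = d_to_sub ** 2 - fit["mean_sq"]
    return -0.5 * centered @ fit["eigvecs"] / np.sqrt(fit["eigvals"])


# Toy data: only an n x m block of distances is ever formed, never the full n x n matrix.
rng = np.random.default_rng(0)
points = rng.normal(size=(1000, 3))
subset = rng.choice(1000, size=50, replace=False)

fit = fit_subset_mds(pairwise_dist(points[subset], points[subset]), k=3)
embedding = project_points(pairwise_dist(points, points[subset]), fit)
```

In this sketch the eigendecomposition is applied to an m x m matrix (here m = 50) rather than the full n x n matrix, which is where the savings in time and memory would come from. How the subset is chosen and how large it should be are left open here.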
Document type:
Journal article

Cited literature: 13 references

http://hal.ird.fr/ird-01920130
Contributor: Emmanuel Paradis
Submitted on: Thursday, July 11, 2019 - 08:48:52
Last modified on: Friday, July 12, 2019 - 01:18:50

File

Paradis_2018_JCGS.pdf
Files produced by the author(s)

Citation

Emmanuel Paradis. Multidimensional scaling with very large datasets. Journal of Computational and Graphical Statistics, Taylor & Francis, 2018, pp.1 - 5. ⟨10.1080/10618600.2018.1470001⟩. ⟨ird-01920130⟩
