Dimensionality reduction for data of unknown cluster structure
PBN-AR
Institution
Instytut Podstaw Informatyki Polskiej Akademii Nauk
Basic information
Main language of publication
English
Journal
INFORMATION SCIENCES
ISSN
0020-0255
EISSN
Publisher
ELSEVIER SCIENCE INC
DOI
URL
Year of publication
2016
Issue number
Vol. 330
Pages
74–87
Volume number
DOI identifier
Number of publisher's sheets
Authors
Other authors
+ 1
Keywords
English
Dimensionality reduction
Gaussian mixture model
Fisher’s subspace
Principal component analysis
Abstracts
Language
English
Content
Dimensionality reduction that preserves certain characteristics of data is needed for numerous reasons. In this work we focus on data coming from a mixture of Gaussian distributions and we propose a method that preserves the distinctness of the clustering structure, although this structure is assumed to be yet unknown. The rationale behind the method is the following: (i) had one known the clusters (classes) within the data, one could facilitate further analysis and reduce space dimensionality by projecting the data to the Fisher’s linear subspace, which — by definition — best preserves the structure of the given classes; (ii) under some reasonable assumptions, this can be done, albeit approximately, without prior knowledge of the clusters (classes). In this paper, we show how this approach works. We present a method of preliminary data transformation that brings the directions of largest overall variability close to the directions of the best between-class separation. Hence, for the transformed data, simple PCA provides an approximation to the Fisher’s subspace. We show that the transformation preserves the distinctness of the unknown structure in the data to a great extent.
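The rationale in the abstract can be illustrated with a small numerical sketch. The abstract does not specify the paper's transformation, so the code below does not reproduce it. It only shows, on synthetic two-class Gaussian data, that plain PCA can miss the Fisher direction, and that after a suitable linear transformation PCA recovers it. The transformation used here is an oracle whitening by the within-class scatter, which requires the true labels; the paper's method is stated to work without them. All function names, the data dimensions, and the variances are assumptions chosen for illustration.

```python
import numpy as np

def scatter_matrices(X, labels):
    """Return the unnormalised within-class and between-class scatter matrices."""
    overall_mean = X.mean(axis=0)
    d = X.shape[1]
    S_w, S_b = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(labels):
        Xc = X[labels == c]
        class_mean = Xc.mean(axis=0)
        centered = Xc - class_mean
        S_w += centered.T @ centered
        diff = (class_mean - overall_mean)[:, None]
        S_b += len(Xc) * (diff @ diff.T)
    return S_w, S_b

def fisher_subspace(X, labels, dim):
    """Orthonormal basis of the top `dim` eigenvectors of S_w^{-1} S_b."""
    S_w, S_b = scatter_matrices(X, labels)
    evals, evecs = np.linalg.eig(np.linalg.solve(S_w, S_b))
    top = np.argsort(evals.real)[::-1][:dim]
    basis, _ = np.linalg.qr(evecs[:, top].real)
    return basis

def pca_subspace(X, dim):
    """Orthonormal basis of the top `dim` principal directions."""
    centered = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return Vt[:dim].T

def largest_principal_angle(A, B):
    """Largest principal angle, in degrees, between two orthonormal bases."""
    cosines = np.linalg.svd(A.T @ B, compute_uv=False)
    return np.degrees(np.arccos(np.clip(cosines.min(), -1.0, 1.0)))

# Synthetic data (assumed setup): two Gaussian classes separated along axis 0,
# while the largest within-class variance lies along axis 1.
rng = np.random.default_rng(0)
n, d = 500, 5
stds = np.array([1.0, 8.0, 1.0, 1.0, 1.0])
shift = np.array([2.0, 0.0, 0.0, 0.0, 0.0])
X = np.vstack([rng.normal(size=(n, d)) * stds + shift,
               rng.normal(size=(n, d)) * stds - shift])
labels = np.repeat([0, 1], n)

# On the raw data, the first principal direction follows the noisy axis.
angle_raw = largest_principal_angle(pca_subspace(X, 1),
                                    fisher_subspace(X, labels, 1))

# Oracle whitening (uses the labels, unlike the paper): Z = (X - mean) S_w^{-1/2}.
S_w, _ = scatter_matrices(X, labels)
w_vals, w_vecs = np.linalg.eigh(S_w)
W = w_vecs @ np.diag(w_vals ** -0.5) @ w_vecs.T
Z = (X - X.mean(axis=0)) @ W

# In the whitened space the total scatter is I + S_b, so PCA and Fisher agree.
angle_whitened = largest_principal_angle(pca_subspace(Z, 1),
                                         fisher_subspace(Z, labels, 1))

print(f"angle on raw data: {angle_raw:.1f} degrees")
print(f"angle after oracle whitening: {angle_whitened:.4f} degrees")
```

In this setup the two directions are nearly orthogonal on the raw data and essentially coincide after whitening, which is the effect the abstract attributes to its label-free preliminary transformation.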
Other
System-identifier
PX-57a9f7c9c2dce0b031a610dc
Citations
Number of works citing this work
No data
References
Number of works cited by this work
No data