An Algorithm for Reducing Dimension and Size of Sample for Data Exploration Procedures
PBN-AR
Institution
Instytut Badań Systemowych Polskiej Akademii Nauk
Basic information
Main language of publication
EN
Journal
INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS AND COMPUTER SCIENCE
ISSN
1641-876X
EISSN
2083-8492
Publisher
Walter de Gruyter GmbH
DOI
Year of publication
2014
Issue number
1
Pages (from-to)
133-149
Volume number
24
DOI identifier
Number of sheets
1
Keywords
EN
dimension reduction
sample size reduction
linear transformation
simulated annealing
data mining
Abstracts
Language
EN
Content
The paper deals with reducing the dimension and size of a data set (random sample) for exploratory data analysis procedures. The algorithm investigated here is based on a linear transformation to a space of smaller dimension that preserves, as far as possible, the distances between individual elements. The elements of the transformation matrix are computed using the metaheuristic of parallel fast simulated annealing. Moreover, those data set elements whose location relative to the others has changed significantly are either eliminated or assigned reduced importance. The presented method can be applied universally across a wide range of data exploration problems, offering flexible customization, the possibility of use in a dynamic data environment, and performance comparable to or better than that of principal component analysis. Its positive features were verified in detail for the domain’s fundamental tasks of clustering, classification and detection of atypical elements (outliers).
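The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a plain, single-chain simulated annealing loop with Gaussian perturbations and a 1/i cooling schedule as a stand-in for the parallel fast simulated annealing named above, and the sum of squared distance differences as an assumed cost function. All function names and parameters (`anneal_projection`, `point_distortion`, `steps`, `t0`, `step_size`) are hypothetical.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def stress(X, A, D_orig):
    """Assumed cost: squared differences between original and projected distances."""
    D_proj = pairwise_distances(X @ A)
    iu = np.triu_indices(len(X), k=1)  # count each pair once
    return float(((D_orig[iu] - D_proj[iu]) ** 2).sum())

def anneal_projection(X, k, steps=2000, t0=1.0, step_size=0.1, seed=0):
    """Search for a d x k matrix A so that X @ A roughly preserves distances."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    D_orig = pairwise_distances(X)
    A = rng.normal(scale=1.0 / np.sqrt(k), size=(d, k))  # random starting matrix
    cost = stress(X, A, D_orig)
    best_A, best_cost = A.copy(), cost
    for i in range(1, steps + 1):
        T = t0 / i  # simple 1/i cooling (an assumption, not the paper's schedule)
        cand = A + step_size * rng.normal(size=A.shape)
        cand_cost = stress(X, cand, D_orig)
        delta = cand_cost - cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta < 0 or rng.random() < np.exp(-delta / T):
            A, cost = cand, cand_cost
            if cost < best_cost:
                best_A, best_cost = A.copy(), cost
    return best_A, best_cost

def point_distortion(X, A):
    """Per-element distortion: how much each element's distances to the others changed."""
    err = (pairwise_distances(X) - pairwise_distances(X @ A)) ** 2
    return err.sum(axis=1)
```

Elements with the largest `point_distortion` values would be the candidates for elimination or down-weighting; how the threshold or the weights are chosen is left open here, since the abstract does not specify it.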
Publication features
original-article
Other
System-identifier
24518
Citations
Number of works citing this work
No data
References
Number of works cited by this work
No data