Improvement of new automatic differential fuzzy clustering using SVM classifier for microarray analysis

Article
Journal: EXPERT SYSTEMS WITH APPLICATIONS   Volume: 38, Issue: 12, Pages: 15122-15133
Indrajit Saha, Ujjwal Maulik, Sanghamitra Bandyopadhyay, Dariusz Plewczyński [1]
2011 English
Abstract (English)
In recent years, the problem of clustering microarray data has been gaining significant attention. However, most clustering methods attempt to find groups of genes where the number of clusters is known a priori. This fact motivated us to develop a new real-coded improved differential evolution based automatic fuzzy clustering algorithm, which automatically evolves the number of clusters as well as the proper partitioning of a gene expression data set. To improve the result further, the clustering method is integrated with a support vector machine (SVM), a well-known technique for supervised learning. A fraction of the gene expression data points, selected from different clusters based on their proximity to the respective centers, is used for training the SVM. The cluster assignments of the remaining gene expression data points are thereafter determined using the trained classifier. The performance of the proposed clustering technique has been demonstrated on five gene expression data sets by comparing it with differential evolution based automatic fuzzy clustering, variable length genetic algorithm based fuzzy clustering, and the well-known Fuzzy C-Means algorithm. A statistical significance test has been carried out to establish the statistical superiority of the proposed clustering approach. A biological significance test has also been carried out using a web-based gene annotation tool to show that the proposed method is able to produce biologically relevant clusters of genes. The processed data sets and the MATLAB version of the software are available at http://bio.icm.edu.pl/~darman/IDEAFC-SVM/.
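The refinement step described in the abstract (train a classifier on the points nearest each cluster center, then relabel the rest) can be illustrated with a minimal sketch. This is not the authors' MATLAB implementation. It assumes the cluster centers and initial labels are already available from some clustering step, handles only two clusters, and swaps in a plain linear SVM trained by stochastic subgradient descent on the hinge loss, since the abstract does not state the kernel or solver. The function names, the toy data, and the parameters `fraction`, `lr`, `lam`, and `epochs` are all hypothetical.

```python
import math

def pick_core_points(points, labels, centers, fraction):
    """Indices of the `fraction` of each cluster's points nearest its center."""
    chosen = []
    for k, center in enumerate(centers):
        members = [i for i, lab in enumerate(labels) if lab == k]
        members.sort(key=lambda i: math.dist(points[i], center))
        chosen.extend(members[:max(1, round(fraction * len(members)))])
    return chosen

def train_linear_svm(xs, ys, lr=0.01, lam=0.001, epochs=300):
    """Stochastic subgradient descent on the L2-regularised hinge loss.
    ys must be +1 or -1. Returns the weight vector and the bias."""
    w, b = [0.0] * len(xs[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            score = sum(wj * xj for wj, xj in zip(w, x)) + b
            w = [(1.0 - lr * lam) * wj for wj in w]  # shrink (regulariser)
            if y * score < 1.0:                      # margin violated
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b

def reassign(points, labels, centers, fraction=0.5):
    """Train on the core points, then relabel every remaining point."""
    core = pick_core_points(points, labels, centers, fraction)
    w, b = train_linear_svm([points[i] for i in core],
                            [1 if labels[i] == 1 else -1 for i in core])
    new_labels = list(labels)
    for i, x in enumerate(points):
        if i not in core:
            score = sum(wj * xj for wj, xj in zip(w, x)) + b
            new_labels[i] = 1 if score >= 0 else 0
    return core, new_labels

# Hypothetical toy data: two groups of 2-D points. Index 4 lies near the
# first group but starts out assigned to cluster 1.
points = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.3), (-0.9, -0.7), (1.0, 0.8),
          (4.0, 4.0), (4.1, 3.9), (3.9, 4.2), (3.0, 3.2), (5.0, 4.9)]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
centers = [(0.0, 0.0), (4.0, 4.0)]  # assumed output of the clustering step

core, new_labels = reassign(points, labels, centers)
print(sorted(core))
print(new_labels)
```

On this toy data the classifier is trained on the five points closest to the two centers, and the ambiguous point at index 4 is moved from cluster 1 to cluster 0. A real gene expression data set would need more than two clusters and probably a non-linear kernel, both of which are outside this sketch.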
References
  1. Alizadeh, A.A.& Eisen, M.B.& Davis, R.& Ma, C.& Lossos, I.& Rosenwald, A. et al., "Distinct types of diffuse large b-cell lymphomas identified by gene expression profiling", Nature, vol. 403, 2000, p.503-511
  2. Bandyopadhyay, S., "An efficient technique for superfamily classification of amino acid sequences: Feature extraction, fuzzy clustering and prototype selection", Fuzzy Sets and Systems, vol. 152, 2005, p.5-16
  3. Bandyopadhyay, S.& Maulik, U., "Nonparametric genetic clustering: Comparison of validity indices", IEEE Transactions on Systems, Man, and Cybernetics, Part C, vol. 31, 1, 2001, p.120-125
  4. Bandyopadhyay, S.& Maulik, U.& Wang, J.T., "Analysis of biological data: A soft computing approach", 2007
  5. Bandyopadhyay, S.& Pal, S.K., "Pixel classification using variable string genetic algorithms with chromosome differentiation", IEEE Transactions on Geoscience and Remote Sensing, vol. 39, 2, 2001, p.303-308
  6. Bandyopadhyay, S.& Saha, S.& Maulik, U.& Deb, K., "A simulated annealing based multi-objective optimization algorithm: AMOSA", IEEE Transactions on Evolutionary Computation, vol. 12, 3, 2008, p.269-283
  7. Bezdek, J.C., "Pattern recognition with fuzzy objective function algorithms", 1981
  8. Bezdek, J.C.& Pal, N.R., "Some new indexes of cluster validity", IEEE Transactions on Systems, Man and Cybernetics, vol. 28, 3, 1998, p.301-315
  9. Bras Silva, H.& Brito, P.& Pinto da Costa, J., "A partitional clustering algorithm validated by a clustering tendency index based on graph theory", Pattern Recognition, vol. 39, 5, 2006, p.776-788
  10. Burges, C.J.C., "A tutorial on support vector machines for pattern recognition", Data Mining and Knowledge Discovery, vol. 2, 1998, p.121-167
  11. Cho, R.J.& Campbell, M.J.& Winzeler, E.A.& Steinmetz, L.& Conway, A.& Wodicka, L. et al., "A genome-wide transcriptional analysis of the mitotic cell cycle", Molecular Cell, vol. 2, 1998, p.65-73
  12. Chou, H.C.& Su, M.C.& Lai, E., "A new cluster validity measure and its application to image compression", Pattern Analysis and Applications, vol. 7, 2004, p.205-220
  13. Chu, S.& DeRisi, J.& Eisen, M.& Mulholland, J.& Botstein, D.& Brown, P.O. et al., "The transcriptional program of sporulation in budding yeast", Science, vol. 282, 1998, p.699-705
  14. Domany, E., "Cluster analysis of gene expression data", Journal of Statistical Physics, vol. 110, 3–6, 2003, p.1117-1139
  15. Eisen, M.B.& Spellman, P.T.& Brown, P.O.& Botstein, D., "Cluster analysis and display of genome-wide expression patterns", Proceedings of the National Academy of Sciences, 1998, p.14863-14868
  16. Everitt, B.S., "Cluster analysis", 1993, 3rd ed.
  17. Gath, I.& Geva, A., "Unsupervised optimal fuzzy clustering", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, 1989, p.773-781
  18. Groll, L.& Jakel, J., "A new convergence proof of fuzzy c-means", IEEE Transactions on Fuzzy Systems, vol. 13, 2005, p.717-720
  19. Hartigan, J.A., "Clustering algorithms", 1975
  20. Hollander, M.& Wolfe, D.A., "Nonparametric statistical methods", 1999, 2nd ed.
  21. Iyer, V.R.& Eisen, M.B.& Ross, D.T.& Schuler, G.& Moore, T.& Lee, J. et al., "The transcriptional program in the response of the human fibroblasts to serum", Science, vol. 283, 1999, p.83-87
  22. Jain, A.K.& Dubes, R.C., "Algorithms for clustering data", 1988
  23. Kim, S.Y.& Lee, J.W.& Bae, J.S., "Effect of data normalization on fuzzy clustering of DNA microarray data", BMC Bioinformatics, vol. 7, 134, 2006
  24. Kim, M.& Ramakrishna, R., "New indices for cluster validity assessment", Pattern Recognition Letters, vol. 26, 15, 2005, p.2353-2363
  25. Krishnapuram, R.& Freg, C.P., "Fitting an unknown number of lines and planes to image data through compatible cluster merging", Pattern Recognition, vol. 25, 4, 1992, p.433-439
  26. Lockhart, D.J.& Winzeler, E.A., "Genomics, gene expression and DNA arrays", Nature, vol. 405, 2000, p.827-836
  27. Maulik, U.& Bandyopadhyay, S., "Performance evaluation of some clustering algorithms and validity indices", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, 12, 2002, p.1650-1654
  28. Maulik, U.& Bandyopadhyay, S., "Fuzzy partitioning using a real-coded variable-length genetic algorithm for pixel classification", IEEE Transactions on Geoscience and Remote Sensing, vol. 41, 5, 2003, p.1075-1081
  29. Maulik, U.& Bandyopadhyay, S.& Saha, I., "Integrating clustering and supervised learning for categorical data analysis", IEEE Transactions on Systems, Man and Cybernetics Part-A, 2009
  30. Maulik, U.& Saha, I., "Modified differential evolution based fuzzy clustering for pixel classification in remote sensing imagery", Pattern Recognition, vol. 42, 9, 2009, p.2135-2149
  31. Omran, M.& Engelbrecht, A.& Salman, A., "Differential evolution methods for unsupervised image classification", Proceedings of IEEE international conference on evolutionary computation, vol. 2, 2005, p.966-973
  32. Pal, N.R.& Bezdek, J.C., "On cluster validity for the Fuzzy C-Means model", IEEE Transactions on Fuzzy Systems, vol. 3, 1995, p.370-379
  33. Price, K.& Storn, R.& Lampinen, J., "Differential evolution – A practical approach to global optimization", 2005
  34. Reymond, P.& Weber, H.& Damond, M.& Farmer, E.E., "Differential gene expression in response to mechanical wounding and insect feeding in Arabidopsis", Plant Cell, vol. 12, 2000, p.707-720
  35. Rousseeuw, P.J., "Silhouettes: A graphical aid to the interpretation and validation of cluster analysis", Journal of Computational and Applied Mathematics, vol. 20, 1987, p.53-65
  36. Shannon, W.& Culverhouse, R.& Duncan, J., "Analyzing microarray data using cluster analysis", Pharmacogenomics, vol. 4, 1, 2003, p.41-51
  37. Sharan, R.& Maron-Katz, A.& Shamir, R., "Click and expander: A system for clustering and visualizing gene expression data", Bioinformatics, vol. 19, 2003, p.1787-1799
  38. Storn, R.& Price, K., "Differential evolution – A simple and efficient adaptive scheme for global optimization over continuous spaces", Technical Report TR-95-012, International Computer Science Institute, Berkeley, 1995
  39. Storn, R.& Price, K., "Differential evolution – A simple and efficient heuristic strategy for global optimization over continuous spaces", Journal of Global Optimization, vol. 11, 1997, p.341-359
  40. Tavazoie, S.& Hughes, J.& Campbell, M.& Cho, R.& Church, G., "Systematic determination of genetic network architecture", Nature Genetics, vol. 22, 1999, p.281-285
  41. Vapnik, V., "Statistical learning theory", 1998
  42. Wang, W.& Zhang, Y., "On fuzzy cluster validity indices", Fuzzy Sets and Systems, vol. 158, 19, 2007, p.2095-2117
  43. Wen, X.& Fuhrman, S.& Michaels, G.S.& Carr, D.B.& Smith, S.& Barker, J.L. et al., "Large-scale temporal gene expression mapping of central nervous system development", Proceedings of the National Academy of Sciences, vol. 95, 1998, p.334-339
  44. Xie, X.L.& Beni, G., "A validity measure for fuzzy clustering", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, 1991, p.841-847
  45. Xu, Y.& Olman, V.& Xu, D., "Minimum spanning trees for gene expression data clustering", Genome Informatics, vol. 12, 2001, p.24-33