Handling of incomplete data sets using ICA and SOM in data mining
Abstract
Based on independent component analysis (ICA) and self-organizing maps (SOM), this paper proposes an ISOM-DH model for handling incomplete data in data mining. When the data are dependent and non-Gaussian, the model can make full use of the information in the given data to estimate the missing values, and it can visualize the handled high-dimensional data. Compared with the mixture of principal component analyzers (MPCA), the mean method, and the standard SOM-based fuzzy map model, the ISOM-DH model can be applied to more cases, which demonstrates its advantage. The correctness and reasonableness of the ISOM-DH model are also validated by the experiment carried out in this paper.
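For orientation, the mean method named above as a comparison baseline can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes the common reading of the mean method, in which each missing entry is replaced by the mean of the observed values in the same attribute (column). The function name `mean_impute` and the toy data are hypothetical.

```python
import math

def mean_impute(rows):
    """Fill each missing entry (NaN) with the mean of the observed
    values in the same column. Assumed baseline, not ISOM-DH itself."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if not math.isnan(r[j])]
        if not observed:
            # A column with no observed values has no mean to fall back on.
            raise ValueError(f"column {j} has no observed values")
        col_means.append(sum(observed) / len(observed))
    # Keep observed values; substitute the column mean for NaN entries.
    return [
        [col_means[j] if math.isnan(v) else v for j, v in enumerate(r)]
        for r in rows
    ]

# Toy data set with two missing entries (hypothetical values).
data = [
    [1.0, 2.0],
    [math.nan, 4.0],
    [3.0, math.nan],
]
completed = mean_impute(data)
```

Because this baseline uses only per-attribute averages, it ignores any dependence between attributes, which is the kind of information the abstract says the ISOM-DH model exploits.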
Author information
Authors and Affiliations
- Department of Applied Mathematics, Sun Yat-sen University, Guangzhou, 510275, China
Hongyi Peng & Siming Zhu
Corresponding author
Correspondence to Hongyi Peng.
About this article
Cite this article
Peng, H., Zhu, S. Handling of incomplete data sets using ICA and SOM in data mining. Neural Comput & Applic 16, 167–172 (2007). https://doi.org/10.1007/s00521-006-0058-6
- Received: 02 September 2005
- Accepted: 24 April 2006
- Published: 30 May 2006
- Issue date: February 2007
- DOI: https://doi.org/10.1007/s00521-006-0058-6