An Examination of the Effect of Six Types of Error Perturbation on Fifteen Clustering Algorithms | Psychometrika | Cambridge Core

Abstract

An evaluation of several clustering methods was conducted. Artificial clusters which exhibited the properties of internal cohesion and external isolation were constructed. The true cluster structure was subsequently hidden by six types of error perturbation. The results indicated that the hierarchical methods were differentially sensitive to the type of error perturbation. In addition, generally poor recovery performance was obtained when random seed points were used to start the K-means algorithms. However, two alternative starting procedures for the nonhierarchical methods produced greatly enhanced cluster recovery and were found to be robust with respect to all of the types of error examined.
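The kind of recovery experiment summarized above can be sketched in a few lines. The following is only an illustration under stated assumptions, not the study's procedure: the cluster generator, the single perturbation type (Gaussian noise added to every coordinate), the plain Lloyd-style K-means, and the use of the Rand index as the recovery measure are all choices made here for the example, and every function name is hypothetical.

```python
import random
from itertools import combinations


def make_clusters(centers, n_per_cluster, spread, rng):
    """Draw points around each center; return (points, true_labels)."""
    points, labels = [], []
    for k, center in enumerate(centers):
        for _ in range(n_per_cluster):
            points.append(tuple(x + rng.gauss(0, spread) for x in center))
            labels.append(k)
    return points, labels


def perturb(points, sigma, rng):
    """Assumed error type: add Gaussian noise to every coordinate."""
    return [tuple(x + rng.gauss(0, sigma) for x in p) for p in points]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, seeds, max_iter=100):
    """Lloyd-style K-means started from the given seed points."""
    centroids = list(seeds)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def rand_index(a, b):
    """Fraction of point pairs on which two partitions agree."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


rng = random.Random(0)
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]  # cohesive, well-isolated clusters
points, truth = make_clusters(centers, 20, 0.5, rng)
noisy = perturb(points, 0.3, rng)

random_seeds = rng.sample(noisy, len(centers))  # random seed points drawn from the data
recovered = kmeans(noisy, random_seeds)
print("recovery with random seeds:", rand_index(truth, recovered))
```

Passing a different `seeds` list to `kmeans` is enough to compare starting procedures on the same perturbed data, and repeating the run over many random draws gives a rough picture of how much recovery varies with the start.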

Information

Type

Original Paper

Copyright

Copyright © 1980 The Psychometric Society
