Automatic clustering and boundary detection algorithm based on adaptive influence function
2008, Pattern Recognition
Clustering has become a classical problem in databases, data warehouses, pattern recognition, artificial intelligence, and computer graphics. Applications in large spatial databases, point-based graphics, and similar areas give rise to new requirements for clustering algorithms: automatic discovery of arbitrarily shaped and/or non-homogeneous clusters, discovery of clusters located in low-dimensional hyperspace, and detection of cluster boundaries. To meet these requirements, a new clustering and boundary detection algorithm, ADACLUS, is proposed. It is based on a specially constructed adaptive influence function, and therefore discovers clusters of arbitrary shapes and diverse densities, adequately captures cluster boundaries, and is robust to noise. Normally ADACLUS performs clustering fully automatically, without any preliminary parameter settings. It also lets the user optionally set three parameters with clear meanings to adjust the clustering for special applications. The algorithm was tested on various two-dimensional data sets, where it proved effective in discovering clusters of complex shapes and diverse densities. The linear complexity of ADACLUS gives it an advantage over some well-known algorithms.