Scalable Clustering Using Graphics Processors (original) (raw)

Abstract

We present new algorithms for scalable clustering using graphics processors. Our basic approach is based on k-means. By changing the order of determining object labels, and exploiting the high computational power and pipeline of graphics processing units (GPUs) for distance computing and comparison, we speed up the k-means algorithm substantially. We introduce two strategies for retrieving data from the GPU, taking into account the low bandwidth from the GPU back to the main memory. We also extend our GPU-based approach to data stream clustering. We implement our algorithms in a PC with a Pentium IV 3.4G CPU and a NVIDIA GeForce 6800 GT graphics card. Our comprehensive performance study shows that the common GPU in desktop computers could be an efficient co-processor of CPU in traditional and data stream clustering.

Preview

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. Aggarwal, C.C., Han, J., Wang, J., Yu, P.S.: A framework for clustering evolving data streams. In: Proc. of VLDB (2003)
    Google Scholar
  2. Babcock, B., Datar, M., Motwani, R., O’Callaghan, L.: Maintaining variance and k-medians over data stream windows. In: Proc. of PODS (2003)
    Google Scholar
  3. Baciu, G., Wong, S., Sun, H.: Recode: An image-based collision detection algorithm. Visualization and Computer Animation 10(4), 181–192 (1999)
    Article Google Scholar
  4. Ester, M., Kriegel, H.-P., Sander, J., Xu, X.: A density-based algorithm fordiscovering clusters in large spatial databases with noise. In: Proc. of KDD (1996)
    Google Scholar
  5. Govindaraju, N.K., Lloyd, B., Wang, W., Lin, M., et al.: Fast computation of database operations using graphics processors. In: Proc. of SIGMOD (2004)
    Google Scholar
  6. Govindaraju, N.K., Raghuvanshi, N., Manocha, D.: Fast and approximate stream mining of quantiles and frequencies using graphics processors. In: Proc. Of SIGMOD (2005)
    Google Scholar
  7. Guha, S., Meyerson, A., Mishra, N., Motwani, R., O’Callaghan, L.: Clustering data streams:theory and practice. In: IEEE TKDE, pp. 515–528 (2003)
    Google Scholar
  8. Guha, S., Rastogi, R., Shim, K.: Cure: An efficient clustering algorithm for large databases. In: Proc. of SIGMOD, pp. 73–84 (1998)
    Google Scholar
  9. Hall, J.D., Hart, J.C.: Gpu acceleration of iterative clustering. In: Proc. Of SIGGRAPH poster (2004)
    Google Scholar
  10. Hoff III, K.E., Keyser, J., Lin, M., Manocha, D., Culver, T.: Fast computation of generalized voronoi diagrams using graphics hardware. In: Proc. of SIGGRAPH, pp. 277–286 (1999)
    Google Scholar
  11. Jain, A., Dubes, R.: Algorithms for clustering data. New Jersey (1998)
    Google Scholar
  12. Larsen, E.S., McAllister, D.K.: Fast matrix multiplies using graphics hardware. In: Proc. of IEEE Supercomputing (2001)
    Google Scholar
  13. Sun, C., Agrawal, D., Abbadi, A.E.: Hardware acceleration for spatial selections and joins. In: Proc. of SIGMOD, pp. 455–466 (2003)
    Google Scholar
  14. Venkatasubramanian, S.: The graphics card as a stream computer. In: SIGMOD Workshop on Management and Processing of Data Streams (2003)
    Google Scholar
  15. Thompson, C.J., Hahn, S., Oskin, M.: Using modern graphics architectures for general-purpose computing: A framework and analysis. In: Proc. of IEEE/ACM International Symposium on Microarchitectures, pp. 306–317 (2002)
    Google Scholar
  16. Zhang, T., Ramakrishnan, R., Livny, M.: Birch: An efficient data clustering method for very large databases. In: Proc. of SIGMOD, pp. 103–114 (1996)
    Google Scholar

Download references

Author information

Authors and Affiliations

  1. Dept. of Computer Science and Engineering, Fudan University, China
    Feng Cao & Aoying Zhou
  2. School of Computing, National University of Singapore, Singapore
    Anthony K. H. Tung

Authors

  1. Feng Cao
  2. Anthony K. H. Tung
  3. Aoying Zhou

Editor information

Editors and Affiliations

  1. Chinese University of Hong Kong, Hong Kong, China
    Jeffrey Xu Yu
  2. Institute of Industrial Science, The University of Tokyo, 4-6-1 Komaba, Meguro-ku, 153-8505, Tokyo, Japan
    Masaru Kitsuregawa
  3. Department of Computing, Hong Kong Polytechnic University, Hong Kong
    Hong Va Leong

Rights and permissions

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cao, F., Tung, A.K.H., Zhou, A. (2006). Scalable Clustering Using Graphics Processors. In: Yu, J.X., Kitsuregawa, M., Leong, H.V. (eds) Advances in Web-Age Information Management. WAIM 2006. Lecture Notes in Computer Science, vol 4016. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11775300\_32

Download citation

Publish with us