Towards a Bayesian Statistical Model for the Classification of the Causes of Data Loss

Abstract

Given the critical nature of communications in computational Grids, it is important to develop efficient, intelligent, and adaptive communication mechanisms. An important milestone on this path is the development of classification mechanisms that can distinguish between the various causes of data loss in cluster and Grid environments. The idea is to use the classification mechanism to determine whether data loss is caused by contention within the network or whether the cause lies outside of the network domain. If the cause lies outside of the network domain, it is not necessary to trigger aggressive congestion-control mechanisms. The goal is thus to operate the data transfer at the highest possible rate, backing off aggressively only when the data loss is classified as network related. In this paper, we investigate one promising approach to developing such classification mechanisms, based on the analysis of the patterns of packet loss and the application of Bayesian statistics.
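
The decision rule described in the abstract can be illustrated with a minimal sketch of Bayes' rule. This is not the authors' model: the function names, the two-cause simplification (network contention versus any other cause), the likelihood values, and the 0.5 threshold are all assumptions introduced here for illustration only.

```python
def posterior_network(prior_network, lik_network, lik_other):
    """Return P(network contention | observed loss pattern) via Bayes' rule.

    prior_network: assumed prior probability that a loss is network related.
    lik_network:   assumed P(observed pattern | network contention).
    lik_other:     assumed P(observed pattern | cause outside the network).
    """
    evidence = lik_network * prior_network + lik_other * (1.0 - prior_network)
    if evidence == 0.0:
        raise ValueError("observed pattern has zero probability under both causes")
    return lik_network * prior_network / evidence


def should_back_off(prior_network, lik_network, lik_other, threshold=0.5):
    """Back off aggressively only if the loss is classified as network related.

    The threshold is a hypothetical tuning parameter, not taken from the paper.
    """
    return posterior_network(prior_network, lik_network, lik_other) > threshold


if __name__ == "__main__":
    # Hypothetical numbers: the observed pattern is four times as likely
    # under network contention as under any other cause.
    print(posterior_network(0.5, 0.8, 0.2))
    print(should_back_off(0.5, 0.8, 0.2))
```

In this sketch, a sender would keep its transfer rate unchanged whenever `should_back_off` returns `False`, which mirrors the stated goal of reacting aggressively only to losses attributed to the network.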

Author information

Authors and Affiliations

  1. Department of Computer Science, University of Maine, Orono, Maine, 04429
    Phillip M. Dickens
  2. Department of Mathematics and Computer Science, Longwood University, Farmville, Virginia
    Jeffery Peden

Editor information

Editors and Affiliations

  1. Department of Computer Science, St. Francis Xavier University, Antigonish, Canada
    Laurence T. Yang
  2. School of Computer Science/Welsh eScience Centre, Cardiff University, UK
    Omer F. Rana
  3. Dipartimento di Ingegneria dell’Informazione, Second University of Naples, Real Casa dell’Annunziata, via Roma 29, 81031 Aversa (CE), Italy
    Beniamino Di Martino
  4. Computer Science Department, University of Tennessee, Knoxville, TN 37996-3450, USA
    Jack Dongarra

Rights and permissions

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dickens, P.M., Peden, J. (2005). Towards a Bayesian Statistical Model for the Classification of the Causes of Data Loss. In: Yang, L.T., Rana, O.F., Di Martino, B., Dongarra, J. (eds) High Performance Computing and Communications. HPCC 2005. Lecture Notes in Computer Science, vol 3726. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11557654_86
