Towards a Bayesian Statistical Model for the Classification of the Causes of Data Loss
Abstract
Given the critical nature of communications in computational Grids, it is important to develop efficient, intelligent, and adaptive communication mechanisms. An important milestone on this path is the development of classification mechanisms that can distinguish between the various causes of data loss in cluster and Grid environments. The idea is to use the classification mechanism to determine whether data loss is caused by contention within the network or whether the cause lies outside of the network domain. If the cause lies outside of the network domain, then it is not necessary to trigger aggressive congestion-control mechanisms. Thus, the goal is to operate the data transfer at the highest possible rate by backing off aggressively only when the data loss is classified as network related. In this paper, we investigate one promising approach to developing such classification mechanisms, based on the analysis of the patterns of packet loss and the application of Bayesian statistics.
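As a rough illustration of the idea in the abstract, the sketch below applies Bayes' rule to decide whether an observed loss pattern is more likely due to network contention or to some other cause, and picks a back-off policy accordingly. It is not the authors' model: the function names, the prior, the two likelihood values, and the 0.5 decision threshold are all assumptions chosen for the example.

```python
# Illustrative only: the prior, likelihoods, and threshold are made-up
# numbers, not values taken from the paper.

def posterior_network(prior_network, lik_network, lik_other):
    """Return P(network contention | observed loss pattern) via Bayes' rule."""
    evidence = prior_network * lik_network + (1.0 - prior_network) * lik_other
    if evidence == 0.0:
        raise ValueError("observation has zero probability under both causes")
    return prior_network * lik_network / evidence


def choose_backoff(p_network, threshold=0.5):
    """Back off aggressively only when the loss is classified as network related."""
    return "aggressive" if p_network >= threshold else "mild"


# Hypothetical observation: the loss pattern is four times as likely under
# network contention (0.8) as under a non-network cause (0.2).
p = posterior_network(prior_network=0.3, lik_network=0.8, lik_other=0.2)
print(round(p, 3), choose_backoff(p))
```

In a real system the likelihoods would have to be estimated from measured packet-loss patterns; here they are simply fixed constants so that the decision rule is easy to follow.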
Author information
Authors and Affiliations
- Department of Computer Science, University of Maine, Orono, Maine 04429
  Phillip M. Dickens
- Department of Mathematics and Computer Science, Longwood University, Farmville, Virginia
  Jeffery Peden
Editor information
Editors and Affiliations
- Department of Computer Science, St. Francis Xavier University, Antigonish, Canada
  Laurence T. Yang
- School of Computer Science/Welsh eScience Centre, Cardiff University, UK
  Omer F. Rana
- Dipartimento di Ingegneria dell’Informazione, Second University of Naples, Real Casa dell’Annunziata, via Roma 29, 81031 Aversa (CE), Italy
  Beniamino Di Martino
- Computer Science Department, University of Tennessee, Knoxville, TN 37996-3450, USA
  Jack Dongarra
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dickens, P.M., Peden, J. (2005). Towards a Bayesian Statistical Model for the Classification of the Causes of Data Loss. In: Yang, L.T., Rana, O.F., Di Martino, B., Dongarra, J. (eds) High Performance Computing and Communications. HPCC 2005. Lecture Notes in Computer Science, vol 3726. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11557654_86
- DOI: https://doi.org/10.1007/11557654_86
- Publisher Name: Springer, Berlin, Heidelberg
- Print ISBN: 978-3-540-29031-5
- Online ISBN: 978-3-540-32079-1
- eBook Packages: Computer Science, Computer Science (R0), Springer Nature Proceedings Computer Science