CCRP: Converging Credit-Based and Reactive Protocols in Datacenters
Abstract
As link speeds have grown steadily from 10 Gbps to 100 Gbps, high-speed data center networks (DCNs) require more efficient congestion management. Proactive transports, especially credit-based congestion control, have therefore drawn much attention because of their fast convergence, near-zero queueing, and low latency. In real deployments, however, it is hard to guarantee that one protocol is deployed on every host at the same time. Thus, when credit-based protocols are deployed into DCNs incrementally, the network enters a multi-protocol state and faces the following fundamental challenges: (i) unfairness, (ii) non-convergence, and (iii) high buffer occupancy. In this paper, we propose a new protocol, called CCRP, that aims to converge credit-based and reactive protocols in data centers. Targeting the most widely deployed protocol in DCNs, i.e., DCQCN, which is based on explicit congestion notification (ECN), CCRP leverages forward ECN to detect network congestion in the data queue and optimizes the feedback control of credit-based transports. Our experimental results show that this design addresses unfair link allocation and converges rapidly with reactive protocols. Furthermore, CCRP achieves high utilization and low buffer occupancy at the same time.
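The abstract names the mechanism (forward ECN marks on data packets feeding back into credit control) but does not give the control law. The sketch below is only an illustration of that general idea, not the authors' algorithm: it assumes a receiver that counts ECN-marked data packets per observation window and adjusts the rate at which it issues credits, borrowing a DCTCP-style smoothed marking fraction. The class name `EcnCreditPacer`, its parameters, and all default values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EcnCreditPacer:
    """Hypothetical receiver-side credit pacer (illustrative, not CCRP itself)."""

    link_rate_gbps: float            # assumed bottleneck link capacity
    min_rate_gbps: float = 0.1       # assumed floor so the flow never stalls
    gain: float = 1.0 / 16           # EWMA gain, borrowed from DCTCP
    increase_gbps: float = 1.0       # assumed additive-increase step
    alpha: float = 0.0               # smoothed fraction of ECN-marked packets
    rate_gbps: float = field(init=False)  # current credit-issuing rate

    def __post_init__(self) -> None:
        # Start at line rate, as credit-based transports typically do.
        self.rate_gbps = self.link_rate_gbps

    def on_window(self, received: int, ecn_marked: int) -> float:
        """Update the credit rate from one window of received data packets."""
        if received <= 0:
            return self.rate_gbps  # nothing observed, keep the current rate
        fraction = ecn_marked / received
        self.alpha = (1 - self.gain) * self.alpha + self.gain * fraction
        if ecn_marked > 0:
            # Data queue is building: cut the credit rate in proportion
            # to the smoothed congestion estimate.
            self.rate_gbps *= 1 - self.alpha / 2
        else:
            # No marks seen: probe additively back toward link capacity.
            self.rate_gbps += self.increase_gbps
        # Clamp to [min_rate_gbps, link_rate_gbps].
        self.rate_gbps = max(
            self.min_rate_gbps, min(self.link_rate_gbps, self.rate_gbps)
        )
        return self.rate_gbps
```

Under these assumptions, a receiver would call `on_window(received, ecn_marked)` once per measurement interval and pace outgoing credits at the returned rate. Scaling the decrease by the marking fraction, rather than halving on any mark, is one plausible way to let credit-paced flows yield bandwidth gradually when they share a queue with ECN-reactive traffic such as DCQCN.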
Acknowledgements
We would like to thank the anonymous reviewers for their insightful comments. We gratefully acknowledge the members of the Tianhe interconnect group at NUDT for many inspiring conversations. This work was supported by the National Key R&D Program of China under Grant No. 2018YFB0204300.
Author information
Authors and Affiliations
- National University of Defense Technology, Changsha, 410073, China
Yang Bai, Dinghuang Hu, Dezun Dong, Shan Huang & Xiangke Liao
Corresponding author
Correspondence to Dezun Dong.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Bai, Y., Hu, D., Dong, D. et al. CCRP: Converging Credit-Based and Reactive Protocols in Datacenters. Int J Parallel Prog 49, 685–699 (2021). https://doi.org/10.1007/s10766-021-00698-y
- Received: 10 November 2020
- Accepted: 27 February 2021
- Published: 21 May 2021
- Version of record: 21 May 2021
- Issue date: October 2021
- DOI: https://doi.org/10.1007/s10766-021-00698-y