Transformer-based performance prediction and proactive resource allocation for cloud-native microservices
