Managing latency in edge–cloud environment
Related papers
Operating Latency Sensitive Applications on Public Serverless Edge Cloud Platforms
IEEE Internet of Things Journal
Cloud-native programming and serverless architectures provide a novel way of software development and operation. A new generation of applications can be realized with features never seen before, while the burden on developers and operators is reduced significantly. However, latency-sensitive applications, such as various distributed IoT services, generally do not fit in well with these new concepts and today's platforms. In this article, we adapt the cloud-native approach and related operating techniques for latency-sensitive IoT applications operated on public serverless platforms. We argue that solely adding cloud resources to the edge is not enough; other mechanisms and operation layers are required to achieve the desired level of quality. Our contribution is threefold. First, we propose a novel system on top of a public serverless edge cloud platform that can dynamically optimize and deploy the microservice-based software layout based on live performance measurements. We add two control loops and the corresponding mechanisms responsible for online reoptimization at different timescales: the first addresses steady-state operation, while the second provides fast latency control by directly reconfiguring the serverless runtime environments. Second, we apply our general concepts to one of today's most widely used and versatile public cloud platforms, Amazon's AWS, and its edge extension for IoT applications, called Greengrass. Third, we characterize the main operation phases and evaluate the overall performance of the system. We analyze the performance characteristics of the two control loops and investigate different implementation options.
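As a rough illustration of the two-timescale pattern this abstract describes, the sketch below (hypothetical names and thresholds, not the paper's actual code) fires a fast corrective action on every SLO-violating latency sample and a slow layout re-optimization on a longer, fixed cadence:

```python
def two_timescale_controller(latency_samples_ms, slo_ms=100.0, slow_every=5):
    """Count the actions each control loop would take over a trace of
    latency samples: the fast loop reacts immediately to SLO violations,
    the slow loop re-optimizes the layout on a longer timescale."""
    fast_actions = 0   # direct serverless-runtime reconfigurations
    slow_actions = 0   # steady-state layout re-optimizations
    for i, latency in enumerate(latency_samples_ms, start=1):
        if latency > slo_ms:
            fast_actions += 1
        if i % slow_every == 0:
            slow_actions += 1
    return fast_actions, slow_actions
```

For example, `two_timescale_controller([50, 150, 90, 220, 80])` reports two fast reactions and one slow re-optimization over the five samples.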
Achieving Predictable and Low End-to-End Latency for a Network of Smart Services
2018 IEEE Global Communications Conference (GLOBECOM), 2018
To remain competitive in the field of manufacturing today, companies must constantly improve the automation loops within their production plants. This can be done by augmenting the automation applications with "smart services" such as supervisory-control applications or machine-learning inference algorithms. The downside is that these smart services are often hosted in a cloud infrastructure, while the automation applications require a low and predictable end-to-end latency. However, with 5G technology it will become possible to establish a low-latency connection to the cloud infrastructure, and with proper control of the capacity of the smart services it will become possible to achieve a low and predictable end-to-end latency for the augmented automation applications. In this work we address the challenge of controlling the capacity of the smart services in a way that achieves a low and predictable end-to-end latency. We do this by deriving a mathematical framework that models a network of smart services hosting several automation applications. We propose a generalized AutoSAC (automatic service- and admission controller) that builds on previous work by the authors [1], [2]. In the previous work the system was only capable of handling a single set of smart services, with a single application hosted on top of it. With the contributions of this paper it becomes possible to host multiple applications on top of a larger, more general network of smart services.
Towards Virtualization-Agnostic Latency for Time-Sensitive Applications
29th International Conference on Real-Time Networks and Systems
As time-sensitive applications are deployed spanning multiple edge clouds, delivering consistent and scalable latency performance across different virtualized hosts becomes increasingly challenging. In contrast to traditional real-time systems requiring deadline guarantees for all jobs, the latency service-level objectives of cloud applications are usually defined in terms of tail latency, i.e., the latency of a certain percentage of the jobs should be below a given threshold. This means that neither dedicating entire physical CPU cores, nor combining virtualization with deadline-based techniques such as compositional real-time scheduling, can meet the needs of these applications in a resource-efficient manner. To address this limitation, and to simplify the management of edge clouds for latency-sensitive applications, we introduce virtualization-agnostic latency (VAL) as an essential property to maintain consistent tail latency assurances across different virtualized hosts. VAL requires that an application experience similar latency distributions on a shared host as on a dedicated one. Towards achieving VAL in edge clouds, this paper presents a virtualization-agnostic scheduling (VAS) framework for time-sensitive applications sharing CPUs with other applications. We show both theoretically and experimentally that VAS can effectively deliver VAL on shared hosts. For periodic and sporadic tasks, we establish theoretical guarantees that VAS can achieve the same task schedule on a shared CPU as on a full CPU dedicated to time-sensitive services. Moreover, this can be achieved by allocating the minimal CPU bandwidth to time-sensitive services, thereby avoiding wasting CPU resources.
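The tail-latency objective defined in this abstract ("the latency of a certain percentage of the jobs should be below a given threshold") can be checked with a nearest-rank percentile; the helper below is an illustrative sketch, not code from the paper:

```python
import math

def tail_latency(latencies_ms, percentile=99.0):
    """Nearest-rank percentile: the smallest observed latency such that
    at least `percentile` percent of samples are at or below it."""
    ordered = sorted(latencies_ms)
    rank = math.ceil(percentile / 100.0 * len(ordered))
    return ordered[rank - 1]

def meets_tail_slo(latencies_ms, percentile=99.0, threshold_ms=10.0):
    """True if the tail-latency SLO holds on this sample of jobs."""
    return tail_latency(latencies_ms, percentile) <= threshold_ms
```

For 100 jobs with latencies 1–100 ms, the 99th-percentile latency is 99 ms, so an SLO of "p99 below 99 ms" holds while "p99 below 98 ms" does not.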
Challenges in real-time virtualization and predictable cloud computing
Journal of Systems Architecture, 2014
Cloud computing and virtualization technology have revolutionized general-purpose computing applications in the past decade. The cloud paradigm offers advantages through reduction of operation costs, server consolidation, flexible system configuration and elastic resource provisioning. However, despite the success of cloud computing for general-purpose computing, existing cloud computing and virtualization technology face tremendous challenges in supporting emerging soft real-time applications such as online video streaming, cloud-based gaming, and telecommunication management. These applications demand real-time performance in open, shared and virtualized computing environments. This paper identifies the technical challenges in supporting real-time applications in the cloud, surveys recent advancement in real-time virtualization and cloud computing technology, and offers research directions to enable cloud-based real-time applications in the future.
Dynamic Resource Management Across Cloud-Edge Resources for Performance-Sensitive Applications
2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), 2017
A large number of modern applications and systems are cloud-hosted; however, limitations in performance assurances from the cloud, and the longer and often unpredictable end-to-end network latencies between the end user and the cloud, can be detrimental to the response time requirements of the applications, specifically those that have stringent Quality of Service (QoS) requirements. Although edge resources, such as cloudlets, may alleviate some of the latency concerns, there is a general lack of mechanisms that can dynamically manage resources across the cloud-edge spectrum. To address these gaps, this research proposes Dynamic Data Driven Cloud and Edge Systems (D3CES). It uses measurement data collected from adaptively instrumenting the cloud and edge resources to learn and enhance models of the distributed resource pool. In turn, the framework uses the learned models in a feedback loop to make effective resource management decisions to host applications and deliver their QoS properties. D3CES is being evaluated in the context of a variety of cyber physical systems, such as smart city, online games, and augmented reality applications.
International Journal of All Research Education and Scientific Methods , 2023
The proliferation of edge computing devices and the omnipresent cloud services have given rise to the concept of an Edge-to-Cloud Continuum. This research-oriented descriptive article delves into the development of a seamless continuum between edge computing devices and cloud services, emphasizing the optimization of resource allocation and data flow for real-time applications. The article explores the principles of edge computing, the challenges of integration with cloud services, and the transformative impact on various industries. Additionally, it includes keywords, relevant studies, and a comprehensive list of references.
Latency Optimization in Large-Scale Cloud-Sensor Systems
Open J. Internet Things, 2017
With the advent of the Internet of Things and smart city applications, massive cyber-physical interactions between the applications hosted in the cloud and a huge number of external physical sensors and devices are inevitable. This raises two main challenges: cloud cost affordability as the smart city grows (referred to as economical cloud scalability) and the energy-efficient operation of sensor hardware. We have developed Cloud-Edge-Beneath (CEB), a multi-tier architecture for large-scale IoT deployments, embodying distributed optimizations, which address these two major challenges. In this article, we summarize our prior work on CEB to set context for presenting a third major challenge for cloud-sensor systems, which is latency. Prolonged latency can potentially arise in servicing requests from cloud applications, especially given our primary focus on optimizing energy and cloud scalability. Latency, however, is an important factor to optimize for real-time and cyber-...
A cloud middleware for assuring performance and high availability of soft real-time applications
Journal of Systems Architecture, 2014
Applications are increasingly being deployed in the cloud due to benefits stemming from economy of scale, scalability, flexibility and utility-based pricing model. Although most cloud-based applications have hitherto been enterprise-style, there is an emerging need for hosting real-time streaming applications in the cloud that demand both high availability and low latency. Contemporary cloud computing research has seldom focused on solutions that provide both high availability and real-time assurance to these applications in a way that also optimizes resource consumption in data centers, which is a key consideration for cloud providers. This paper makes three contributions to address this dual challenge. First, it describes an architecture for a fault-tolerant framework that can be used to automatically deploy replicas of virtual machines in data centers in a way that optimizes resources while assuring availability and responsiveness. Second, it describes the design of a pluggable framework within the fault-tolerant architecture that enables plugging in different placement algorithms for VM replica deployment. Third, it
Journal of Parallel and Distributed Computing, 2018
The steep rise of Internet of Things (IoT) applications, along with the limitations of Cloud Computing in addressing all IoT requirements, has given rise to a new distributed computing paradigm called Fog Computing, which aims to process data at the edge of the network. With the help of Fog Computing, the transmission latency, monetary spending and application loss caused by Cloud Computing can be effectively reduced. However, as the processing capacity of fog nodes is more limited than that of cloud platforms, running all applications indiscriminately on these nodes can cause some QoS requirements to be violated. Therefore, there is an important decision to make as to where to execute each application in order to produce a cost-effective solution and fully meet application requirements. In particular, we are interested in the tradeoff in terms of average response time, average cost and average number of application losses. In this paper, we present an online algorithm, called unit-slot optimization, based on the technique of Lyapunov optimization. The unit-slot optimization is a quantified near-optimal online solution to balance the three-way tradeoff among average response time, average cost and average number of application losses. We evaluate the performance of the unit-slot optimization algorithm in a number of experiments. The experimental results not only match the theoretical analyses, but
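Lyapunov optimization of the kind this abstract invokes typically yields a per-slot drift-plus-penalty rule. The sketch below is an illustrative generic form, not the paper's exact unit-slot formulation: each slot, pick the action minimizing V·cost minus the queue-weighted service, then update the virtual queues.

```python
def unit_slot_decision(queues, actions, V=10.0):
    """Drift-plus-penalty step: `queues` holds virtual-queue backlogs,
    each action is (cost, service) where service[i] is the backlog it
    drains from queue i; a larger V weights cost over queue stability."""
    def score(action):
        cost, service = action
        return V * cost - sum(q * s for q, s in zip(queues, service))
    return min(actions, key=score)

def update_queues(queues, arrivals, service):
    """Standard virtual-queue update: Q <- max(Q - service, 0) + arrival."""
    return [max(q - s, 0.0) + a for q, a, s in zip(queues, arrivals, service)]
```

With a large backlog on one queue, the rule prefers the action that drains it even at higher cost; as V grows, cheaper actions win instead.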