Towards a cloud-based framework for online and integrated event detection

Event monitoring and observability for industrial systems on Azure cloud

2021

Cloud computing is a paradigm shift transforming data processing, communications and storage. It offers a cost-effective method that facilitates real-time data collection, storage and exchange by providing services such as compute, processing, storage and networking. This allows customers and enterprises to access any data or application from anywhere in the world over an Internet connection. This computation and availability model is a good fit for critical industrial applications such as power generation. The number and variety of deployed sensors are increasing, requiring the large-scale storage, networking, and processing available on cloud platforms. We developed an Azure-based application that generates a motion-triggered event notification and provides access to the live video stream of the event location. The event is also recorded in a database and an email alert is sent to the subscribed operator. The implementation shows the importance of such systems for industrial applications requiring timely access to information.

Pattern Recognition and Event Detection on IoT Data-streams

2022

Big data streams are possibly one of the most essential underlying notions. However, data streams are often challenging to handle owing to their rapid pace and limited information lifetime. It is difficult to collect and communicate stream samples while storing, transmitting and computing a function across the whole stream or even a large segment of it. In answer to this research issue, many streaming-specific solutions have been developed. Stream techniques assume a limited capacity of one or more resources, such as computing power and memory, as well as time or accuracy limits. Reservoir sampling algorithms choose and store results that are probabilistically significant. The key research goal of this work is a weighted random sampling approach that uses a generalised sampling algorithmic framework to detect unique events. Briefly, a gradually developed estimate of the joint stream distribution across all feasible components keeps k stream elements judged representative of the full stream. Once estimate confidence is high, k samples are chosen evenly. The complexity is O(min(k, n − k)), where n is the number of items inspected. Because events are usually considered outliers, it is sufficient to extract element patterns and push them to an alternate version of k-means as proposed here. The suggested technique calculates the sum of squared errors (SSE) for each cluster, and this is used not only as a measure of convergence, but also as a quantification and an indirect assessment of the element distribution's approximation accuracy. This clustering enables the detection of outliers in the stream based on their distance from the usual event centroids. The findings reveal that weighted sampling and res-means outperform typical approaches for stream event identification. Detected events are shown as knowledge graphs, along with typical clusters of events.
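
To make the ideas in this abstract concrete, here is a minimal Python sketch under stated assumptions. It is not the authors' algorithm: the sampler is the standard A-Res weighted reservoir scheme of Efraimidis and Spirakis, swapped in for illustration, and the centroid-distance check uses fixed, already-known centroids rather than the paper's res-means variant. The function names, the weights and the threshold are all hypothetical.

```python
import heapq
import math
import random
from typing import Iterable, List, Sequence, Tuple

Point = Tuple[float, ...]


def weighted_reservoir_sample(
    stream: Iterable[Tuple[object, float]], k: int, rng: random.Random
) -> List[object]:
    """Keep k items from a stream of (item, weight) pairs in one pass.

    A-Res: each item gets the key u ** (1 / weight) with u uniform in [0, 1),
    and the k items with the largest keys are retained.
    """
    heap: List[Tuple[float, int, object]] = []  # min-heap ordered by key
    for idx, (item, weight) in enumerate(stream):
        if weight <= 0:
            continue  # assumption: non-positive weights are never sampled
        key = rng.random() ** (1.0 / weight)
        entry = (key, idx, item)  # idx breaks ties so items are never compared
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif key > heap[0][0]:
            heapq.heapreplace(heap, entry)
    return [item for _, _, item in heap]


def sse(points: Sequence[Point], centroid: Point) -> float:
    """Sum of squared Euclidean distances from the points to one centroid."""
    return sum(math.dist(p, centroid) ** 2 for p in points)


def flag_outliers(
    points: Sequence[Point], centroids: Sequence[Point], threshold: float
) -> List[Point]:
    """Return points farther than `threshold` from their nearest centroid."""
    return [
        p for p in points
        if min(math.dist(p, c) for c in centroids) > threshold
    ]
```

In this sketch the reservoir holds at most k items at any time, so memory stays bounded regardless of stream length, and a candidate event is any sampled point that lies far from every centroid of the usual clusters.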

Real-time event management in cloud environments

International Journal of High Performance Computing and Networking, 2015

Many applications, especially those implementing multimedia streaming, fall within the context of real-time systems in which only small deviations from timing constraints are allowed. The advancements in distributed computing have made it possible to follow a service-oriented approach, taking advantage of the benefits this provides. Cloud computing is a paradigm that aims to transform computer, storage and network resources into a utility. As more applications are deployed on cloud environments, one of the main requirements is real-time event management during application execution. In this paper, we present two mechanisms that aim at addressing this requirement: an adaptable two-layer monitoring mechanism which generates appropriate events and an event processing framework to consume these events. We evaluate the effectiveness of these mechanisms through a set of experiments on a large-scale multi-cloud facility. The latter poses challenges with respect to time-constrained execution of applications, since the aforementioned mechanisms need to collect and analyse information from geographically distributed sites, and trigger scaling decisions during the real-time application execution. The experimentation outcomes demonstrate the value of the presented mechanisms, both in terms of efficient scalability and with respect to the introduced overhead on the infrastructure.

Distributed Complex Event Processing in Multiclouds

2018

In the last few years, the generation of vast amounts of heterogeneous data with different velocity and veracity, and the requirement to process them, have significantly challenged the computational capacity and efficiency of modern infrastructural resources. The propagation of Big Data among different processing and storage architectures has amplified the need for adequate and cost-efficient infrastructures to host them. An overabundance of cloud service offerings is currently available and is being rapidly adopted by small and medium enterprises because of its many benefits over traditional computing models. However, at the same time the Big Data computing requirements pose new research challenges that question the adoption of single cloud provider resources. We discuss the emerging data-intensive applications that necessitate the wide adoption of multicloud deployment models, in order to use all the advantages of cloud computing. A key tool for managing such multicloud appl...

Event detection from time series data

Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '99, 1999

In the past few years there has been increased interest in using data-mining techniques to extract interesting patterns from time series data generated by sensors monitoring temporally varying phenomena.

Recent Advancements in Event Processing

ACM Computing Surveys, 2019

Event processing (EP) is a data processing technology that conducts online processing of event information. In this survey, we summarize the latest cutting-edge work done on EP from both industrial and academic research community viewpoints. We divide the entire field of EP into three subareas: EP system architectures, EP use cases, and EP open research topics. We then dive deep into the details of each subarea. We investigate the system architecture characteristics of novel EP platforms, such as Apache Storm, Apache Spark, and Apache Flink. We found significant advancements made in novel application areas, such as the Internet of Things; streaming machine learning (ML); and processing of complex data types such as text, video data streams, and graphs. Furthermore, there has been a significant body of contributions on event ordering, system scalability, development of EP languages and exploration of the use of heterogeneous devices for EP, which we investigate in the latter half o...

Analysis Cloud - Running Sensor Data Analysis Programs on a Cloud Computing Infrastructure

International Conference on Cloud Computing and Services Science, 2013

Sensors have been used for many years to gather information about their environment. The number of sensors connected to the internet is increasing, which has led to a growing demand for data transport and storage capacity. In addition, ever more emphasis is put on processing the data to detect anomalous situations and to identify trends. This paper presents a sensor data analysis platform that executes statistical analysis programs on a cloud computing infrastructure. Compared to existing batch and stream processing platforms, it adds the notion of simulated time, i.e. time that differs from the actual, current time. Moreover, it adds the ability to dynamically schedule the analysis programs based on a single timestamp, a recurring schedule, or the sensor data itself.
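
The notion of simulated time described in this abstract can be illustrated with a small Python sketch. This is only an assumption about how such a scheduler might look, not the platform's actual design: the `Job` and `SimulatedScheduler` names are hypothetical, time is a plain float, and only the single-timestamp and recurring-schedule triggers are covered (data-driven triggering is left out).

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(order=True)
class Job:
    run_at: float                       # simulated time at which to run
    seq: int                            # insertion order, used to break ties
    action: Callable[[float], None] = field(compare=False)
    interval: Optional[float] = field(default=None, compare=False)


class SimulatedScheduler:
    """Runs jobs against a clock that the caller advances explicitly."""

    def __init__(self, start: float = 0.0) -> None:
        self.now = start
        self._queue: List[Job] = []
        self._seq = 0

    def schedule(
        self,
        run_at: float,
        action: Callable[[float], None],
        interval: Optional[float] = None,
    ) -> None:
        """Schedule a one-off job, or a recurring one if interval is given."""
        if interval is not None and interval <= 0:
            raise ValueError("interval must be positive")
        heapq.heappush(self._queue, Job(run_at, self._seq, action, interval))
        self._seq += 1

    def advance_to(self, t: float) -> None:
        """Move simulated time forward to t, running every job due by then."""
        while self._queue and self._queue[0].run_at <= t:
            job = heapq.heappop(self._queue)
            self.now = job.run_at
            job.action(self.now)
            if job.interval is not None:
                self.schedule(job.run_at + job.interval, job.action, job.interval)
        self.now = t
```

Because the clock only moves when `advance_to` is called, the same analysis jobs can be replayed over historical sensor data at any speed, which is one plausible reading of time that differs from the actual, current time.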

A Complete Software Stack for IoT Time-Series Analysis that Combines Semantics and Machine Learning—Lessons Learned from the Dyversify Project

Applied Sciences, 2021

Companies are increasingly gathering and analyzing time-series data, driven by the rising number of IoT devices. Many works in the literature describe analysis systems built using either data-driven or semantic (knowledge-driven) techniques. However, few if any works describe hybrid combinations of the two. Dyversify, a collaborative project between industry and academia, investigated how event and anomaly detection can be performed on time-series data in such a hybrid setting. We built a proof-of-concept analysis platform, using a microservice architecture to ensure scalability and fault tolerance. The platform comprises time-series ingestion, long-term storage, data semantification, event detection using data-driven and semantic techniques, dynamic visualization, and user feedback. In this work, we describe the system architecture of this hybrid analysis platform and give an overview of the different components and their interactions. As such, the main contribution of this work is...

A Novel Technique for Long-Term Anomaly Detection in the Cloud

High availability and performance of a web service are key, amongst other factors, to the overall user experience (which in turn directly impacts the bottom line). Exogenic and/or endogenic factors often give rise to anomalies that make maintaining high availability and delivering high performance very challenging. Although there exists a large body of prior research in anomaly detection, existing techniques are not suitable for detecting long-term anomalies owing to a predominant underlying trend component in the time series data.

Incident detection for cloud environments

Security and privacy concerns hinder a broad adoption of cloud computing in industry. In this paper we identify cloud-specific security risks and introduce the cloud incident detection system Security Audit as a Service (SAaaS). SAaaS is built on autonomous distributed agents feeding a complex event processing engine, informing about a cloud's security state. In addition to technical monitoring factors, such as the number of open network connections, business process flows can be modelled to detect customer-overlapping security incidents. When attacks are identified, actions can be defined to protect the cloud service assets. As the contribution of this paper, we provide a high-level design of the SAaaS architecture and a first prototype of a virtual machine agent. We show how an incident detection system for a cloud environment should be designed to address cloud-specific security problems.