Statistical Verification of Process Conformance Based on Log Equality Test
Related papers
Evaluating Conformance Measures in Process Mining using Conformance Propositions
Process mining sheds new light on the relationship between process models and real-life processes. Process discovery can be used to learn process models from event logs. Conformance checking is concerned with quantifying the quality of a business process model in relation to event data that was logged during the execution of the business process. There exist different categories of conformance measures. Recall, also called fitness, is concerned with quantifying how much of the behavior that was observed in the event log fits the process model. Precision is concerned with quantifying how much behavior a process model allows for that was never observed in the event log. Generalization is concerned with quantifying how well a process model generalizes to behavior that is possible in the business process but was never observed in the event log. Many recall, precision, and generalization measures have been developed throughout the years, but they are often defined in an ad-hoc manner without formally defining the desired properties up front. To address these problems, we formulate 21 conformance propositions and we use these propositions to evaluate current and existing conformance measures. The goal is to trigger a discussion by clearly formulating the challenges and requirements (rather than proposing new measures). Additionally, this paper serves as an overview of the conformance checking measures that are available in the process mining area.
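As a toy illustration of these recall and precision notions, the sketch below compares an event log (a multiset of traces) against a model's set of allowed traces. This is a deliberately simplified, purely trace-based view assuming a finite model language; the function names and frequency weighting are illustrative, not any of the measures the paper evaluates.

```python
from collections import Counter

def trace_recall(log, model_traces):
    """Fraction of trace occurrences in the log that fit the model.

    `log` is a list of traces (tuples of activity names), `model_traces`
    the set of traces the model allows. Published fitness measures
    typically use alignments rather than exact trace membership.
    """
    counts = Counter(log)
    total = sum(counts.values())
    fitting = sum(n for trace, n in counts.items() if trace in model_traces)
    return fitting / total if total else 1.0

def trace_precision(log, model_traces):
    """Fraction of model traces that were actually observed in the log.

    Only meaningful for finite model languages; real precision measures
    (e.g. escaping edges) handle loops and infinite languages.
    """
    if not model_traces:
        return 1.0
    return len(model_traces & set(log)) / len(model_traces)

log = [("a", "b", "c")] * 8 + [("a", "c")] * 2
model = {("a", "b", "c"), ("a", "b", "b", "c")}
print(trace_recall(log, model))     # 8 of 10 trace occurrences fit: 0.8
print(trace_precision(log, model))  # 1 of 2 model traces observed: 0.5
```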
Evaluating Conformance Measures in Process Mining using Conformance Propositions (Extended Version)
Entropia: A Family of Entropy-Based Conformance Checking Measures for Process Mining
ICPM Demos, 2020
This paper presents a command-line tool, called Entropia, that implements a family of conformance checking measures for process mining founded on the notion of entropy from information theory. The measures allow quantifying classical non-deterministic and stochastic precision and recall quality criteria for process models automatically discovered from traces executed by IT-systems and recorded in their event logs. A process model has "good" precision with respect to the log it was discovered from if it does not encode many traces that are not part of the log, and has "good" recall if it encodes most of the traces from the log. By definition, the measures possess useful properties and can often be computed quickly.
A Comprehensive Process Similarity Measure Based on Models and Logs
IEEE Access, 2019
Process similarity measures play an important role in business process management and are widely regarded as a versatile means of supporting the effective reuse of process models. Although many studies have addressed different notions of process similarity, most of them are not precise enough, as they compare processes with respect to either model structure features or model behavior features alone. To address this problem, we propose to measure business process similarity by considering both process models and process logs. Process models are pre-defined descriptions of business processes, while process logs can be considered an objective observation of actual process execution behavior. Combining both helps to better characterize business processes. More specifically, we present two effective frameworks together with four novel approaches. The first framework constructs a weighted business process graph (WBPG) from the process model and the process log, and then computes the similarity of two corresponding WBPGs using a weighted graph edit distance measure and a weighted node adjacent relation similarity measure. The second framework measures the similarity of process logs and the similarity of process models separately, and then merges the results. Finally, through experimental evaluation, we demonstrate the effectiveness and applicability of the proposed approaches by comparing them with the state of the art.
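The "merge the results" framework can be sketched as a convex combination of a model-based score and a log-based score. In the sketch below, a Jaccard similarity over directly-follows pairs stands in for the log-behavior side, and the weight `alpha` is a hypothetical tuning parameter; neither is the paper's actual WBPG-based measure.

```python
from collections import Counter

def directly_follows(log):
    """Multiset of directly-follows activity pairs observed in a log."""
    pairs = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            pairs[(a, b)] += 1
    return pairs

def df_similarity(log1, log2):
    """Jaccard similarity of directly-follows relations -- a toy stand-in
    for a weighted node adjacent relation similarity."""
    p1, p2 = set(directly_follows(log1)), set(directly_follows(log2))
    union = p1 | p2
    return len(p1 & p2) / len(union) if union else 1.0

def combined_similarity(model_sim, log_sim, alpha=0.5):
    """Merge a model-based and a log-based similarity; `alpha` is a
    hypothetical weight between the two views."""
    assert 0.0 <= alpha <= 1.0
    return alpha * model_sim + (1.0 - alpha) * log_sim

log1 = [("a", "b", "c")]
log2 = [("a", "b"), ("b", "c"), ("a", "c")]
print(round(df_similarity(log1, log2), 3))  # shares 2 of 3 pairs: 0.667
```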
Replaying History on Process Models for Conformance Checking and Performance Analysis
Process mining techniques use event data to discover process models, to check the conformance of predefined process models, and to extend such models with information about bottlenecks, decisions, and resource usage. These techniques are driven by observed events rather than handmade models. Event logs are used to learn and enrich process models. By replaying history on the model, it is possible to establish a precise relationship between events and model elements. This relationship can be used to check conformance and to analyze performance. For example, it is possible to diagnose deviations from the modeled behavior. The severity of each deviation can be quantified. Moreover, the relationship established during replay and the timestamps in the event log can be combined to show bottlenecks. These examples illustrate the importance of maintaining a proper alignment between event log and process model. Therefore, we elaborate on the realization of such alignments and their application to conformance checking and performance analysis.
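The idea of quantifying the severity of each deviation can be approximated, for intuition only, as the cost of the cheapest alignment of a trace to any model trace. The sketch below uses plain edit distance over a finite trace set as a toy proxy; real alignment computation works on the model's executable semantics (e.g. a Petri net) and supports infinite languages.

```python
def edit_distance(t1, t2):
    """Levenshtein distance between two traces (tuples of activities)."""
    m, n = len(t1), len(t2)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete all of t1's prefix
    for j in range(n + 1):
        d[0][j] = j  # insert all of t2's prefix
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if t1[i - 1] == t2[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # match / substitution
    return d[m][n]

def deviation_cost(trace, model_traces):
    """Cost of the cheapest 'alignment' of `trace` to any model trace --
    a toy proxy for alignment-based conformance checking."""
    return min(edit_distance(trace, t) for t in model_traces)

# Skipping activity "b" costs one edit against the only model trace.
print(deviation_cost(("a", "c"), {("a", "b", "c")}))  # 1
```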
Process Equivalence: Comparing Two Process Models Based on Observed Behavior
Lecture Notes in Computer Science, 2006
In various application domains there is a desire to compare process models, e.g., to relate an organization-specific process model to a reference model, to find a web service matching some desired service description, or to compare some normative process model with a process model discovered using process mining techniques. Although many researchers have worked on different notions of equivalence (e.g., trace equivalence, bisimulation, branching bisimulation, etc.), most of the existing notions are not very useful in this context. First of all, most equivalence notions result in a binary answer (i.e., two processes are equivalent or not). This is not very helpful, because, in real-life applications, one needs to differentiate between slightly different models and completely different models. Second, not all parts of a process model are equally important. There may be parts of the process model that are rarely activated while other parts are executed for most process instances. Clearly, these should be considered differently. To address these problems, this paper proposes a completely new way of comparing process models. Rather than directly comparing two models, the process models are compared with respect to some typical behavior. This way we are able to avoid the two problems. Although the results are presented in the context of Petri nets, the approach can be applied to any process modeling language with executable semantics.
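A minimal sketch of comparing two models "with respect to some typical behavior": instead of a binary equivalence check, score the fraction of observed traces on which the two models agree (both accept or both reject). The function and the unweighted agreement score are hypothetical simplifications of the paper's replay-based approach on Petri nets.

```python
def behavioural_similarity(log, lang1, lang2):
    """Graded comparison of two models via observed behavior: the fraction
    of log traces that both models classify the same way. Returns 1.0 for
    behaviourally indistinguishable models on this log, 0.0 for models
    that disagree on every observed trace."""
    if not log:
        return 1.0
    agree = sum(1 for t in log if (t in lang1) == (t in lang2))
    return agree / len(log)

log = [("a", "b", "c"), ("a", "c"), ("a", "b"), ("a", "b", "b", "c")]
lang1 = {("a", "b", "c"), ("a", "c")}
lang2 = {("a", "b", "c")}
# The models disagree only on ("a", "c"): 3 of 4 traces agree.
print(behavioural_similarity(log, lang1, lang2))  # 0.75
```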
Scalable Process Discovery and Conformance Checking
Considerable amounts of data, including process event data, are collected and stored by organisations nowadays. Discovering a process model from recorded process event data and verification of the quality of discovered models are important steps in process mining. Many discovery techniques have been proposed, but none combines scalability with quality guarantees. We would like such techniques to handle billions of events or thousands of activities, to produce sound models (without deadlocks and other anomalies), and to guarantee that the underlying process can be rediscovered when sufficient information is available. In this paper, we introduce a framework for process discovery that ensures these properties while passing over the log only once and we introduce three algorithms using the framework. To measure the quality of discovered models on these large logs, we introduce a model-model and model-log comparison framework that applies a divide-and-conquer strategy to measure recall, fitness and precision. We experimentally show that these discovery and measuring techniques sacrifice little compared to other algorithms, while gaining the ability to cope with event logs of 100,000,000 traces and processes of 10,000 activities.
Conformance Checking of Partially Matching Processes: An Entropy-Based Approach
Elsevier Information Systems, 2021
Conformance checking is an area of process mining that studies methods for measuring and characterizing commonalities and discrepancies between processes recorded in event logs of IT-systems and designed processes, either captured in explicit process models or implicitly induced by information systems. Applications of conformance checking range from measuring the quality of models automatically discovered from event logs, via regulatory process compliance, to automated process enhancement. Recently, process mining researchers initiated a discussion on the desired properties the conformance measures should possess. This discussion acknowledges that existing measures often do not satisfy the desired properties. Besides, there is a lack of understanding by the process mining community of the desired properties for conformance measures that address partially matching processes, i.e., processes that are not identical but differ in some process steps. In this article, we extend the recently introduced precision and recall conformance measures between an event log and process model that are based on the concept of entropy from information theory to account for partially matching processes. We discuss the properties the presented extended measures inherit from the original measures as well as properties for partially matching processes the new measures satisfy. All the presented conformance measures have been implemented in a publicly available tool. We present qualitative and quantitative evaluations based on our implementation that show the feasibility of using the proposed measures in industrial settings.
An Entropic Relevance Measure for Stochastic Conformance Checking in Process Mining
International Conference on Process Mining, 2020
Given an event log as a collection of recorded real-world process traces, process mining aims to automatically construct a process model that is both simple and provides a useful explanation of the traces. Conformance checking techniques are then employed to characterize and quantify commonalities and discrepancies between the log's traces and the candidate models. Recent approaches to conformance checking acknowledge that the elements being compared are inherently stochastic (for example, some traces occur frequently and others infrequently) and seek to incorporate this knowledge in their analyses. Here we present an entropic relevance measure for stochastic conformance checking, computed as the average number of bits required to compress each of the log's traces, based on the structure and information about relative likelihoods provided by the model. The measure penalizes traces from the event log not captured by the model and traces described by the model but absent in the event log, thus addressing both precision and recall quality criteria at the same time. We further show that entropic relevance is computable in time linear in the size of the log, and provide evaluation outcomes that demonstrate the feasibility of using the new approach in industrial settings.
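The "average number of bits to compress each trace" idea can be sketched as a frequency-weighted average of the code lengths -log2 p(trace) the model assigns. Below, the `eps` floor for non-captured traces is an illustrative assumption; the published measure instead uses a dedicated background coding for traces the model does not describe.

```python
import math
from collections import Counter

def entropic_relevance(log, model_prob, eps=1e-12):
    """Average bits needed to encode a trace of the log under the model's
    trace distribution: sum over distinct traces of
    (freq / |L|) * -log2 p_M(trace). Traces with (near) zero model
    probability incur a large penalty via `eps` -- a simplification of
    the paper's background coding for non-captured traces."""
    counts = Counter(log)
    total = sum(counts.values())
    bits = 0.0
    for trace, n in counts.items():
        p = max(model_prob.get(trace, 0.0), eps)
        bits += (n / total) * -math.log2(p)
    return bits

# A model assigning probability 0.5 to each of two traces needs exactly
# one bit per trace when the log contains only those traces.
model_prob = {("a", "b"): 0.5, ("a", "c"): 0.5}
log = [("a", "b")] * 3 + [("a", "c")]
print(entropic_relevance(log, model_prob))  # 1.0
```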
Conformance Checking Techniques of Process Mining: A Survey
Recent Trends in Intensive Computing, 2021
Conformance Checking (CC) techniques quantify the deviation between modelled behavior and actual execution behavior. Most organizations run Process-Aware Information Systems that record the details of system execution, and they maintain a process model that prescribes how the process should be executed. The key intention of Process Mining is to extract facts from the event log and use them for the analysis, verification, improvement, and redesign of a process. Researchers have proposed various CC techniques for specific applications and process models. This paper provides a detailed study of the key concepts and contributions of Process Mining and of how they help in achieving business goals. The current challenges and opportunities in Process Mining are also discussed. The survey organizes the CC techniques proposed by researchers along key dimensions such as quality parameters, perspective, algorithm type, tools, and achievements.