Mining taxonomies of process models

Abstractions in Process Mining: A Taxonomy of Patterns

Lecture Notes in Computer Science, 2009

Process mining refers to the extraction of process models from event logs. Real-life processes tend to be less structured and more flexible. Traditional process mining algorithms have problems dealing with such unstructured processes and generate spaghetti-like process models that are hard to comprehend. One reason for such a result can be attributed to constructing process models from raw traces without due pre-processing. In an event log, there can be instances where the system is subjected to similar execution patterns/behavior. Discovery of common patterns of invocation of activities in traces (beyond the immediate succession relation) can help in improving the discovery of process models and can assist in defining the conceptual relationship between the tasks/activities. In this paper, we characterize and explore the manifestation of commonly used process model constructs in the event log and adopt pattern definitions that capture these manifestations, and propose a means to form abstractions over these patterns. We also propose an iterative method of transformation of traces which can be applied as a pre-processing step for most of today's process mining techniques. The proposed approaches are shown to identify promising patterns and conceptually-valid abstractions on a real-life log. The patterns discussed in this paper have multiple applications such as trace clustering, fault diagnosis/anomaly detection besides being an enabler for hierarchical process discovery.
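The idea of finding recurring activity patterns and replacing them with an abstract activity can be sketched roughly as follows. This is a minimal illustration and not the paper's method: the abstract does not give the pattern definitions, so fixed-length contiguous subsequences (n-grams) stand in for them here, and the function names, the toy log, and the label `P1` are all hypothetical.

```python
from collections import Counter

def frequent_ngrams(traces, n, min_count):
    """Count contiguous activity subsequences of length n across all traces
    and keep those occurring at least min_count times (assumed criterion)."""
    counts = Counter()
    for trace in traces:
        for i in range(len(trace) - n + 1):
            counts[tuple(trace[i:i + n])] += 1
    return {gram: c for gram, c in counts.items() if c >= min_count}

def abstract_pattern(trace, pattern, label):
    """Replace every non-overlapping occurrence of pattern in trace by label."""
    out, i, n = [], 0, len(pattern)
    while i < len(trace):
        if tuple(trace[i:i + n]) == tuple(pattern):
            out.append(label)   # collapse the whole pattern into one abstract activity
            i += n
        else:
            out.append(trace[i])
            i += 1
    return out

# Toy log: each character is one activity (hypothetical data).
log = [list("abcdx"), list("abcey"), list("zabcd")]
patterns = frequent_ngrams(log, n=3, min_count=3)
abstracted = [abstract_pattern(t, ("a", "b", "c"), "P1") for t in log]
```

Running the two steps repeatedly, each time on the already abstracted traces, gives a rough picture of what an iterative trace transformation used as a pre-processing step could look like.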

Process Discovery from Event Data: Relating Models and Logs Through Abstractions

Event data are collected in logistics, manufacturing, finance, healthcare, customer relationship management, e-learning, e-government, and many other domains. The events found in these domains typically refer to activities executed by resources at particular times and for a particular case (i.e., a process instance). Process mining techniques are able to exploit such data. In this article, we focus on process discovery. However, process mining also includes conformance checking, performance analysis, decision mining, organizational mining, predictions, recommendations, etc. These techniques help to diagnose problems and improve processes. All process mining techniques involve both event data and process models. Therefore, a typical first step is to automatically learn a control-flow model from the event data. This is very challenging, but in recent years many powerful discovery techniques have been developed. It is not easy to compare these techniques since they use different representations and make different assumptions. Users often need to resort to trying different algorithms in an ad-hoc manner. Developers of new techniques are often trying to solve specific instances of a more general problem. Therefore, we aim to unify existing approaches by focusing on log and model abstractions. These abstractions link observed and modeled behavior: concrete behaviors recorded in event logs are related to possible behaviors represented by process models. Hence, such behavioral abstractions provide an "interface" between both. We discuss four discovery approaches involving three abstractions and different types of process models (Petri nets, block-structured models, and declarative models). The goal is to provide a comprehensive understanding of process discovery and show how to develop new techniques. Examples illustrate the different approaches and pointers to software are given.
The discussion on abstractions and process representations is also used to reflect on the gap between process mining literature and commercial process mining tools. This helps users to select an appropriate process discovery technique. Moreover, structuring the role of internal abstractions and representations helps to broaden the view and facilitates the creation of new discovery approaches.
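To make the notion of a log abstraction concrete, here is a minimal sketch of one widely used abstraction, the directly-follows graph. The abstract does not name the three abstractions the article covers, so treating this one as representative is an assumption; the `START` and `END` markers and the toy log are likewise illustrative.

```python
from collections import Counter

def directly_follows_graph(traces):
    """Count how often activity x is immediately followed by activity y,
    with artificial START/END markers around each trace."""
    dfg = Counter()
    for trace in traces:
        path = ["START", *trace, "END"]
        dfg.update(zip(path, path[1:]))  # consecutive pairs in the trace
    return dfg

# Toy event log: one list of activity names per case (hypothetical data).
log = [["a", "b", "c"], ["a", "c", "b"], ["a", "b", "c"]]
dfg = directly_follows_graph(log)
```

The abstraction is lossy: it keeps only pairwise succession counts, so different logs can yield the same graph. The same kind of abstraction can in principle be derived from the behavior a model allows, which is what lets it sit between log and model.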

Workflow mining: Discovering process models from event logs

IEEE Transactions on Knowledge and Data Engineering, 2003

Contemporary workflow management systems are driven by explicit process models, i.e., a completely specified workflow design is required in order to enact a given workflow process. Creating a workflow design is a complicated time-consuming process and, typically, there are discrepancies between the actual workflow processes and the processes as perceived by the management. Therefore, we have developed techniques for discovering workflow models. The starting point for such techniques is a so-called "workflow log" containing information about the workflow process as it is actually being executed. We present a new algorithm to extract a process model from such a log and represent it in terms of a Petri net. However, we will also demonstrate that it is not possible to discover arbitrary workflow processes. In this paper, we explore a class of workflow processes that can be discovered. We show that the α-algorithm can successfully mine any workflow represented by a so-called SWF-net.
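The α-algorithm starts from ordering relations derived from the log. The sketch below covers only that first step (directly-follows, causality, and parallelism between activities) and does not construct the Petri net; the function name and the toy log are illustrative assumptions.

```python
def footprint(traces):
    """Derive the basic ordering relations from a log of traces.
    follows:  x is directly followed by y in some trace
    causal:   x is followed by y, but y is never followed by x
    parallel: x and y follow each other in both orders"""
    follows = {(x, y) for t in traces for x, y in zip(t, t[1:])}
    causal = {(x, y) for (x, y) in follows if (y, x) not in follows}
    parallel = {(x, y) for (x, y) in follows if (y, x) in follows}
    return follows, causal, parallel

# Toy workflow log: each string is a trace, each character an activity.
log = ["abcd", "acbd", "aed"]
follows, causal, parallel = footprint(log)
```

On this log, b and c are classified as parallel because they occur in both orders, while e is a causal alternative between a and d.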

Mining Expressive Process Models by Clustering Workflow Traces

Lecture Notes in Computer Science, 2004

We propose a general framework for the process mining problem which subsumes the assumption of workflow schemas with local constraints only and is therefore applicable to more expressive specification languages, independently of the particular syntax adopted. In fact, we provide an effective technique for process mining based on the rather unexplored concept of clustering workflow executions, in which clusters of executions sharing the same structure and the same unexpected behavior (w.r.t. the local properties) are seen as a witness of the existence of global constraints. A framework for assessing the similarity between the original model and the discovered one is proposed, as well as some experimental results supporting the validity of our approach.

Discovering Multi-perspective Process Models: The Case of Loosely-Structured Processes

Lecture Notes in Business Information Processing, 2009

Process Mining techniques exploit the information stored in the execution log of a process in order to extract some high-level process model, which can be used for both analysis and design tasks. Most of these techniques focus on "structural" (control-flow oriented) aspects of the process, in that they only consider what elementary activities were executed and in which order. In this way, any other "non-structural" information usually kept in real log systems (e.g., activity executors, parameter values, and time-stamps) is completely disregarded, despite being a potential source of knowledge. In this paper, we overcome this limitation by proposing a novel approach for discovering process models, where the behavior of a process is characterized from both structural and non-structural viewpoints. In a nutshell, different variants of the process (classes) are recognized through a structural clustering approach, and represented with a collection of specific workflow models. Relevant correlations between these classes and non-structural properties are made explicit through a rule-based classification model, which can be exploited for both explanation and prediction purposes. Results on a real-life application scenario show that the discovered models are often very accurate and capture important knowledge on the process behavior.

Guided Process Discovery - A Pattern-based Approach

Process mining techniques analyze processes based on events stored in event logs. Yet, low-level events recorded by information systems may not directly match high-level activities that make sense to process stakeholders. This results in discovered process models that cannot be easily understood. To prevent such situations from happening, low-level events need to be translated into high-level activities that are recognizable by stakeholders. This paper proposes the Guided Process Discovery method (GPD). Low-level events are grouped based on behavioral activity patterns, which capture domain knowledge on the relation between high-level activities and low-level events. Events in the resulting abstracted event log correspond to instantiations of high-level activities. We validate process models discovered on the abstracted event log by checking conformance between the low-level event log and an expanded model in which the high-level activities are replaced by activity patterns. The method was tested using two real-life event logs. We show that the process models discovered with the GPD method are more comprehensible and can be used to answer process questions, whereas process models discovered using standard process discovery techniques do not provide the insights needed.

A systematic mapping study of process mining

Enterprise Information Systems, 2017

Web service mining and verification of properties: An approach based on event calculus 2006 C
IEEE Int. Conf. on Computer Supported Cooperative Work in Design (CSCWD) Yan, L.; Yuqiang, F. Design of an automatic workflow modeling method in cooperative WFMS 2006 C
Int. WS on Data Engineering Issues in E-Commerce and Services (DEECS) Gaaloul, W.; Baina, K.; Godart, C. A bottom-up workflow mining approach for workflow applications analysis 2006 C
Int. WS on Database and Expert Systems Applications (DEXA) Curia, R.; Gallucci, L.; Ruffolo, M. Knowledge management in health care: An architectural framework for clinical process management systems 2006 C
Eur. WS on Inductive Databases and Constraint Based Mining (EWIDCBM)

Discovering Expressive Process Models by Clustering Log Traces

IEEE Transactions on Knowledge and Data Engineering, 2006

Process mining techniques have recently received notable attention in the literature for their ability to assist in the (re)design of complex processes by automatically discovering models that explain the events registered in some log traces provided as input. Following this line of research, the paper investigates an extension of such basic approaches, where the identification of different variants for the process is explicitly accounted for, based on the clustering of log traces. Indeed, modeling each group of similar executions with a different schema allows us to single out "conformant" models, which, specifically, minimize the number of modeled enactments that are extraneous to the process semantics. Therefore, a novel process mining framework is introduced and some relevant computational issues are studied in depth. As finding an exact solution to such an enhanced process mining problem is proven to require high computational costs in most practical cases, a greedy approach is devised. This is founded on an iterative, hierarchical refinement of the process model, where, at each step, traces sharing similar behavior patterns are clustered together and equipped with a specialized schema. The algorithm guarantees that each refinement leads to an increasingly sound model, thus attaining a monotonic search. Experimental results support the validity of the approach with respect to both effectiveness and scalability.
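The general idea of grouping traces with similar behavior before mining one model per group can be illustrated with a simple sketch. This is a swapped-in technique, not the paper's hierarchical refinement algorithm: it compares traces by the Jaccard similarity of their directly-follows pairs and assigns each trace greedily to the first sufficiently similar cluster. The function names, the threshold, and the toy log are all assumptions.

```python
def df_pairs(trace):
    """Set of directly-follows pairs occurring in a single trace."""
    return set(zip(trace, trace[1:]))

def jaccard(a, b):
    """Jaccard similarity of two sets (defined as 1.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def greedy_cluster(traces, threshold):
    """Assign each trace to the first cluster whose accumulated profile is
    similar enough, otherwise open a new cluster. Order-dependent by design."""
    clusters = []  # each entry: {"profile": set of pairs, "traces": list}
    for trace in traces:
        pairs = df_pairs(trace)
        for c in clusters:
            if jaccard(pairs, c["profile"]) >= threshold:
                c["traces"].append(trace)
                c["profile"] |= pairs   # widen the cluster profile
                break
        else:
            clusters.append({"profile": set(pairs), "traces": [trace]})
    return [c["traces"] for c in clusters]

# Toy log: each string is a trace, each character an activity.
log = ["abcd", "abcd", "abd", "axyd", "axd"]
clusters = greedy_cluster(log, threshold=0.2)
```

Each resulting group could then be handed to a discovery algorithm separately, which is the step that yields one specialized schema per variant.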

Development of the Process Mining Discipline

It is exciting to see the spectacular developments in process mining since I started to work on this in the late 1990s. Many of the techniques we developed 15-20 years ago have become standard functionality in today's process mining tools. Therefore, it is good to view current and future developments in this historical context. This chapter starts with a brief summary of the history of process mining showing how ideas from academia got adopted in commercial tools. This provides the basis to talk about the expanding scope of process mining, both in terms of applications and in terms of functionalities supported. Despite the rapid development of the process mining discipline, there are still several challenges. Some of these challenges are new, but there are also several challenges that have been around for a while and still need to be addressed urgently. This requires the concerted action of process mining users, technology providers, and scientists.

Adoption of traditional process mining techniques

Process mining started in the late nineties when I had a sabbatical and was working for one year at the University of Colorado in Boulder (USA). Before, I was mostly focusing on concurrency theory, discrete event simulation, and workflow management. We had built our own simulation engines (e.g., ExSpect) and workflow management systems. Although our research was well-received and influential, I was disappointed by the average quality of process models and the impact process models had on reality. In both simulation studies and workflow implementations, the real processes often turned out to be very different from what was modeled by the people involved. As a result, workflow and simulation projects often failed. Therefore, I decided to focus on the analysis of processes through event data [1]. Around the turn of the century, we developed the first process discovery algorithms [2].
The Alpha algorithm was the first algorithm able to learn concurrent process models from event data and still provide formal guarantees. However, at the time, little event data were available and the assumptions made by the first algorithms were unrealistic. People working on data mining and machine learning were (and perhaps still are) not interested in process analysis. Therefore, it was not easy to convince other researchers to work on this. Nevertheless, for me, it was crystal clear that process mining would become a crucial ingredient of any process management or process improvement initiative. In the period that followed, I stopped working on the traditional business process management topics and fully focused on process mining. It is interesting to see that concepts such as conformance checking, organizational process mining, decision mining, token animation, time prediction, etc. were already developed and implemented 15 years ago [2]. These capabilities are still considered to be cutting-edge and not supported by most of the commercial process mining tools.