BPM2017 - Integrated Modeling and Verification of Processes and Data Part 1: Introduction
1. Integrated Modelling and Verification of Processes and Data Diego Calvanese Marco Montali {calvanese,montali}@inf.unibz.it Free University of Bozen-Bolzano BPM 2017
2. Marrying processes and data is extremely challenging… but is a must if we want to really understand how complex dynamic systems operate.
3. Two Questions • How to formally and conceptually account for the process+data interplay? • How to verify such BPMs? N.B.: modeling and verification go side-by-side. Business Turing Machines (BTMs)
6. Outline Part 1 • Introduction and motivation: why processes + data • A quick tour through the literature and integrated models Part 2 • The framework of Data-Centric Dynamic Systems • Verification results Part 3 • Connection to concrete integrated models and systems • Concluding remarks 6
7. Information Assets • Data: the main information source about the history of the domain of interest and the relevant aspects of the current state of affairs • Processes: how work is orchestrated in the domain of interest, so as to create value • Resources: humans and devices responsible for the execution of work units within a process 7
8. [Figure: BPM lifecycle. Labels: (re)design, configure/deploy, enact/monitor, adjust, diagnose/get reqs.; IT support, reality, (knowledge) workers, managers/analysts; models, data, 50% / 50%]
9. Is this Synergy Reflected by BP Methods and Models? Survey by Forrester [Karel et al, 2009]: lack of interaction between data and process experts. • BPM professionals: data are subsidiary to processes • Master data managers: data are the main driver for the company’s existence • 83/100 companies: no interaction at all between these two groups • This isolation propagates to models, languages and tools 9
10. Experience Dichotomy: Management [models] vs. Workers [reality]
11. Management Dichotomy: Business [decision making] vs. IT [infrastructure]
12. Expertise Dichotomy: Master Data Management vs. Business Process Management
14. Example: Order-To-Delivery 1. Customer PO 2. Order decomposition (Customer PO, line item, Material PO) 3. Selection and interaction with suppliers 4. Material assembly 5. Shipment
20. Observations • A complex process, where the company acts as an intermediate hub between customers and suppliers • Happy path 1) The customer issues a purchase order 2) The ordered material is obtained from suppliers 3) The material is shipped, possibly using different packages • One exceptional path (in general, there are many): 1) The customer cancels the order 2) A cancelation policy is applied to calculate a penalty 20
21. Conventional Data Modeling Focus: relevant entities, relations, static constraints [Figure: UML class diagram. Labels: Supplier, Manufacturing, Procurement/Supplier, Sales, Customer PO, Line Item, Work Order, Material PO, Material; association “spawns”; multiplicities *, *, 0..1] But… how do data evolve? Where can we find the “state” of a purchase order?
22. Conventional Process Modeling Focus: control-flow of activities in response to events [Figure: BPMN collaborative process] But… how do activities update data? What is the impact of canceling an order?
24. Do you like Spaghetti? [Figure: five activities (Decompose Customer PO, Manage Material POs, Assemble, Ship, Manage Cancelation), each annotated with Activities / Process / Data, wired to the data stores Customers, Suppliers & Catalogues, Customer POs, Work Orders, Material POs] IT integration: difficult to manage, understand, maintain
25. Too Late! • Where are the data? • Where shall we model relevant business rules? • Consider an order cancelation policy that needs to check which material has already been shipped in order to determine the customer penalty…
Too late to reconstruct the missing pieces: • Where is our data? Part is in the DBs, part is hidden in the process execution engine. • Where are the relevant business rules, and how are they modeled? At the DB level? Which DB? How to import the process data? (Also) in the business model? How to import data from the DBs?
[Figure: the UML data model (Data) next to the process fragment “Determine cancelation penalty”, “Notify penalty” (Process), with the process engine holding the process state]
Business rule: for each work order W, for each material PO M in W, if M has been shipped, add returnCost(M) to penalty
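The business rule on this slide can be sketched in a few lines of Python. This is only an illustration: the class names `MaterialPO` and `WorkOrder`, the `shipped` flag, and the `return_cost` field are assumptions standing in for the slide's "has been shipped" and `returnCost(M)`, and are not taken from any concrete system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MaterialPO:
    material: str
    shipped: bool        # assumed flag for "M has been shipped"
    return_cost: float   # assumed stand-in for returnCost(M)


@dataclass
class WorkOrder:
    material_pos: List[MaterialPO] = field(default_factory=list)


def cancelation_penalty(work_orders: List[WorkOrder]) -> float:
    """For each work order W, for each material PO M in W:
    if M has been shipped, add its return cost to the penalty."""
    penalty = 0.0
    for w in work_orders:
        for m in w.material_pos:
            if m.shipped:
                penalty += m.return_cost
    return penalty


orders = [
    WorkOrder([MaterialPO("frame", True, 10.0), MaterialPO("wheel", False, 7.0)]),
    WorkOrder([MaterialPO("saddle", True, 5.0)]),
]
print(cancelation_penalty(orders))
```

Note that the rule needs both the data model (work orders and their material POs) and the process state (what has been shipped), which is exactly the information that is scattered in the spaghetti setting.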
27. …There is Hope! [Timeline, 1998–2017, ranging from activity-centric to data-centric approaches] N.B.: these are “sparse” dots!!!
28. [Timeline, 1998–2017, ranging from activity-centric to data-centric approaches] • [BPM2010, Richardson]: BPM vs master data dichotomy • Data+Process integration key to: - assess value of processes and evaluate KPIs [Meyer et al, 2011] - aggregate relevant info, elicit business rules [ABDIS11, Dumas] • [Reichert, 2012]: “Process and data are just two sides of the same coin”
29. Before moving to exotic models…
30. How do contemporary activity-centric BPMSs account for the process-data interplay?
31. Example: BizAgi (~) [Process: Review Request, Fill Reimbursement, Review Reimbursement; end states: Rejected, Accepted]
32. Case and Persistent Data [Same process, with data objects: req info, result, reimbursement, personal info]
33. Persistent Data Engineering [Same process and data objects, plus: persistent storage, framework, data model, custom code]
34. Case Data Engineering [Same process and data objects, plus: persistent storage, framework, data model, custom code, user forms, external services]
35. Decision Modeling [Same process and data objects, plus: persistent storage, framework, data model, custom code, user forms, external services]
36. A General Recipe • Explicit control-flow • Local, case data • Global, persistent data • Queries/updates on the persistent data • External inputs • Internal generation of fresh IDs 36 “REAL” PROCESS
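As a rough illustration of how the six ingredients fit together, here is a minimal Python sketch of a single reimbursement-like case. Everything in it is hypothetical: the tables `policy` and `request`, the `max_amount` threshold, and the function names are invented for the example and do not describe BizAgi or any other product.

```python
import sqlite3
import uuid


def make_db() -> sqlite3.Connection:
    """Global, persistent data (here: an in-memory SQLite database)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE policy (id INTEGER PRIMARY KEY, max_amount REAL)")
    db.execute("CREATE TABLE request (id TEXT PRIMARY KEY, amount REAL, result TEXT)")
    db.execute("INSERT INTO policy VALUES (1, 100.0)")
    db.commit()
    return db


def run_case(db: sqlite3.Connection, external_input: dict) -> dict:
    # Internal generation of a fresh ID; local case data lives in a dict.
    case = {"id": str(uuid.uuid4())}
    # External input (e.g. from a user form).
    case["amount"] = external_input["amount"]
    # Query on the persistent data.
    (max_amount,) = db.execute(
        "SELECT max_amount FROM policy WHERE id = 1"
    ).fetchone()
    # Explicit control-flow: a choice between two outcomes.
    case["result"] = "accepted" if case["amount"] <= max_amount else "rejected"
    # Update on the persistent data.
    db.execute(
        "INSERT INTO request (id, amount, result) VALUES (?, ?, ?)",
        (case["id"], case["amount"], case["result"]),
    )
    db.commit()
    return case


db = make_db()
print(run_case(db, {"amount": 50})["result"])
```

Even in this toy version, the outcome of the case depends on the database content and on an external value, so the control-flow alone does not determine the behavior.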
37. Cooking with Standard Process Languages • Explicit control-flow • Local, case data • Global, persistent data • Queries/updates on the persistent data • External inputs • Internal generation of fresh IDs 37 BPMN ~ ~
38. Business Process • A set of logically related tasks performed to achieve a defined business outcome for a particular customer or market. (Davenport, 1992) • A collection of activities that take one or more kinds of input and create an output that is of value to the customer. (Hammer & Champy, 1993) • A set of activities performed in coordination in an organizational and technical environment. These activities jointly realize a business goal. (Weske, 2011) Task logic: tightly intertwined with data updates!
41. [Timeline, 1998–2017, ranging from activity-centric to data-centric approaches]
• [IBM J., Nigam and Caswell] Business Artifacts
• [OTM08, Hull] Survey on business artifacts
• [WSFM10, Hull et al.] First paper on IBM GSM
• First draft of OMG CMMN
• Kick-off of the EU Project ACSI
• [BPM09WS, Künzle and Reichert] First paper on PHILharmonicFlows
• [BPM16Forum, Hewelt and Weske] First paper on Chimera
• [BPM10WS, Estañol et al] First paper on BAUML
• [CAiSE17, De Giacomo et al] BPMN with data
• [TMIS16, Sun et al] Universal Artifacts
42. Business Entities/Artifacts Data-centric paradigm for process modeling • First: elicitation of relevant business entities that are evolved within given organizational boundaries • Then: definition of the lifecycle of such entities, and how tasks trigger the progression within the lifecycle • Active research area, with concrete languages (e.g., IBM GSM, OMG CMMN) • Cf. EU project ACSI (completed) 42
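The "entity first, lifecycle second" idea can be illustrated with a small Python sketch. The states and task names below are assumptions loosely inspired by the order-to-delivery example; they are not taken from GSM, CMMN, or any other concrete artifact language.

```python
# Hypothetical lifecycle of a Customer PO artifact:
# (current state, task) -> next state
LIFECYCLE = {
    ("created", "decompose"): "decomposed",
    ("decomposed", "assemble"): "assembled",
    ("assembled", "ship"): "shipped",
    ("created", "cancel"): "canceled",
    ("decomposed", "cancel"): "canceled",
}


class CustomerPO:
    """A business entity: an information model plus a lifecycle state."""

    def __init__(self, po_id: str):
        self.po_id = po_id
        self.line_items = []      # information model (kept minimal here)
        self.state = "created"    # lifecycle state

    def apply(self, task: str) -> str:
        """Let a task trigger the progression within the lifecycle."""
        key = (self.state, task)
        if key not in LIFECYCLE:
            raise ValueError(f"task {task!r} not allowed in state {self.state!r}")
        self.state = LIFECYCLE[key]
        return self.state


po = CustomerPO("po-1")
for task in ["decompose", "assemble", "ship"]:
    print(task, "->", po.apply(task))
```

Unlike the activity-centric view, the "state" of a purchase order is explicit here, and a question such as "can a shipped order still be canceled?" can be answered by inspecting the lifecycle table.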
43. Finite-State Machines
44. Synchronization [Figure: synchronized lifecycles of the Offer and Booking artifacts. State and transition labels: newO, avail, booking, closed, onhold, drafty, canceled, subm, finalized, tbi, accepted, closeO, newB, resume, addP, submit, checkP, detProp, reject, cancel, accept1, accept2, confirm]
45. GSM - CMMN: Guard-Stage-Milestone (GSM), Case Management Model and Notation (CMMN)
47. Chimera [Figures from Mathias Weske, “Novel Challenges in BPM Research”: Object Lifecycles, Process Fragments]
48. Cooking with Business Entities • Explicit control-flow • Local, case data • Global, persistent data • Queries/updates on the persistent data • External inputs • Internal generation of fresh IDs 48 ARTIFACT-/OBJECT-CENTRIC PROCESSES ~ ~ ~ ~
49. Back to the roots…
50. [Timeline, 1998–2017, ranging from activity-centric to data-centric approaches]
• [ICATPN07, Lazic et al.] Data Nets
• [CAiSE10, Sidorova et al.] Conceptual nets
• [TCS11, Rosa-Velardo and de Frutos-Escrig] ν-PNs (nets managing names)
• [PN16, Lasota] Survey on PNs with data
• [PN15, Triebel and Sürmeli] Algebraic PNs
• [ToPNoC17,_] DB-Nets (CPNs + DBs)
• [AAAI17, _] RAW-SYS (Workflow nets + DBs)
• [BPM2013, De Leoni and van der Aalst] DPNs
• [FAOC16, _] Verification of PNs with names
51. Colored Petri Nets [Figure: “Fig. 4.1 Example used to illustrate the formal definitions”, from the chapter “Formal Definition of Non-hierarchical Coloured Petri Nets”. Places: Packets To Send, A, B, C, D, NextSend, NextRec, Data Received. Transitions: Send Packet, Transmit Packet, Receive Packet, Transmit Ack, Receive Ack. Arc inscriptions include (n,d), n, k, data, if n=k then k+1 else k, if n=k then data^d else data, if success then 1`(n,d) else empty, if success then 1`n else empty] Declarations: colset NO = int; colset DATA = string; colset NOxDATA = product NO * DATA; colset BOOL = bool; No conceptual representation of persistent storage
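To make the arc inscriptions on this slide more concrete, the following Python sketch mimics the effect of firing the Receive Packet transition, assuming the usual reading of the figure: the transition consumes a packet (n,d), the counter k from NextRec, and the string from Data Received, and produces the updated counter and string. The function name and the plain-Python encoding are assumptions; this is not CPN ML and not a Petri net simulator.

```python
from typing import Tuple


def receive_packet(packet: Tuple[int, str], k: int, data: str) -> Tuple[int, str]:
    """One firing of 'Receive Packet'.

    packet: token (n, d) of colour NOxDATA
    k:      token on NextRec (colour NO)
    data:   token on Data Received (colour DATA)
    Returns the new NextRec and Data Received tokens.
    """
    n, d = packet
    new_k = k + 1 if n == k else k            # if n=k then k+1 else k
    new_data = data + d if n == k else data   # if n=k then data^d else data
    return new_k, new_data


# Packets may arrive duplicated or out of order; only the expected one is accepted.
k, data = 1, ""
for pkt in [(1, "COL"), (1, "COL"), (3, "ED "), (2, "OUR"), (3, "ED ")]:
    k, data = receive_packet(pkt, k, data)
print(k, repr(data))
```

All the data here live in tokens that flow through the net, which is the point made on the slide: there is no separate, conceptual representation of persistent storage.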
52. Recipe? • Explicit control-flow • Local, case data • Global, persistent data • Queries/updates on the persistent data • External inputs • Internal generation of fresh IDs 52 COLORED PETRI NETS implicit, or using fresh variables
53. Verifiability as a requirement
54. [Timeline, 1998–2017, ranging from activity-centric to data-centric approaches]
• [PODS98, Abiteboul et al.] Relational Transducers
• [ICDT09, Vianu] Verification of artifact-centric processes
• [ICDT05, Vardi] Model checking for database theoreticians
• [ECAI12, _] Knowledge and action bases
• [PODS13, _] Data-Centric Dynamic Systems
• [STTT16, _] Case-centric DCDS
• [PODS13, _] Verification of data-centric processes
• [PODS13, Bojanczyk et al.] Verification via amalgamation
• [AIJ16, De Giacomo et al.] Bounded SitCalc Action Theories
• [I&C17, _] FO μ-Calculus over Generic Transition Systems
• [PODS16, _] Verification via under approximation
55. Formal Verification Automated analysis of a formal model of the system against a property of interest, considering all possible system behaviors 55 picture by Wil van der Aalst
56. Formal Verification: The Conventional, Propositional Case • Process control-flow: finite-state transition system • (Un)desired property: propositional temporal formula • Question: does the transition system satisfy the formula (|=)? • Verification via model checking (2007 Turing award: Clarke, Emerson, Sifakis)
[Inset slide, from Calvanese (FUB), “Foundations of Data-Aware Process Analysis”, INRIA Saclay Paris, 18/3/2016: Abstract model underlying variants of artifact-centric systems. Semantically equivalent to the most expressive models for business process systems (e.g., GSM). Data Layer: relational databases / ontologies; data schema, specifying constraints on the allowed states; data instance: state of the DCDS. Process Layer: key elements are atomic actions, condition-action rules for application of actions, and service calls (communication with external environment, new data!)]
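For the propositional case, the simplest instance of model checking is checking an invariant ("a bad proposition never holds") by exhaustively exploring the reachable states. The Python sketch below does this with a breadth-first search and returns a counterexample trace if one exists. The state names and the proposition `shipped_unpaid` are made up for the example, and full temporal-logic model checking involves considerably more than this reachability check.

```python
from collections import deque
from typing import Dict, Hashable, Iterable, List, Optional, Set


def check_invariant(
    initial: Hashable,
    transitions: Dict[Hashable, Iterable[Hashable]],
    labels: Dict[Hashable, Set[str]],
    bad: str,
) -> Optional[List[Hashable]]:
    """Explore all reachable states of a finite-state transition system.

    Returns None if no reachable state is labeled with `bad`
    (the invariant holds); otherwise returns a shortest path
    from the initial state to a violating state.
    """
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if bad in labels.get(state, set()):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for succ in transitions.get(state, ()):
            if succ not in parent:
                parent[succ] = state
                queue.append(succ)
    return None


# Hypothetical control-flow with four states.
transitions = {"s0": ["s1", "s2"], "s1": ["s3"], "s2": ["s3"], "s3": []}
labels = {"s2": {"shipped_unpaid"}}
print(check_invariant("s0", transitions, labels, "shipped_unpaid"))
```

The search terminates because the state space is finite. This is precisely what is lost in the data-aware case, where states carry database instances over an unbounded domain.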
59. Formal Verification: The Data-Aware Case • Data-aware process: infinite-state, relational transition system [Vardi 2005] • (Un)desired property: first-order temporal formula • Question: does the transition system satisfy the formula (|= ?)
62. Why FO Temporal Logics • To inspect data: FO queries • To capture system dynamics: temporal modalities • To track the evolution of objects: FO quantification across states • Example: It is always the case that every order is eventually either cancelled, or paid and then delivered • N.B.: the interplay between FO quantification and temporal modalities is quite subtle! 62
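One possible way to write the example property as a first-order temporal formula is shown below. This is only a sketch: the LTL-style operators G ("always") and F ("eventually") and the predicate names Order, Cancelled, Paid, and Delivered are assumptions made for illustration, not notation fixed by the slides.

```latex
\[
\mathbf{G}\;\forall x.\;\Big(\mathit{Order}(x) \rightarrow
  \mathbf{F}\,\big(\mathit{Cancelled}(x) \;\lor\;
  (\mathit{Paid}(x) \land \mathbf{F}\,\mathit{Delivered}(x))\big)\Big)
\]
```

The variable x is bound in one state and then inspected in later states, where the corresponding order may have been modified or may no longer exist. How such a formula is evaluated in those states is where the interplay between quantification and temporal modalities becomes subtle.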
64. Dimension 1 Static Information Model How are data structured? • Propositional symbols —> Finite state system • Fixed number of values from an unbounded domain • Full-fledged database: • relational database • tree-structured data, XML • graph-structured data 64
65. Dimension 1 Static Information Model Are constraints present? How are they interpreted? • Complete data • Data under incomplete information • ontology (with intensional part typically fixed) • full-fledged ontology-based data access system • Hard vs. soft constraints (inconsistency tolerance)
66. Dimension 2 Dynamic Component • Implicit representation of time vs. implicit progression mechanism vs. explicit process • When an explicit process is present: • how is the process dynamics represented? • procedural vs. declarative approaches (e.g., finite state machines vs. rule-based) • Deterministic vs. non-deterministic behaviour • Linear time vs. branching time model • Finite vs. infinite traces 66
67. Dimension 3 Data-Process Interaction How are data manipulated by the process? • Data are only accessed, but not modified • Data are updated, but no new values are inserted • Full-fledged combination of the temporal and structural dimensions • Hybrid approaches (e.g., read-only database + read-write registers)
68. Dimension 4 Interaction with the Environment Is the system interacting with the external world? • Closed systems vs. bounded input vs. unbounded input • Synchronous vs. asynchronous communication • Message passing, possibly with queues • One-way or two-way service calls 68
69. Dimension 4 Interaction with the Environment Which parts of the environment are fixed? Which change? • Stateless vs stateful environment • Fixed database vs. varying database vs. varying portion of data • Multiple devices/agents interacting with each other • Fixed vs changing topologies 69
70. Dimension 5 Formal Analysis How are (un)desired properties formulated? • Analysis of fundamental properties: reachability, absence of deadlock, boundedness, (weak) soundness • Analysis of arbitrary formulae in some temporal logic • Analysis of properties with queries across the temporal dimension (in the style of temporal DBs) 70
71. Dimension 5 Formal Analysis Which forms of analysis? • Verification • Dominance, simulation, equivalence • Synthesis from a given specification • Composition of available components 71
72. 1) Go to the essential 2) Find boundaries of decidability in a general setting 3) Understand the connection with concrete languages 4) Implement