High Reliability and the Management of Critical Infrastructures

Paul Schulman*, Emery Roe**, Michel van Eeten***, Mark de Bruijne****

Organisation theorists and practitioners alike have become greatly interested in high reliability in the management of large hazardous technical systems and society’s critical service infrastructures. But much of the reliability analysis is centred in particular organisations that have command and control over their technical cores. Many technical systems, including electricity generation, water, telecommunications and other “critical infrastructures,” are not the exclusive domain of single organisations. Our essay is organised around the following research question: How do organisations, many with competing, if not conflicting goals and interests, provide highly reliable service in the absence of ongoing command and control and in the presence of rapidly changing task environments with highly consequential hazards? We analyse electricity restructuring in California as a specific case. Our conclusions have surprising and important implications both for high reliability theory and for the future management of critical infrastructures organised around large technical systems.

Introduction

Interest among theorists in organisational reliability, the ability of organisations to manage hazardous technical systems safely and without serious error, has grown dramatically in recent years (LaPorte and Consolini, 1991; Rochlin and von Meier, 1994; Schulman, 1993; Perrow, 1999; Roberts, 1993; Sagan, 1993; Sonne, 2000; Weick, Sutcliffe and Obstfeld, 1996; Beamish, 2002; Evan and Manion, 2002). More recently, interest has heightened over “critical infrastructures” and their reliability in the face of potential terrorist attack. Assault on any of the critical infrastructures, such as power, water, telecommunications and financial services, entails great consequences for their users as well as for the other interdependent critical infrastructures (Mann, 2002; Homer-Dixon, 2002).

A momentous debate is taking place among policy and management experts about how best to protect critical infrastructures against attack. What are their key vulnerabilities? To what extent should the operation of critical services be centralised or decentralised? A report released by the United States National Academies of Science and Engineering and the Institute of Medicine argues that for a variety of infrastructures, including energy, transportation, information technology and health care, “interconnectedness within and across systems … means that [the] infrastructures are vulnerable to local disruptions, which could lead to widespread or catastrophic failures” (NRC, 2002).

The highly reliable management of large-scale critical service systems presents a major paradox: demands for ever higher reliability, even against terrorist assault, surround these services as we grow more dependent on them, yet at the same time the conventional organisational frameworks associated with high reliability are being dismantled. Deregulation in air traffic, electricity and telecommunications has led to the unbundling and break-up of utilities and other organisations operating these systems. Increasing environmental requirements have brought new units and conflicting mandates into the management of large technical water systems (Roe and Van Eeten, 2002). Dominant theories would predict that high reliability is unlikely, or at great risk, in these rapidly changing systems. In particular, high reliability in providing critical services has become a process that is achieved across organisations rather than a trait of any one organisation. Critical infrastructures in communication, transportation and water resources all display structurally and geographically diverse constituent elements. The inputs and ‘disturbances’ to which they are subject are also diverse and place them under increasing pressure of fragmentation (e.g., deregulation and regulatory turmoil). Yet they are mandated to provide not just critical services to society, but reliable critical services, notwithstanding their turbulent task environments.

Examples of successful ‘high reliability management’ continue to come forward: mobile telecom services are becoming reliable across ever more complex hardware and service demands; during the California electricity crisis, the lights by and large stayed on; Y2K passed without major incident; financial services recovered fairly rapidly from 9/11; and large hydropower systems reconcile hourly the conflicting reliability mandates across multiple users of water.

How to explain such successful management under conditions where theory tells us to expect otherwise? By way of explanation we present a case study of how the reliability of critical services was maintained in the restructured California electricity sector during the 2000-2001 electricity crisis. This case study has been chosen for two reasons.

First, both Normal Accident Theory (NAT) (Perrow, 1999) and most of the earlier High Reliability Organisations (HRO) research would predict that the restructuring of the California electricity sector should have demonstrably undermined the reliable provision of electricity. For its part, NAT would see the coupling of the state’s two major electricity grids into a newly configured grid, with altogether novel and higher flows of energy, as an increase in the technology’s tight coupling and complex interactivity. The probability of cascading failures would increase accordingly. For its part, the earlier HRO research would have concluded similarly. Since electricity HROs, especially at nuclear power plants, must stabilise both inputs and outputs, reliability would necessarily suffer to the extent that this stability was thrown into doubt (Schulman, 1993). Indeed, subsequent events in California seemed to confirm both theories, as the chief feature of the California electricity crisis has been taken by many to be the unreliability of electricity provision.

However, notwithstanding the popular view of rolling blackouts sweeping California during its electricity crisis, in aggregate terms, in both hours and megawatts (MW), blackouts were minimal. While no figures are available for the baseline before the crisis, it is important to note that rolling blackouts occurred on six days during 2001, accounting for no more than 30 hours. Load shedding ranged from 300 to 1,000 MW in a state whose total daily load averaged in the upper 20,000 to lower 30,000 MW. The aggregate energy actually shed during these rolling blackouts amounted to slightly more than 14,000 megawatt-hours (MWh), the rough equivalent of 10.5 million homes being out of power for one hour in a state having some 11.5 million households, with business and other non-residential customers remaining unaffected. In short, the California electricity crisis had the effect of less than an hour’s worth of outage for every household in the state in 2001. Why the lights by and large actually stayed on, and reliably stayed on, was due, we argue, to factors related to a special type of reliability management.
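The equivalence cited above can be checked with a short back-of-the-envelope calculation. In the sketch below, the aggregate energy shed and the number of households come from the text, while the assumed average household load of roughly 1.3 kW is our illustrative figure, not one reported in the study.

```python
# Back-of-the-envelope check of the outage figures cited above.
# The average household load is an assumption for illustration only.
ENERGY_SHED_MWH = 14_000          # aggregate load shed during the 2001 rolling blackouts
HOUSEHOLDS = 11_500_000           # approximate number of California households
AVG_HOUSEHOLD_LOAD_KW = 1.33      # assumed average household demand (illustrative)

energy_shed_kwh = ENERGY_SHED_MWH * 1_000
household_hours = energy_shed_kwh / AVG_HOUSEHOLD_LOAD_KW   # home-hours without power
hours_per_household = household_hours / HOUSEHOLDS

print(f"Equivalent household-hours of outage: {household_hours / 1e6:.1f} million")
print(f"Average outage per household in 2001: {hours_per_household:.2f} hours")
```

Under that assumption the shed energy corresponds to roughly 10.5 million household-hours, or a little over 0.9 hours per household, consistent with the figures quoted in the text.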

The second reason for focusing on the California case study is that it offers a challenge to the NAT notion of ‘complex interactivity and tight coupling’ as an inevitable source of normal accidents in large technical systems. At least in the case of the electricity critical infrastructure, complexity and tight-coupling actually can be important resources for high reliability management.

Reliable Provision of Electricity Under the California Restructuring: Evidence from the 2000-2001 California Electricity Crisis

In 1996 California adopted a major restructuring of its system of electricity generation, transmission and distribution. Through legislation, the state moved from a set of large integrated utilities, which owned and operated the generation facilities, the transmission lines and the distribution and billing systems and which set retail prices under a cost-based regulatory system, to a market-based system consisting of independent generators who sold their power on wholesale markets to distributors who in turn sold it to retail customers. The major utilities were compelled to sell off their major generating capacity (except for nuclear and hydro sources) and to place their transmission lines under the control of a new organisation, the California Independent System Operator (ISO), which assumed responsibility for managing a new state-wide high voltage electrical grid formed by the merger of two separate grids formerly owned and managed by the two utilities Pacific Gas & Electric (PG&E) in the north and Southern California Edison (SCE) in the south.

The restructuring created a new set of institutions and relationships, many without precedent in the experience or culture of electric power provision. A fuller and more detailed history and description of these changes and relationships is beyond the scope of this paper and can be found in Roe et al. (2002). Here we present the specific findings on the reliability of electricity provision under the performance conditions arising out of California’s electricity restructuring, findings that relate most directly to the different organisations charged with the actual provision of reliable electricity under the restructured conditions, namely the ISO, private generators and distribution utilities, each with competing goals arising out of the deregulation-based restructuring.

Research Methods

Our research team relied on multiple methods, documents and key informants to identify and cross-check our findings. A two-phase study was adopted for this research. In early 1999, we reviewed the literature on deregulation of the energy sector, with special reference to California’s electricity restructuring. During this initial phase, we identified the California Independent System Operator as the focal organisation for primary research. We approached key officials there and received permission to interview staff in and around the ISO main control room. Key informant interviews proceeded by the snowballing technique, in which new interviewees were identified as “people we should talk to” by previous interviewees, until we reached a point where new interviewees were mentioning the same or similar problems and issues heard in previous interviews. An interview questionnaire was used throughout (allowing open-ended responses and follow-up), with face-to-face interviews typically lasting an hour or more. A major portion of the research came from direct observation, watching what control room operators and engineers did in their jobs through many shifts in performance conditions.

We undertook the bulk of investigations between April and December 2001. Sixty interviewees were identified and interviewed: thirty-three in and around the main control room of the ISO; eight at PG&E (e.g., in and around their Transmission Operations Centre and the Operations Engineering Units); five with a large generation supplier (a senior generation official and control room operators in one of its large California plants) and a private market energy trading dot com; and fourteen others located in the Governor’s Office, California Public Utilities Commission, California Energy Commission, Electric Power Research Institute, Lawrence Berkeley National Laboratory, University of California, Berkeley, and Stanford. We were also able to observe control room behaviour at a high point of the electricity crisis in April-May 2001.

Our interviews and observations focused on control rooms. Earlier HRO research as well as more recent investigations (Van Eeten and Roe, 2002) found control rooms to be the one place where HRO features were visible across a wide range of behaviour, namely technical competence, complex activities, high performance at peak levels, search for improvements, teamwork, pressures for safety, multiple (redundant) sources of information and cross-checks, and a culture of reliability, all working through tightly coupled, sophisticated technologies and systems.

Our focus on the control rooms of the ISO, PG&E, and the unnamed private generator also allows us to address the issue of tight coupling and complex interactivity directly. The most remarkable feature we observed was that there was not one operator in the ISO control room who was not tightly linked to the outside through multiple communications and feedback systems. Everyone, all the time, used the telephone; pagers were continually beeping; computers inside the control rooms “talked to” external computers; the Automatic Generation Control (AGC) system connected the ISO generation dispatcher directly to privately-held generators; the Automatic Dispatch System (ADS) connected the dispatcher directly to the bidder of electricity; dynamic scheduling systems in the ISO controlled out-of-state generators; governors on generators automatically brought frequency back into line; the frequency and Area Control Error (ACE) measurements reflected real-time electricity usage across the grid; all kinds of telemetry measurements came back to the control room in real time; web pages used by the ISO, PG&E and private generators carried real-time prices and information; an operator in the ISO control room used software to make the time error correction for the entire grid; and so on.

We used this tight coupling and potential for complex interactions as the basis for our definition of the “high reliability network” of control rooms in the ISO, distribution utilities and private generators responsible for the direct provision of reliable electricity. “Networks” mean many things to many people, but for us California’s “high reliability network” has a very specific meaning: it is the control room operators and staff connected to each other through direct phone lines and speed dial monitors, those quintessential examples of “always on, always available” communications with the greatest potential for operator and technological “error.” For example, the ISO’s generation dispatcher had a touch monitor speed dial that connected the operator directly to others in the distribution utilities and private generators. One connection was to the PG&E control room for its own market trading activities. The PG&E control room operator, in turn, had speed dial contact with the ISO generation dispatcher. Some of the plants the PG&E control room used to contact directly are now owned by private energy suppliers. In our interview with plant operators in the one privately-owned plant, an operator showed us his direct lines, particularly one to the company’s trading floor and traders. These links come full circle: while the ISO generation dispatcher did not have a direct line to the plant we visited, that dispatcher did have one to the supplier’s trading floor, just as did the operator in the plant’s control room. In addition, we suspected that the ISO generation dispatcher might have been calling plant control operators informally during extreme peak demand days. Phone calls by operators who are wired to each other through the other sophisticated technologies and software just identified are the lifeblood of California’s high reliability network for electricity provision.

Research Framework

Figure 1 offers a closer view of the California High Reliability Network (HRN) at the time of research, and as defined above.

[Insert Figure 1]

Six nodes of activity in the HRN were unbundled as a result of electricity restructuring, shown from right to left in Figure 1: the three nodes of generation, transmission and distribution, each containing both a market and a technology subnode. The transmission node is run by the market and high-voltage grid transmission staff located in the ISO: its market desk co-ordinators and the Generation Dispatcher (GD) along with his/her grid support staff in the control room. The distribution node is represented by a utility, PG&E, in its market and lower-voltage grid distribution staff, particularly, at the time of writing, its trading unit (Electricity Portfolio and Operations Services, EPOS) and its distribution control room (Transmission Operations Center, TOC). There are also other distribution utilities, such as SCE. The generation node has both a market or trading-floor subnode and a plant generation or technology subnode.

Restructuring coupled the three major nodes in two ways: through markets and through the grid and its support technology. In principle, markets were to be the main co-ordinating mechanisms for grid operations; that is, market transactions result in the electricity schedules that are to be the basis of grid operations. In terms of Figure 1, PG&E’s TOC, the ISO’s generation dispatcher, and the generation plant are organised around the grid, while PG&E’s EPOS, the ISO’s markets, and the private generator’s trading floor were organised around market operations. Reflecting this functional interconnection, market people tended to talk to market people, just as grid people tended to talk to grid people across the HRN’s three unbundled nodes.

Many relationships are possible between and among the three nodes and their market and grid subnodes. In practice, however, formal relations are circumscribed by objective, mandate, legislation and regulation. Within each node, market and grid operations are to be kept separate: PG&E’s EPOS was not meant to be in communication with the TOC; generation plant staff, not the trading floor, were meant to have final authority over unscheduled outages; and the ISO’s ability to undertake outage co-ordination activities with the generators was highly restricted by limits on what types of information either node could obtain from the other. Informal communications outside formally restricted channels, however, continued to be important for ensuring the reliability of electricity in volatile situations, requiring a bridge between the profit motive on the part of the generator or trading unit and the reliability mandate on the part of the transmission control room.

Because unstable (that is, unpredictable or uncontrollable) situations increase the pressure to communicate between units, especially units within the same organisation, an extensive set of support staff has grown up around the market and grid operations rooms of each unbundled node and its market and technology subnodes. We call this supporting staff the “wraparound”; in the ISO’s case, the staff supporting market and grid operations are literally circled around the ISO’s control room. We have found these wraparound supporting control room operations in other critical infrastructures as well, such as with water and telecommunications. A wraparound is the site of formal and informal communication with counterparts in the other wraparounds, especially during crisis situations.

The respective control room and the wraparound form an organisational infrastructure around the market and the technology operations, which we term the market matrix and technology matrix. Each matrix connects the market or the technology subnodes across the three nodes of generation, transmission and distribution. In Figure 1, the horizontal ‘market’ bar represents the flow of market transactions; the horizontal ‘technology’ bar represents the flow of electricity on the grid.

To a degree, the market and grid flows were also separated in the old utilities. The important difference between the California HRN and the older integrated utilities, however, is that the markets, under restructuring, were supposed to co-ordinate across the matrices, with the focal organisation for co-ordination being the ISO. In reality, extensive organisational interactions between and among control rooms and wraparounds are needed so that markets can operate, to connect market transactions to the physical properties of the grid and, ultimately, to avoid grid collapse and decomposition or “islanding”. Both the technology and market matrices are “open” at their ends, as the California grid is itself part of the Western grid and the “California” electricity markets are themselves already globalised in important respects; Mirant Energy Corporation, for example, was not only a California but an international energy supplier at the time of our research. This openness, combined with the divergent interests of the generators, utilities and the ISO, defines the “open systems” feature of the HRN.

Research Findings

Our research question was: How did this tightly coupled, highly interactive “network” of control rooms, operating within organisations and systems having different mandates and interests, actually ensure the provision of reliable electricity during the California electricity crisis? Our answer: the focal organisation, the ISO, balances load and generation in real time (that is, in the current hour or for the hour ahead) by developing and maintaining a repertoire of responses and options in the face of unpredictable or uncontrollable system instability produced either within the network (e.g., by generators acting in a strategic fashion) or from outside the network through its open system features (e.g., temperatures and climate change). ‘Load’ is the demand for electricity and ‘generation’ is the electricity to meet that load, both of which must be balanced (i.e., made equal to each other) within prescribed periods of time; otherwise service delivery is interrupted as the grid physically fails or collapses. We call this need to balance load and generation, along with meeting other related limits, the “reliability requirement” of the ISO control room operators.

In meeting the reliability requirement the ISO generation dispatcher, commonly known as the gen-dispatcher, manages the grid in real time by estimating how much of an increase or decrease in energy is needed to control the Area Control Error (ACE), which shows the relative balance between generation and load in California’s grid. Maximum fluctuations in the ACE are set by the wider reliability criteria of the Western Electricity Co-ordinating Council (WECC), which sets standards of operation for the Western region. It is the task of the generation dispatchers to keep the imbalances of the ISO’s grid within these bandwidths. How well the generation dispatcher does his or her work determines the number of violations of control performance standards (CPS) and disturbance control standards (DCS) the ISO faces. Such control area reliability and performance standards are set by the WECC and the North American Electric Reliability Council (NERC).

It is within these constraints that the generation dispatcher tries to control the grid. To keep the system within the parameters, the generation dispatcher closely watches the frequency and ACE trends, the output of all power plants and some vital path indicators on his or her monitor to determine the stability state and “movement” of the grid on the ACE and frequency parameters. (The frequency standard within the U.S. has been set at 60.0000 Hertz and is an inherent characteristic of the stability of the grid.) To adjust frequency and ACE, the generation dispatcher can order contracted power plants to increase or decrease their electricity output. But it takes time before the effects of the dispatching order can be seen in the grid’s behaviour. Power plants increase or decrease their power output rather slowly, and it takes some time before the cumulative effects of these changes show up in the ACE and the frequency.
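For readers unfamiliar with these quantities, the sketch below illustrates, in simplified form, the kind of calculation behind the ACE figure the dispatcher watches (interchange deviation minus a frequency-bias term) and a ramp-limited correction. The frequency bias value, the ramp limit and the proportional dispatch rule are illustrative assumptions, not ISO practice.

```python
# Simplified sketch of the quantities the generation dispatcher watches.
# The ACE expression follows the standard control-area form; all numbers are illustrative.

FREQ_SCHEDULED_HZ = 60.0
FREQ_BIAS_MW_PER_0_1HZ = -300.0   # assumed frequency bias for the control area

def area_control_error(actual_interchange_mw: float,
                       scheduled_interchange_mw: float,
                       actual_freq_hz: float) -> float:
    """ACE > 0: over-generation relative to schedule; ACE < 0: under-generation."""
    interchange_deviation = actual_interchange_mw - scheduled_interchange_mw
    freq_deviation = actual_freq_hz - FREQ_SCHEDULED_HZ
    return interchange_deviation - 10 * FREQ_BIAS_MW_PER_0_1HZ * freq_deviation

def dispatch_adjustment(ace_mw: float, max_ramp_mw: float = 100.0) -> float:
    """Ramp-limited correction: raise generation when ACE is negative, lower it when positive."""
    correction = -ace_mw
    return max(-max_ramp_mw, min(max_ramp_mw, correction))

# Example: importing 150 MW more than scheduled while frequency sags to 59.98 Hz.
ace = area_control_error(actual_interchange_mw=-150.0,
                         scheduled_interchange_mw=0.0,
                         actual_freq_hz=59.98)
print(f"ACE: {ace:.0f} MW, ordered change: {dispatch_adjustment(ace):+.0f} MW")
```

In the example the control area is importing more than scheduled while frequency sags, yielding a negative ACE of roughly 210 MW; the ordered correction is capped by the assumed ramp limit, which is the lag in plant response the text describes.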

With that background in mind, our research led us to focus on the match between, on the one hand, the options and strategies within the HRN to achieve its reliability requirement (namely, balancing load and generation, staying within standards set for key transmission lines or “paths”, while meeting the other parameter constraints) and, on the other hand, the unpredictable or uncontrollable threats to fulfilling the reliability requirement. A match results from having at least one option sufficient to meet the requirement under given conditions. At any point, there is the possibility of a mismatch between the system variables that must be managed to achieve the reliability requirement and the options and strategies available for managing those variables.

The match between option and system requirements can be visualised in the following way. Within the HRN, the core reliability tasks are located in the focal organisation, the ISO’s main control room. It is the only unit that simultaneously has the two reliability mandates: keeping the flow and protecting the grid. Meeting the dual reliability mandate involves managing the options and strategies that co-ordinate actions of the independent generators, energy traders and the distribution utilities in the HRN. As the focal organisation, the options the ISO control room deploys are HRN-wide options, e.g., outage co-ordination is the responsibility of the ISO, but involves the other partners in the high reliability network.

In other words, the ISO’s control room management can be categorised in terms of the variety of HRN-based options the ISO has available (high or low) and the instability of the California electricity system (high or low), as set out in Figure 2.

[Insert Figure 2]

Instability is the extent to which the focal control room in the ISO faces rapid, uncontrollable changes or unpredictable conditions that threaten the grid and the service reliability of electricity supply, i.e., that threaten the task of balancing load and generation. Some days are days of low instability, fondly called “normal days” in the past. A clear example of high instability is a day for which a large part of the forecasted load has not been scheduled through the day-ahead market, which means for the ISO that actual flows are unpredictable and congestion will have to be dealt with at the last minute. Additionally, any loss of transmission or generating capacity can introduce instability into the system.

Options variety is the amount of HRN resources, including strategies, available to the ISO control room to respond to events in the system in order to keep load and generation balanced at any specific point in time. It includes the available operating reserves and other generation capacity, the available transmission capacity and the degree of congestion. High options variety means, for instance, that the ISO has available to it a range of resources and can operate well within required regulatory conditions. Low options variety means the resources are below requirements and, ultimately, that very few resources are left and the ISO must operate close to, or even in violation of, some regulatory margins.

These two dimensions together set the conditions under which the ISO control room has to pursue its high reliability management. They demand, and we observed, four different performance modes for achieving reliability (i.e., balancing load and generation), which we term: just-in-case, just-in-time, just-for-now, and just-this-way. “Low” and “high” are obviously imprecise terms, though they are the terms used and commonly recognised by many of our ISO interviewees. In practice, the system instability and options variety dimensions are better thought of as continua without rigid high/low cut-off points. Let us turn now to a brief description of each performance mode.
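Before those descriptions, the two-by-two logic of Figure 2 can be summarised schematically. The sketch below simply encodes the four cells named in the text; the boolean cut-offs are a deliberate simplification of what interviewees treat as continua.

```python
# Minimal encoding of the Figure 2 classification; labels follow the text,
# the hard high/low thresholds are a simplification for illustration.
from enum import Enum

class PerformanceMode(Enum):
    JUST_IN_CASE = "just-in-case"    # high options variety, low instability
    JUST_IN_TIME = "just-in-time"    # high options variety, high instability
    JUST_FOR_NOW = "just-for-now"    # low options variety, high instability
    JUST_THIS_WAY = "just-this-way"  # low options variety, (lowered) instability

def performance_mode(options_variety_high: bool, instability_high: bool) -> PerformanceMode:
    if options_variety_high and not instability_high:
        return PerformanceMode.JUST_IN_CASE
    if options_variety_high and instability_high:
        return PerformanceMode.JUST_IN_TIME
    if not options_variety_high and instability_high:
        return PerformanceMode.JUST_FOR_NOW
    return PerformanceMode.JUST_THIS_WAY

# A volatile day with ample resources falls in the just-in-time cell.
print(performance_mode(options_variety_high=True, instability_high=True))
```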

Just-in-case performance, redundancy and maximum equifinality

When options are high and instability low, just-in-case performance is dominant in the form of high redundancy. Reserves available to the ISO control room operators are large, excess plant capacity (the bête noire of so many deregulation economists) exists at the generator level, and the distribution lines are working with ample backups, all much as forecasted with little or no unpredictability and/or uncontrollability. More formally, redundancy is a state where the number of different but effective options to balance load and generation is high, relative to the market and technology requirements for balancing load and generation. There are, in other words, a number of different options and strategies to achieve the same balance. The state of high redundancy is best summed up as one of maximum equifinality, i.e., a multitude of means exist to meet the reliability requirement.

Just-in-time performance, real-time flexibility and adaptive equifinality

When options and instability are both high, just-in-time performance is dominant. Option variety to maintain load and generation remains high, but so is the instability of system variables, in both markets (e.g., rapid price fluctuations leading to unexpected strategic behaviour by market parties) and technology (e.g., sagging transmission lines during unexpectedly hot weather).

How does just-in-time performance work? Operators told us about days that started with major portions of the load still not scheduled and with the predictability of operations significantly diminished. Reliability becomes heavily dependent on the ISO control room’s ability to pull resources and the balance together up to the last minute. Because of the time pressure this brings with it, operators cannot rely completely on their highly specialised tasks and procedures, but initiate a great deal of lateral communication to quickly and constantly relay and adapt all kinds of information. We call this “keeping the bubble” with respect to the variables that need to be managed given the performance conditions operators face. They not only have to respond quickly to unpredictable and uncontrollable events. They also have to ensure that their responses are based on an understanding of the variables, so that these responses do not exacerbate the balance problem, especially as confusion over what is actually happening can be intense at these times, as can the risk of cascading variables. It is no longer possible to separate beforehand important and unimportant information. People from the wraparound are pulled into real-time operations to extend the capability to process information quickly and synthesise it into a “bubble” of understanding of the many more variables and complex interactions possible in just-in-time performance.

This performance condition demands “real-time” flexibility, that is, the ability to utilise and develop different options and strategies quickly in order to balance load and generation. Since operators in the control room are in constant communication with each other and with others in the HRN, options are reviewed and updated continually, and informal communications are much more frequent. Flexibility in real time is the state where the operators are so focused on meeting the reliability requirement and the options to do so that more often than not they customise the match between them, i.e., the options are just enough and just-in-time. The fact that the instability is high focuses operator attention on exactly what needs to be addressed and clarifies the search for adequate options and strategies. What needs to get done gets done with what is at hand as it is needed.

More formally, the state of real-time flexibility is best summed up as adaptive equifinality: There are effective alternative options, many of which are developed or assembled as required to meet the reliability requirement. The increased instability in system behaviour is matched by the flexibility in the focal organisation in using network options and strategies for keeping performance within reliability tolerances and bandwidths. Substitutability of options and strategies is high for “just-in-time” performance, an immensely important point to which we return at the article’s end. As one ISO control room shift manager put it, “In this [control room] situation, there are more variables and more chances to come up with solutions.” “It’s so dynamic,” said one of the ISO’s market resource co-ordinators, “and there are so many possibilities . . . Things are always changing.”

Just-for-now performance and maximum potential for deviance amplification

When options variety is low but instability is high, just-for-now performance is dominant. Options to maintain load and generation have become visibly fewer and potentially insufficient relative to what is needed in order to balance load and generation. This state can arise for various reasons related to the behaviour of the open system that is the California energy sector. Unexpected outages can occur; load may increase to the physical limits of transmission capacity; and the use of some options can preclude or exhaust other options, e.g., using stored hydro capacity now rather than later. In this case, unpredictability or uncontrollability has increased while the variety of effective options and strategies has diminished. Here, “crisis management” begins to come into play, e.g., the ISO’s declaration of a “Stage 1” or “Stage 2” emergency (public alerts based upon reserve generation capacity falling below seven and five percent of current load respectively) may compel an ISO senior manager to go outside official channels and call his counterpart at a private generator, who agrees to keep the unit online, “just for now.”
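The staged-emergency logic mentioned here, together with the Stage 3 declaration discussed below, can be summarised in a small decision sketch. The seven and five per cent reserve thresholds come from the text; the Stage 3 trigger (an assumed 1.5 per cent floor at which firm load is shed) is our simplification for illustration.

```python
# Sketch of the staged-emergency thresholds described in the text.
# Stage 1 and Stage 2 thresholds are from the paper; the Stage 3 floor is assumed.

def emergency_stage(operating_reserve_mw: float, current_load_mw: float) -> int:
    """Return 0 (no emergency) or the declared stage 1-3 based on the reserve margin."""
    reserve_margin = operating_reserve_mw / current_load_mw
    if reserve_margin < 0.015:   # assumed floor at which firm load shedding begins
        return 3
    if reserve_margin < 0.05:    # Stage 2: reserves below five per cent of load
        return 2
    if reserve_margin < 0.07:    # Stage 1: reserves below seven per cent of load
        return 1
    return 0

# Example: 1,500 MW of reserves against a 32,000 MW load (about 4.7%) -> Stage 2.
print(emergency_stage(operating_reserve_mw=1_500, current_load_mw=32_000))
```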

More formally, just-for-now performance is a state summed up as one of maximum potential for “deviance amplification”: Even small deviations in elements of the market, technology or other factors in the system can ramify widely throughout the system (Maruyama, 1968). Marginal changes can have maximum impact in threatening the reliability requirement, i.e., the loss of a low-megawatt generator can tip the system into blackouts. From the standpoint of reliability, this state is untenable. Here people have no delusions that they are in control. They understand how vulnerable the grid is, how limited the options are and how precarious the balance; they are keeping communication lines open to monitor the state of the network; and they are busily engaged in developing options and strategies to move out of this state. They are not panicking and, indeed, by prior design, they still retain the crucial option to reconfigure the electricity system itself, by declaring a Stage 3 emergency (see below).
“Just-for-now” performance is also very fast-paced and best summed up as “firefighting.” When options become few and the room for manoeuvre is boxed in (e.g., when load continues to rise while new generation becomes much less assured and predictable), control operators become even more focused on the big threats to balancing load and generation. As options become depleted, wraparound staff in the control room come to have little more to add. There is less need for lateral, informal relations. Operators even walk away from their consoles and join the others in looking up at the big board on the side wall. “I’m all tapped out,” said the gen-dispatcher on the day we were there when the ISO just escaped issuing a Stage 3 declaration. Operators and support staff are waiting for new, vital information, because they are out of other options for controlling the ACE themselves.

Just-this-way performance, crisis management and zero equifinality

In this last performance mode for balancing load and generation, system instability is lowered to match low options variety and just-this-way performance is dominant. This performance state occurs in the California electricity system as a short-term “emergency” solution. In an electricity crisis, the option is to tamp down instability directly with the hammer of crisis controls and forced network reconfigurations. The ultimate instrument of crisis management strategy is the Stage 3 declaration, which requires interruption of firm load in order to bring back the balance of load and generation from the brink of just-for-now performance. The effect of a Stage 3 declaration is to reconfigure the grid into a system under command and control management. Load reductions can be ordered from major wholesale electricity distributors.

More formally, just-this-way performance is a state best summed up as one of zero equifinality: whatever flexibility could be squeezed through the remaining options and strategies is forgone on behalf of maximum control of a single system variable, in this case load. The Stage 3 declaration has become both a necessary and sufficient condition for balancing load and generation, again by reducing load directly. This contrasts significantly with the other three performance conditions, where the options and strategies are sufficient without being necessary. Under “just-this-way” performance conditions, the decision to shed load has been taken and information is now centred around compliance. The vertical relations and hierarchy of the control room extend into the HRN, even to the distribution utilities in their rotating outage blackouts. Formal rules and procedures move centre stage, including those for declaring and ending the Stage 3 emergency. Here the HRN looks most like the structure it was before restructuring, but only because of the exceptional vulnerability of grid reliability to threats. Interruptions to service become paramount. With Stage 3, operators reassume their responsibilities and roles. Wraparound staff remain ready to help and back up, if and when needed.

The operational differences we observed in the ISO control room and the larger HRN under these four performance modes are summarised in Table 1.

[Insert table 1]

To conclude this section, note that one of the important features of the reliability management of the ISO within the HRN is the large proportion of that management which occurs in real time, that is, under conditions of high system instability. Estimates given by ISO participants of time spent in just-in-time and just-for-now performance modes ranged from 75 to 85% in mid-2002, a year after the state’s famous electricity “crisis”. In April 2003, the percentage was still some 60%, even after another year’s stabilisation efforts (including the must-offer requirements placed on generators and substantial new generation coming on line). This is a departure from the large preponderance of anticipatory, just-in-case management found in much of the earlier HRO research. Indeed, the California system cannot be reliable with respect to grid or service reliability without having the options to perform just-in-time or just-for-now. What this means is that the ISO control room and others in the HRN are engaged in a very different kind of reliability management from that found in much of the earlier HRO research.

Reliability Management Revisited

Our interviewees told us that there was not one official reliability standard that had not been pushed to its limits and beyond in the California electricity crisis. The emerging reliability criteria, standards and associated operational bandwidths have one common denominator: they reflect the effort by members of the HRN, particularly the ISO control room, to adapt reliability criteria to circumstances that they can actually manage, where those circumstances are increasingly real time in their urgency. What cannot be controlled “just-in-case” has to be managed “just-in-time;” if that does not work, performance has to be “just-for-now;” or, barring that, “just-this-way” by shedding load directly. In each instance, reliability becomes the bar that the operators can actually clear.

The standards at issue are many. Most important, operating reserve limits have not only been questioned, but the system has operated reliably (in particular, peak-load demands have been met continuously and safely) at lower reserve levels than officially mandated in WECC standards. CPS criteria have been disputed and efforts are underway to change them. There has been mounting pressure to justify empirically standards that were formulated ex cathedra in earlier periods and have since become “best operating practices.” One senior ISO engineer told us,
“Disturbance control standards (DCS) says that if I have a hit, a unit goes offline, ACE goes up, I have to recover within ten minutes. Theory was that during that time you were exposed to certain system problems. But who said 10 minutes? God? An engineering study? What are the odds that another unit will go offline? One in a million? So now with WECC we have turned the period into 15 minutes, because the chance of another [unit] going offline is low, as we know from study of historical records. . . .”
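The engineer’s reasoning, that the risk of a second unit tripping during the recovery window is small, can be illustrated with a simple calculation. The sketch below assumes, purely for illustration, that large forced outages arrive as a Poisson process at an assumed rate; neither the rate nor the model is taken from WECC’s historical studies.

```python
import math

# Illustration of the engineer's argument: how likely is a second outage
# during the DCS recovery window? Rate and model are assumptions, not WECC data.
FORCED_OUTAGES_PER_YEAR = 20      # assumed rate of large forced outages in the control area
rate_per_minute = FORCED_OUTAGES_PER_YEAR / (365 * 24 * 60)

def prob_second_outage(window_minutes: float) -> float:
    """Probability of at least one further outage within the recovery window."""
    return 1 - math.exp(-rate_per_minute * window_minutes)

for window in (10, 15):
    print(f"{window}-minute window: {prob_second_outage(window):.5f}")
```

Even under a generous assumed outage rate, lengthening the window from ten to fifteen minutes changes the probability of a coincident second outage only marginally, which is the gist of the engineer’s argument.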

There is, in fact, a paradox between having reliability standards and having multiple ways to produce electricity reliably. On the one hand, the standards are operationalised (in WECC terms), and the fact that performance can be empirically gauged against these operational measures is the chief measuring stick of whether electricity is being provided reliably or not. On the other hand, the standards were everywhere being redefined in the crisis, when not questioned outright, because there were an increasing number of times when the lights were kept on only by pushing the standards to their limits and sometimes beyond. A senior manager in the ISO operations engineering unit, responsible for a large body of procedures, told us: “part of the [control room] experience is to know when not to follow procedures…there are bad days when a procedure doesn’t cover it, and then you have to use your wits.”

Perhaps the best way to understand the distinctiveness of this approach to reliability is to compare and contrast the principal features of traditional HROs (such as the Diablo Canyon nuclear power plant) with the approach taken in the California high reliability network, both of which seek and achieve high reliability in connection with electricity.

[Insert table 2]

Table 2 can be seen as a multiple-dimensioned gradient in high reliability service provision between two approaches to reliability management of the tightly coupled, complexly interactive technologies in our electricity critical infrastructure. The dimensions that differ are telling. The differences with respect to trial-and-error learning and redundancy are particularly important to understand.

The earlier literature on HROs found that they avoided anything like the large-scale experimentation we found in the California HRN. The improvisational and experimental are what we have summed up as “adaptive equifinality” in “just-in-time” performance. What was unacceptable in the HRO has become a sine qua non for service and grid reliability in the HRN, but only in real time.

There were experiments on the large scale, that is, on the scale of the California grid as a whole; they were grid-wide because the ISO could not do otherwise. They were undertaken involuntarily: the ISO had little choice, for example, in introducing “proxy marketing” software over the whole grid. Over the course of the day it was introduced, complaints were made that price information was wrong, numbers were not showing up, and the information was not in real time.

Why experiment this way? Because the status quo had become untenable for the system operator, thanks to increased instabilities introduced into the electricity system through the restructuring-induced electricity crisis. Bids from market traders were not coming in and operators had to do something (we now know that some of the bids were being withheld for strategic reasons by the private energy traders). The “something” that the ISO did was to create proxy bids for the remainder of generation capacity not bid in by the energy traders. In such ways, real-time operations and experiments become synonyms. Moreover, even involuntary experiments can become occasions for learning, much as near misses are in other sectors. The real-time experiment, deliberate or inadvertent, became a design probe from which operators could learn more about the limits of service and grid reliability. “You don’t learn as fast as you can, until you have to respond to something that requires fast response,” argued a senior control room manager at PG&E’s transmission operations centre.

The traditional HRO nuclear power plant would never undertake (or in all likelihood be allowed by regulatory agencies to undertake) such experiments, as it seeks both stable inputs (for example, the predictable quality of parts and supplies through “nuclear grade” regulations, isolation from grid dispatch systems, even control of its surrounding environment through guns and guards) and stable outcomes. In the HRN’s case, there is no stable resting point for the grid and its demands, because there are few routines and operating procedures that can stabilise inputs (e.g., load, generation availability or grid conditions) in order to facilitate highly reliable outputs. One reason why operations-as-experimentation can continue is that shedding load is a live option for maintaining real-time reliability. If “just-in-time” performance fails, operators can always shift into “just-for-now” or “just-this-way” performance modes.

NAT Revisited

What we see in the findings from the California HRN is that both complexity and tightly-coupled interactions can serve, and often do serve, as positive sources of high reliability performance. The complexity actually allows adaptive equifinality in just-in-time performance because many variables are in play, such that some combination of options can be stitched together at the last minute. The tight-coupling actually positions operators within the ISO where they can implement their reliability “solutions” as well as exercise the Stage 3 option of command and control.

An important aspect of this real-time reliability is substitutability. Substitutability is key for “just-in-time” performance, where system instabilities are compensated for by flexibility in responses. This view contrasts with that of Normal Accidents Theory. For NAT, the ability to substitute elements is a property of loosely coupled systems and linearly interactive ones. According to Perrow,
“What is true for buffers and redundancies is also true for substitutions of equipment, processes and personnel. Tightly coupled systems offer few occasions for such fortuitous substitutions; loosely coupled ones offer many.” (Perrow, 1999 (1984), 96).

Yet we found that the same substitutability is key to “just-in-time” performance, when the technology of the grid remains as tightly coupled as it has always been.

The primary reason why tight coupling and complexity were found to serve as resources or options in the HRN case study, but not in the earlier Diablo Canyon case, lies in the differing roles played by causal analysis. Near-complete causal analysis is central to the high reliability performance of traditional HROs; not so for the HRN just described. In the HRN, pattern recognition in meeting the reliability requirement is especially important. The urgency of real time makes it crucial to “read” feedback in terms of signature events that can guide the balancing of load and generation, without operators having to possess full causal knowledge of the system in the process. This substitution of reliability-enhancing signature events for complete causal understanding of the system is particularly important, as the earlier HRO research found nearly complete causal understanding necessary for reliability, while its absence increases the risk of normal accidents.

None of this is to say that tight coupling and complex interactivity are not a problem for the electricity critical infrastructure as it currently exists. They are, but in a different way than NAT poses. There are indeed risks and dangers of errors arising from misjudgement (important in “just-in-time” performance) and from exhausting options without room to manoeuvre (important in “just-for-now” performance). We return to these situations in the concluding section of the paper.

The Role of Reliability Professionals

According to NAT, tightly coupled and complexly interactive technologies are particularly hazardous because each element summons up a contradictory management strategy. Tight coupling means, according to Perrow, that operators require centralisation of authority and operations, with unquestioned obedience and immediate response capability. Complex interactivity, in contrast, requires decentralisation of authority to cope with unplanned interactions on the ground by those closest to the system. What this description highlights is, in our terms, the trade-off between resistance and resilience in management.

The question is: Who does this balancing between resistance and resilience? Who reconciles the need for anticipation and careful causal analysis with the need for flexibility and improvisation in the face of turbulent inputs into complex and tightly coupled systems? In our research we found a crucial, if neglected, role for middle-level professionals (controllers, dispatchers, technical supervisors and department heads) in doing the balancing act so necessary to the real-time reliability of these networked systems. We term them “reliability professionals” in recognition of their overriding commitment to the real-time reliability of their systems, and the unique set of skills they bring to their tasks.

The quest for high reliability in tightly coupled, highly interactive critical infrastructures can be characterised briefly along two dimensions: (1) the type of knowledge brought to bear on efforts to make an operation or system reliable, and (2) the focus of attention or scope of these reliability efforts. The knowledge base from which reliability is pursued can range from formal or representational knowledge, in which key activities are understood through abstract principles and deductive models based upon these principles, to experience, based on informal or tacit understanding, generally derived from trial and error. At the same time, the scope of attention can range from a purview which embraces reliability as an entire system output, encompassing many variables and elements of a process associated with producing a stream of reliable results, to a case-by-case focus in which each case is viewed as a particular event with distinct properties or characteristics. These two continua of knowledge and scope define a conceptual and perceptual space within which reliability can be pursued. Figure 3 below illustrates the point.

[Insert Figure 3]

At the extreme of both scope and formal principles is the formal design approach to reliability. Here formal deductive principles are applied to understanding a wide variety of critical processes. It is considered inappropriate to operate beyond the design analysis, and that analysis is meant to cover an entire reliability system, including every last case to which that system can be subjected. The design in this sense is more than analysis; it is a major control instrument for the behaviour of the system. This is the approach that dominates in a conventional HRO, such as a nuclear power plant, where operating “outside of analysis” is a major regulatory violation. At the other extreme is the activity of constant reactive behaviour in the face of case-by-case challenges. Here reliability resides in the reaction time of control room operators rather than in the anticipation of system designers.

Both positions are, however, extremes. Each alone is insufficient as an approach to providing grid and service reliability, though each is necessary. Designers cannot foresee everything, and the more “complete” a logic of design principles attempts to be, the more likely it is that the full set will contain two or more principles which contradict each other. On the other hand, case-by-case reactions by their very nature are likely to give the operator too specific and individualised a picture, losing sight of the proverbial forest for the trees. Experience can become a “trained incapacity” that leads to actions undermining reliability because operators may not be aware of the wider ramifications of what they are doing.

What to do? First, “moving horizontally” across the reliability space directly from one corner across to the opposite corner (i.e., upper left to upper right, lower right to lower left) is unlikely to be successful. A great deal of reliability research supports the finding that attempts to impose large-scale formal designs directly onto an individual case, that is, to anticipate and fully deduce and determine the behaviour of each instance, are freighted with risk (Turner, 1978; Perrow, 1983). Yet this is what system designers attempted at times to do to secure reliability in the California system: to, as one engineer described it, “design systems that are not only foolproof but damned fool-proof.”

At the same time, trying to account for an entire system on the basis of first-hand experiential knowledge can scarcely be successful either. Generalisations and some formalisation are necessary in order to store, transmit and apply the knowledge necessary to manage complex systems.

Instead of horizontal, corner-to-corner movements, Figure 3 indicates that reliability is enhanced when shifts in scope are accompanied by shifts in the knowledge base. Given the limitations of the extremes in this reliability space, it becomes important to operate in positions closer to a shared centre by (1) moving from reactions to building pattern recognitions and strategic adaptations and (2) moving from designs to contingency planning and scenario building. It is difficult to tack to this middle ground, to combine macro and micro perspectives on a complex system and to bring together logic and experience and (perhaps even more difficult) theory and practice. It is, however, in this middle ground where doctrine is tempered by experience, discretion added to design, and shared views reconciled with individualised perspectives. Here a high degree of improvisation and inventiveness is often in evidence.

The skill to pursue reliability from this centre ground seems to derive less from disciplines than from professions, less from specialised training than from careers that span a variety of positions and perspectives. The middle ground, in a phrase, is the domain of the reliability professional. The importance of these middle-level reliability professionals to the high reliability of complex technologies such as the California HRN could hardly be overstated. Yet they are typically neglected by system designers, regulators and the public alike. It is with this group and their professionalism that we believe the greatest gains in grid and service reliability are to be found, both practical and conceptual. The next step in developing high reliability theory should be to better understand these professionals, their work and the cognitive skills they bring to bear on it.

More to the point, the stakes have never been higher in getting the argument right. We raise one final issue to underscore this point: the reliability of critical infrastructures in the face of terrorist assault.

Potential Security Implications

At issue for Making the Nation Safer and similar security analyses is the notion that critical infrastructures have discrete points of vulnerability or ‘choke points’, which, if subject to attack or failure, threaten the performance of the entire infrastructure. Key transmission lines upon which an electrical grid depends, or single security screening portals for airline passengers, or a single financial audit on which countless investment decisions are predicated are points of vulnerability which can lead to cascading failures throughout the whole system. The more central the choke point to the operation of the complex system, so this logic goes, the more dependent and susceptible the system is to sabotage or failure.

Currently, the dominant recommendation is to redesign these systems by decentralising them so as to render them less interdependent. Decentralise power grids and generators, making smaller, more self-contained transmission and distribution systems. A major assumption is that decentralisation, looser coupling and reductions in the scale of critical systems will reduce their vulnerability to terrorist attack. Indeed the existence of choke points, it has been argued, signals the vulnerability of these systems, both inviting and guiding terrorist attack.

But our research suggests that this perspective on vulnerability to terrorists should be considered more carefully. In fact, redesigns undertaken from the current perspective might undermine some of the very features that ensure the operational reliability of these systems in the first place. As we have just seen, reliability has been rooted in the processes that take place within the tightly coupled, complexly interactive infrastructure: the behaviours, adaptations and innovations that take place when things do not go as anticipated. It is precisely this aspect of reliability, the skills, experience and knowledge of reliability professionals, that could be lost in a preoccupation with design-focused perspectives on security.

While the current view sees choke points as the problem and decentralisation or de-concentration as the answer, this is not at all clear from the perspective of reliability professionals. For them, tight-coupling can be a system resource as well as a source of vulnerability. Choke points are the places where terrorists will likely direct their attention, but they are also, from an operational standpoint, the places to which reliability professionals, with their trained competencies in anticipation and recovery, are most attentive.

Furthermore, from an operational standpoint in relation to terrorism, our research suggests that the prevailing view of reliability could be stood on its head. Complex, tightly coupled systems convey reliability advantages to those trained professionals who seek to protect or restore them against terrorist assault. Their complexity allows for multiple strategies of resilience. Their tightly coupled choke points allow these professionals to see the same portion of the system as the terrorists, and position them to see it whole against a backdrop of alternatives, options, improvisations and counter-strategies.

In contrast, it is loosely coupled, decentralised systems that present many more independent targets to terrorists: attackers can strike anywhere and, while they may not bring down major portions of the grid, they can still score points in the psychological game of terror and vulnerability. Local managers, moreover, will probably not have a clear picture of what is happening overall, nor will they have as wide a range of alternatives and recovery options available.

Whatever their future vulnerability, it seems clear that more and more critical infrastructures in our society will consist of high-performance interdependent systems across which highly reliable performance is an ongoing requirement. These sets of highly interdependent organisations seem to be consistently pushed to the edge of their design envelopes, under pressure to maximise, if not optimise, their performance. They must subsist in unstable economic, political and regulatory environments. Perhaps never before has our basic understanding (the “reliability” of reliability theory or the “normal” of normal accidents) been of such great social consequence.

References

Beamish, T.D., (2002), Silent Spill, MIT Press, Cambridge, MA.
California Public Utilities Commission, (1993), California’s Electric Services Industry: Perspectives on the Past, Strategies for the Future, (“The Yellow Book”), CPUC, Sacramento, CA.
California Public Utilities Commission, (1996), Decision 95-12-063 as Modified by D.96-01-009, (“The Blue Book”), CPUC, Sacramento, CA.
Carroll, J.S., (1998), Organisational Learning Activities in High-Hazard Industries, Journal of Management Studies, Volume 35, pp. 699-717.
Evan, W.M. and Manion, M., (2002), Minding the Machines: Preventing Technological Disasters, Prentice-Hall, Saddle River, NJ.
Langer, E., (1989), Mindfulness, Addison-Wesley, New York.
LaPorte, T. and Consolini, P., (1991), Working In Practice But Not In Theory: Theoretical Challenges of High Reliability Organisations, Journal of Public Administration Research and Theory, Volume 1, pp. 19-47.
LaPorte, T., (1994), A Strawman Speaks Up: Comments on Limits of Safety, Journal of Contingencies and Crisis Management, Volume 2, pp. 207-211.
LaPorte, T., (1996), High Reliability Organisations: Unlikely, Demanding and At Risk, Journal of Contingencies and Crisis Management, Volume 4, pp. 60-71.
Misumi, J., Wilpert, B. and Miller, R., (1999), Nuclear Safety: A Human Factors Perspective, Taylor and Francis, London.
Morgan, G., (1997), Images of Organisation, Sage Publications, New York.
National Research Council, (2002), Making The Nation Safer: The Role of Science and Technology in Countering Terrorism, National Academy Press, Washington, D.C.
Perrow, C., (1979), Complex Organisations: A Critical Essay, Wadsworth, New York.
Perrow, C., (1983), The Organisational Context of Human Factors Engineering, Administrative Science Quarterly, Volume 28, pp. 521-541.
Perrow, C., (1994), Review of S.D. Sagan, The Limits of Safety, Journal of Contingencies and Crisis Management, Volume 2, pp. 212-220.
Perrow, C., (1999), Normal Accidents, Princeton University Press, Princeton.
Reason, J., (1990), Human Error, Cambridge University Press, Cambridge.
Rijpma, J.A., (1997), Complexity, Tight Coupling and Reliability, Journal of Contingencies and Crisis Management, Volume 5, pp. 15-23.
Roberts, K., (1990), Some Characteristics of One Type of High Reliability Organisation, Organisation Science, Volume 1, pp. 160-176.
Roberts, K., (1993), New Challenges To Understanding Organisations, Macmillan, New York.
Rochlin, G.I., (1993), Defining High Reliability Organisations in Practice, in Roberts, K. (ed.), New Challenges To Understanding Organisations, pp. 11-32, Macmillan, New York.
Rochlin, G.I. and Meier, A. von, (1994), Nuclear Power Operations: A Cross-Cultural Perspective, Annual Review of Energy and the Environment, Volume 19, pp. 153-187.
Roe, E.P. and Eeten, M.J.G. van, (2002), Ecology, Engineering and Management: Reconciling Ecological Rehabilitation and Service Reliability, Oxford University Press, New York.
Roe, E., Schulman, P., and Eeten, M.J.G. van, (2003), Real-Time Reliability: Provision of Electricity Under Adverse Performance Conditions Arising from California’s Electricity Restructuring and Crisis, A report prepared for the California Energy Commission, Lawrence Berkeley National Laboratory, and the Electric Power Research Institute, Energy Commission, San Francisco, California.
Sagan, S., (1993), The Limits of Safety, Princeton University Press, Princeton.
Salvendy, G., (1997), Handbook of Human Factors and Ergonomics, Wiley, New York.
Sanne, J. M., (2000), Creating Safety in Air Traffic Control, Arkiv Forlag, Lund, Sweden.
Schulman, P.R., (1993a), The Negotiated Order of Organisational Reliability, Administration and Society, Volume 25, pp. 353-372.
Schulman, P.R., (1993b), The Analysis of High Reliability Organisations: A Comparative Framework, in Roberts, K. (ed.), New Challenges To Understanding Organisations, pp. 33-54, Macmillan, New York.
Schulman, P.R., (2002), Medical Errors: How Reliable Is Reliability Theory?, in Rosenthal, M.M. and Sutcliffe, K.M. (eds.), Medical Error, pp. 200-216, Jossey Bass, San Francisco.
Turner, B.A., (1978), Man-Made Disasters, Wykeham, London.
Vaughan, D., (1996), The Challenger Launch Decision, University of Chicago Press, Chicago.
Wasserman, S., Faust, K. and Iacobucci, D., (1994), Social Network Analysis, Cambridge University Press, New York.
Weick, K.E., (1987), Organisational Culture As A Source of High Reliability, California Management Review, Volume 29, pp. 112-127.
Weick, K. E., (1993), The Vulnerable System: An Analysis of the Tenerife Air Disaster, in Roberts, K., (ed.), New Challenges To Understanding Organisations, pp. 173-197, Macmillan, New York.
Weick, K.E., Sutcliffe, K.M. and Obstfeld, D., (1999), Organizing For High Reliability, Research in Organisational Behaviour, Volume 21, pp. 81-123.
Weick, K.E. and Sutcliffe, K.M., (2001), Managing The Unexpected, Jossey Bass, San Francisco.
Wildavsky, A., (1988), Searching For Safety, Transaction Books, New Brunswick, NJ.

Figure 1: California’s high reliability network for restructured electricity provision

Figure 2: Performance conditions for the focal ISO control room

Network option variety | High system instability | Low system instability
High | Just-in-time performance | Just-in-case performance
Low | Just-for-now performance | Just-this-way performance

Table 1: Features of High Reliability Performance Modes

Feature | Just-in-case | Just-in-time | Just-for-now | Just-this-way
Instability | low | high | high | low
Option variety | high | high | low | low
Principal feature | high redundancy | real-time flexibility | maximum potential for amplified deviance | command & control
Equifinality | maximum equifinality | adaptive equifinality | low equifinality | zero equifinality
Operational risks | risk of inattention & complacency | risk of misjudgement because of time & system constraints | risk of exhausted options & lack of manoeuvrability (most untenable mode) | risk of control failure over what needs to be controlled
Variables of attention | structural variables (e.g., operating reserves) | escalating variables (e.g., cascading accidents) | triggering variables (e.g., a single push over the edge) | control variables (e.g., enforcing load shedding requirements)
Information strategy | vigilant watchfulness | keeping the bubble | localised firefighting | compliance monitoring
Lateral communication | little lateral communication during routine operations | rich lateral communication for complex system operations in real time | lateral communication around focused issues and events | little lateral communication during fixed protocol (closest to command & control)
Rules & procedures | performing according to wide-ranging established rules and procedures | performing in & outside analysis; many situations not covered by procedures | performing reactively, waiting for something to happen | performing to a very specific set of detailed procedures
Orientation toward Area Control Error | having control | keeping control | losing control | forcing command & control
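
Read together, Figure 2 and Table 1 amount to a simple two-by-two classification: the combination of system instability and network option variety determines the performance mode in which the control room is operating. The following minimal sketch encodes that mapping; it is our own illustrative aid, not part of the original study, and the function name and boolean parameters are hypothetical.

```python
# Illustrative sketch only: encodes the Figure 2 / Table 1 classification of
# control-room performance modes. Names are hypothetical, not from the study.

def performance_mode(high_instability: bool, high_option_variety: bool) -> str:
    """Map the two performance conditions to the corresponding mode."""
    if high_option_variety and high_instability:
        return "just-in-time"    # real-time flexibility
    if high_option_variety and not high_instability:
        return "just-in-case"    # high redundancy
    if not high_option_variety and high_instability:
        return "just-for-now"    # most untenable mode
    return "just-this-way"       # command & control


# Example: high instability with little option variety left to the operators.
print(performance_mode(high_instability=True, high_option_variety=False))
# -> just-for-now
```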

Table 2: Comparison of selected features of HRO and HRN management

HRO-Management (HRO research on Diablo Canyon) | HRN-Management (research on California’s electricity HRN)
high technical competence | high technical competence
constant search for improvement | constant short-term search for improvement
often hazard-driven adaptation to ensure safety | often hazard-driven adaptation to ensure safety
often highly complex activities | often highly focused complex activities
reliability is non-fungible | reliability is non-fungible in real time, except when service reliability jeopardises grid reliability
limitations on trial & error learning | real-time operations necessitating improvisation and experimentation
operation within anticipatory analysis | operations outside analysis
flexible authority patterns within the HRO | flexible authority patterns within the focal organisation and across the HRN
positive, design-based redundancy to ensure stability of inputs and outputs | maximum equifinality (positive redundancy), adaptive equifinality (not necessarily designed), and zero equifinality, all depending on network performance conditions

Figure 3: Reliability space and key professional activities

Vertical axis (knowledge base): representational (formal, deductive) at one end and experiential (tacit; often trial and error) at the other. Horizontal axis (scope of reliability focus): system-wide (all cases) at one end and specific event (single case) at the other. DESIGN occupies the representational, system-wide corner; REACTIVE OPERATIONS occupies the experiential, specific-event corner. Contingency scenarios and strategic adaptations lie between these extremes and the centre of the space, which is the domain of the reliability professionals.