The Matrix: Revisited - DecisionWorks Consulting (original) (raw)

Print Friendly, PDF & Email

With the current industry buzz focused on master data management (MDM), it’s time to revisit one of the most critical elements of the Kimball method. Back in 1999, Ralph Kimball wrote an Intelligent Enterprise column entitled The Matrix. The 1999 movie of the same name spawned two sequels, but we haven’t devoted a column to our matrix in more than six years.

Dimensional modelers strive to deliver information in a way that’s easily understood. The same objectives apply when representing an organization’s breadth of performance information and associated descriptive reference data. That sounds like a formidable task, but the tabular row-and-column orientation of a matrix lends itself beautifully to the challenge. The enterprise data warehouse bus matrix is akin to a Swiss Army pocket knife for data warehouse and business intelligence professionals; it’s one tool that serves multiple purposes, including architecture planning, data integration coordination and organizational communication.

Matrix Columns for Reference Data

Delving into bus matrix fundamentals, let’s start with the matrix columns, which address the demands of master data management and data integration head-on. Each column of the bus matrix corresponds to a natural grouping of standardized, descriptive reference data. In Kimball parlance, the matrix’s columns are conformed dimensions containing the textual attributes used for filtering, constraining, grouping or labeling. Each attribute has an agreed-upon name, definition and domain values to ensure consistent data presentation, interpretation and content. The bus matrix includes separate columns to identify the “who, what, where, when, why and how,” such as date, customer, product and employee associated with each business event or transactional activity.

Much is being said and written lately about the importance and value of master data management and data integration. We wholeheartedly agree; the Kimball Group has been talking about conformed dimensions since our first use of the terminology in 1984. We’re thrilled others are jumping on the soapbox and embracing the concepts. It’s simply unacceptable to build separate data stores (warehouses, marts, cubes and so on) without a framework to tie the data together. Reusable conformed dimensions supply potent integration glue, letting businesses drill across core processes with consistent, unified views.

Data Stewardship Required

Unfortunately, you can’t purchase a wonder product to create conformed dimensions and miraculously solve your organization’s master data management issues. Defining master conformed dimensions to be used across the enterprise is a cultural and geopolitical challenge. Technology can facilitate and enable data integration, but it doesn’t fix the problem. Data stewardship must be a key component of your solution.

In our experience, the most effective data stewards come from the business community. As with technology, the data warehouse team facilitates and enables stewardship by identifying problems and opportunities and then implementing the agreed-upon decisions to create, maintain and distribute “gold standard” dimensions. But the subject-matter experts in the business are the ones rationalizing the diverse business perspectives and driving to common reference data. To reach a consensus, senior business and IT executives must openly promote and support the stewardship process and its outcomes, including the inevitable compromises.

Over the years many have criticized the concept of conformed dimensions as being “too hard.” Yes, it’s difficult to get people in different corners of the business to agree on common attribute names, definitions and values, but that’s the crux of unified, integrated data. If everyone demands their own labels and business rules, then there’s no chance of delivering the single version of the truth DW promises.

Process-Centric Rows

While the matrix columns refer to the business’s nouns, the matrix rows are typically expressed as verbs. Each row of the bus matrix corresponds to a business process within the organization. A business process is an activity that the business performs, such as taking orders, shipping, invoicing, receiving payments and handling service calls. In most cases, measurements or metrics are generated each time one of these actions or events occurs. When the order is taken, the source system captures the order quantities and amounts. At shipping and customer invoicing, you again deal with quantities and amounts, albeit different from the order metrics. Each customer payment has an amount associated with it. Finally, as the business receives service calls from customers, metrics such as call duration are captured.

Each business process is typically supported by an operational system, which can introduce complexity. Some of you may deal with large, monolithic source systems that support a handful of business processes; conversely, others may have several order source systems in their environments. Insert a row in the matrix for each business process that collects or generates unique performance metrics with unique dimensionality.

After listing the core business process rows, you might also identify more complex cross-process or consolidated rows. These consolidated rows can be extremely beneficial analytically, but they are typically much more difficult to implement given the need to combine and potentially allocate performance metrics from multiple source systems; they should be tackled after the underlying processes have been built.

Associate Matrix Columns and Rows

Once you’ve determined several dozen core processes and dimensions, shade the matrix cells to indicate which columns are related to each row. Presto! You see the logical relationships and complex interplay between the organization’s conformed reference dimensions and key business processes. By looking across a row, you quickly understand its dimensionality. Looking down a column gives immediate feedback about conformed dimension opportunities and obstacles, visually highlighting dimensions that warrant special attention given their participation in multiple matrix rows.

The number of data warehouse bus matrix rows and columns varies by organization. For many, the matrix is surprisingly square, with approximately 25 to 40 rows and a comparable number of columns. However, there are some interesting industry-specific anomalies, such as in insurance and health care, where matrices typically have more columns than rows.

It’s relatively straightforward to lay out the matrix’s rows and columns, and you’re essentially defining the overall data architecture in the process. The matrix delivers the big-picture perspective, regardless of database or technology platform preferences, while also identifying reasonably manageable development efforts. Separate development teams can work on components of the matrix fairly independently, with confidence that the puzzle pieces will fit together.

The matrix is a succinct and effective communication tool. It lets you visually convey the entire plan within and across development teams, as well as upward and outward throughout the organization, including directly with senior IT and business management. The matrix isn’t intended to be a static document that’s created and then gathers dust. It will evolve as you delve more deeply into the requirements of the business and realities of your operational source systems.

Common Matrix Mishaps

When drafting a bus matrix, people sometimes struggle with the level of detail expressed by each column or row. Row mishaps commonly fall into the following two categories:

When defining the matrix columns, architects naturally fall into the similar traps of defining columns that are either too broad or narrow:

Matrix Extensions

One of the beauties of the bus matrix is its simplicity. You can reuse the familiar tabular format to convey other DW/BI “relationships.” These extensions aren’t a substitute for the enterprise DW bus matrix, but are intended as complementary opportunities to reuse the framework.