Michael Akinde - Academia.edu

Papers by Michael Akinde

Maintenance and Computation of Complex Aggregate Views: Universal Unnesting of Expressions with Subqueries

The MD-join: An operator for complex OLAP

Data Engineering, …, Jan 1, 2001

OLAP queries (i.e., group-by or cube-by queries with aggregation) have proven to be valuable for data analysis and exploration. Many decision support applications need very complex OLAP queries, requiring a fine degree of control over both the group definition and the aggregates that are computed. For example, suppose that the user has access to a data cube whose measure attribute is Sum(Sales). Then the user might wish to compute the sum of sales in New York and the sum of sales in California for those data cube entries in which Sum(Sales) > $1,000,000.
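The example query in this abstract can be illustrated with a small Python sketch. It is not the MD-join operator itself, only a plain two-pass aggregation that produces the same kind of answer. The grouping attribute (`product`), the row layout, the sample values, and the function name `state_sums_for_large_groups` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical detail rows: (product, state, sales). Values are invented.
SALES = [
    ("widget", "NY", 700_000),
    ("widget", "CA", 400_000),
    ("widget", "TX", 100_000),
    ("gadget", "NY", 300_000),
    ("gadget", "CA", 200_000),
]


def state_sums_for_large_groups(rows, threshold=1_000_000):
    """For each group whose total sales exceed `threshold`,
    report the total plus the New York and California sums."""
    totals = defaultdict(int)                         # Sum(Sales) per group
    by_state = defaultdict(lambda: defaultdict(int))  # per-group, per-state sums
    for product, state, sales in rows:
        totals[product] += sales
        by_state[product][state] += sales
    return {
        product: {
            "total": total,
            "ny": by_state[product]["NY"],
            "ca": by_state[product]["CA"],
        }
        for product, total in totals.items()
        if total > threshold  # keep only entries with Sum(Sales) > threshold
    }


RESULT = state_sums_for_large_groups(SALES)
```

With the sample rows, only the `widget` group passes the $1,000,000 filter, so the result holds a single entry with its New York and California sums.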

Constructing GPSJ view graphs

Proceedings of the International Workshop on …, Jan 1, 1999

A data warehouse collects and maintains integrated information from heterogeneous data sources for OLAP and decision support. An important task in data warehouse design is the selection of views to materialize, in order to minimize the response time and maintenance cost of generalized project-select-join (GPSJ) queries.

Generalized MD-joins: Evaluation and reduction to SQL

Databases in Telecommunications II, Jan 1, 2001

W. Jonker (Ed.): Databases in Telecommunications II, LNCS 2209, pp. 52-67, 2001. © Springer-Verlag Berlin Heidelberg 2001. Generalized MD-Joins: Evaluation and Reduction to SQL. Michael O. Akinde and Michael H. Böhlen ...

Efficient computation of subqueries in complex OLAP

Data Engineering, 2003. Proceedings …, Jan 1, 2003

Expressing complex OLAP queries using group-by, aggregation, and joins can be extremely difficult. As a result, database researchers have developed many alternative ways of expressing such queries. Nested query expressions (subqueries in SQL) are a natural part of these techniques. Recent work has demonstrated how any nested query expression can be rewritten using algebraic operators. However, the solutions have focused on join/outer-join computations, which are not efficient in an OLAP context where huge fact tables are present.

Efficient OLAP query processing in distributed data warehouses

Information Systems, Jan 1, 2003

The success of Internet applications has led to an explosive growth in the demand for bandwidth from Internet Service Providers. Managing an Internet protocol network requires collecting and analyzing network data, such as flow-level traffic statistics. Such analyses can typically be expressed as OLAP queries, e.g., correlated aggregate queries and data cubes. Current OLAP tools for this task assume the availability of the data in a centralized data warehouse. However, the inherently distributed nature of data collection and the huge amount of data extracted at each collection point make it impractical to gather all data at a centralized site. One solution is to maintain a distributed data warehouse, consisting of local data warehouses at each collection point and a coordinator site, with most of the processing being performed at the local sites. In this paper, we consider the problem of efficient evaluation of OLAP queries over a distributed data warehouse. We have developed the Skalla system for this task. Skalla translates OLAP queries, specified as certain algebraic expressions, into distributed evaluation plans which are shipped to individual sites. A salient property of our approach is that only partial results are shipped, never parts of the detail data. We propose a variety of optimizations to minimize both the synchronization traffic and the local processing done at each site. We finally present an experimental study based on TPC-R data. Our results demonstrate the scalability of our techniques and quantify the performance benefits of the optimization techniques that have gone into the Skalla system.
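The idea of shipping only partial results, never detail data, can be sketched in Python as follows. This is a generic partial-aggregation pattern under stated assumptions, not Skalla's evaluation plans or its optimizations: each local site reduces its rows to per-key (sum, count) pairs, and a coordinator merges those pairs. The function names, the flow keys, and the byte counts are hypothetical.

```python
from collections import defaultdict


def local_partial(rows):
    """Runs at a local site: reduce detail rows (key, value)
    to per-key (sum, count) pairs. Only these pairs leave the site."""
    partial = defaultdict(lambda: [0, 0])
    for key, value in rows:
        partial[key][0] += value
        partial[key][1] += 1
    return {key: tuple(acc) for key, acc in partial.items()}


def merge_partials(partials):
    """Runs at the coordinator: combine partial results from all sites
    into final sum, count, and average per key."""
    merged = defaultdict(lambda: [0, 0])
    for partial in partials:
        for key, (total, count) in partial.items():
            merged[key][0] += total
            merged[key][1] += count
    return {
        key: {"sum": total, "count": count, "avg": total / count}
        for key, (total, count) in merged.items()
    }


# Hypothetical flow-level records (source address, bytes) held at two sites.
SITE_A = [("10.0.0.1", 100), ("10.0.0.2", 50)]
SITE_B = [("10.0.0.1", 300)]

MERGED = merge_partials([local_partial(SITE_A), local_partial(SITE_B)])
```

Carrying (sum, count) instead of a local average is what makes the merge exact: averages cannot be combined directly, but sums and counts can.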
