Frank Ramsak

Papers by Frank Ramsak

Integrating the UB-Tree into a Database System Kernel

Proceedings of the 26th International Conference on Very Large Data Bases, Sep 10, 2000

Multidimensional access methods have shown high potential for significant performance improvements in various application domains. However, only a few approaches have made their way into commercial products. In commercial database management systems (DBMSs) the B-Tree is still the prevalent indexing technique.

HINTA: A Linearization Algorithm for Physical Clustering of Complex OLAP Hierarchies

DMDW, 2001

Hierarchies are an important means to categorize data stored in OLAP systems. OLAP queries follow the drill/slice/dice paradigm and therefore exhibit navigation patterns that follow the hierarchy of a dimension. In real-world applications, hierarchies are often unbalanced and share levels, resulting in complex hierarchy structures. So far, encoding methods for simply structured hierarchies have been introduced to handle hierarchies efficiently for query processing. In this paper we propose the HINTA algorithm to compute the clustering order for complex hierarchies by linearization. The physical clustering of OLAP data computed by HINTA significantly improves the performance of OLAP queries. HINTA enables clustering of complex hierarchies that can share hierarchy levels in several classifications over one dimension.

The Transbase Hypercube RDBMS: Multidimensional Indexing of Relational Tables

ICDE, 2001

world databases for OLAP. However, we also address general issues of UB-Trees like creation, space requirements, or comparison to other indexing methods.

Physical Data Modeling for Multidimensional Access Methods

Hauptseminar XML-Datenbanken

Multidimensional Mapping and Indexing of XML

We propose a multidimensional approach to store XML data in relational database systems. In contrast to other efforts, we suggest a solution to the problem using established database technology. We present a multidimensional mapping scheme for XML and also thoroughly study the impact of established and commercially available multidimensional index structures (compound B-Trees and UB-Trees) on the performance of the mapping scheme. In addition, we compare our multidimensional mapping to other known mapping schemes. While studying the performance we have identified projection and selection to be fundamental parts of a typical query on XML documents. Our measurements show that projection and selection are orthogonal and require special multidimensional index support to be processed efficiently.

Combining hierarchy encoding and pre-grouping: intelligent grouping in star join processing

Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405), 2003

Improving OLAP performance by multidimensional hierarchical clustering

Proceedings. IDEAS'99. International Database Engineering and Applications Symposium (Cat. No.PR00265), 1999

Data-warehousing applications cope with enormous data sets in the range of gigabytes and terabytes. Queries usually either select a very small set of this data or perform aggregations on a fairly large data set. Materialized views storing pre-computed aggregates are used to efficiently process queries with aggregations. This approach increases resource requirements in disk space and slows down updates because of the view maintenance problem. Multidimensional hierarchical clustering (MHC) of OLAP data overcomes these problems while offering more flexibility for aggregation paths. Clustering is introduced as a way to speed up aggregation queries without additional storage cost for materialization. Performance and storage cost of our access method are investigated and compared to current query processing scenarios. In addition, performance measurements on real-world data for a typical star schema are presented.

Shooting Stars in the Sky

VLDB '02: Proceedings of the 28th International Conference on Very Large Databases, 2002

Shooting Stars in the Sky: An Online Algorithm for Skyline Queries

Very Large Data Bases, 2002

Skyline queries ask for a set of interesting points from a potentially large set of data points. If we are traveling, for instance, a restaurant might be interesting if there is no other restaurant which is nearer, cheaper, and has better food. Skyline queries retrieve all such interesting restaurants so that the user can choose the most promising one.
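The skyline notion in this abstract can be illustrated with a small sketch. It is a naive quadratic filter, not the online algorithm the paper proposes, and it assumes the common dominance definition (no worse in every dimension, strictly better in at least one). The names `dominates`, `skyline`, and the sample `restaurants` tuples (distance, price, food rank, smaller is better) are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere.

    Assumption: smaller values are better in every dimension.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical data: (distance in km, price, food rank with 1 = best).
restaurants = [
    (1.0, 30.0, 2.0),
    (2.0, 20.0, 1.0),
    (3.0, 40.0, 3.0),
    (2.5, 25.0, 1.5),
]

result = skyline(restaurants)
print(result)
```

Here the third and fourth restaurants are dominated by the first and second respectively, so only the first two remain in the skyline.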

Interactive ROLAP on large datasets: a case study with UB-trees

Proceedings 2001 International Database Engineering and Applications Symposium, 2001

Online Analytical Processing (OLAP) requires query response times within the range of a few seconds in order to allow for interactive drilling, slicing, or dicing through an OLAP cube. While small OLAP applications use multidimensional database systems, large OLAP applications like the SAP BW rely on relational (ROLAP) databases for efficient data storage and retrieval. ROLAP databases use specialized data models like star or snowflake schemata for data storage and create a large set of indexes or materialized views in order to answer queries efficiently. In our case study, we show the performance benefits of TransBase HyperCube, a commercial RDBMS whose kernel fully integrates the UB-Tree, a multidimensional extension of the B-Tree. With this newly developed access structure, TransBase HyperCube enables interactive OLAP without the need to store a large set of materialized views or create a large set of indexes. We compare not only the query performance, but also consider index size and maintenance costs. For the case study we use a 42 million record ROLAP database of GfK, the largest German market research company.
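The abstract describes the UB-Tree only as a multidimensional extension of the B-Tree. As background not stated in this listing, the UB-Tree is generally described as storing tuples in a B-Tree ordered by their Z-address, obtained by interleaving the bits of the attribute values. The sketch below shows that bit interleaving only; the function name `z_address`, its parameters, and the bit order are assumptions for illustration, not TransBase's implementation.

```python
from typing import Sequence


def z_address(coords: Sequence[int], bits: int) -> int:
    """Interleave the lowest `bits` bits of each coordinate into one integer.

    Assumption: coordinates are non-negative integers, and dimension 0
    contributes the least significant bit of each interleaved group.
    """
    dims = len(coords)
    z = 0
    for bit in range(bits):
        for dim, value in enumerate(coords):
            # Take bit `bit` of this coordinate and place it in the combined key.
            z |= ((value >> bit) & 1) << (bit * dims + dim)
    return z


# Points that are close in both dimensions tend to get nearby Z-addresses,
# so a one-dimensional index over z_address clusters them together.
cells = sorted(((x, y) for x in range(4) for y in range(4)),
               key=lambda c: z_address(c, bits=2))
print(cells[:4])
```

Sorting a 4x4 grid by this key visits the cells quadrant by quadrant, which is the clustering property a one-dimensional index over the Z-address exploits for multidimensional range queries.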

Transbase®: A leading-edge ROLAP Engine supporting multidimensional Indexing and Hierarchy Clustering

Analysis-oriented database applications, such as data warehousing or customer relationship management, play a crucial role in the database area. In general, the multidimensional data model is used in these applications, realized as star or snowflake schemata in the relational world. The so-called star queries are the prevalent type of queries on such schemata. All database vendors have extended their products to support star queries efficiently. However, mostly reporting queries benefit from the optimizations, like pre-aggregation, while ad-hoc queries usually lack efficient support. We present the DBMS Transbase® in this paper, which provides a new physical organization of the data based on hierarchical clustering and multidimensional clustering combined with multidimensional indexing. In combination with new query optimizations (e.g., hierarchical pre-grouping) significant performance improvements are achieved. The paper describes how the new technology is implemented in the Transbase® product and how it is made available to the user as transparently as possible. The benefits are illustrated with a real-world data warehousing scenario.

Processing Star Queries on Hierarchically-Clustered Fact Tables

VLDB '02: Proceedings of the 28th International Conference on Very Large Databases, 2002

Star queries are the most prevalent kind of queries in data warehousing, OLAP and business intelligence applications. Thus, there is an imperative need for efficiently processing star queries. To this end, a new class of fact table organizations has emerged that exploits path-based surrogate keys in order to hierarchically cluster the fact table data of a star schema [DRSN98, MRB99, KS01]. In the context of these new organizations, star query processing changes radically. In this paper, we present a complete abstract processing plan that captures all the necessary steps in evaluating such queries over hierarchically clustered fact tables. Furthermore, we present optimizations for surrogate key processing and a novel early grouping transformation for grouping on the dimension hierarchies. Our algorithms have already been implemented in a commercial relational database management system (RDBMS), and the experimental evaluation, as well as customer feedback, indicates speedups of orders of magnitude for typical star queries in real-world applications.
