Multivariate data analysis: quo vadis? I. Object-oriented data modelling (OODM)

Multivariate data analysis: quo vadis? II. Levels of data-modelling objectives and possibilities

Journal of Chemometrics, 2003

Industry and academe are characterized by steadily increasing huge amounts of data with very different data structures. Both static and dynamic data contexts need to be addressed. A new generic, flexible and comprehensive general data-modelling concept is needed to cope with these demands. During the past 20 years, object-oriented programming (OOP) has become a de facto industry standard of how programming tasks should be defined and carried out in the context of deterministic data modelling. We present here a first framework of analogous ideas for multivariate data analysis. A new strategy, object-oriented data modelling (OODM), is proposed which is invariant with respect to the specific data structures and the practical data context. We present a first delineation of metaprinciples, ideas and stimulants for tomorrow's possible development paths of modelling, in which the fundamental data analysis unit is the generalized 'PLS object' in the OOP sense. The key novel aspect concerns inter-object information transfer, facilitated by 'root-sum-of-squares averaging' (RSSA), which uses w loading weights as between-object transfer agents. These features allow a powerful generalization beyond multiblock as well as hierarchical bilinear modelling to be laid out. The present part I outlines a first framework for the new data-modelling approach, while part II forms a complementing catalogue of specific options and possibilities when implementing the new principles.

Extending the UML for Multidimensional Modeling

«UML» 2002—The Unified Modeling …, 2002

Multidimensional (MD) modeling is the foundation of data warehouses, MD databases, and OLAP applications. In recent years, there have been some proposals to represent MD properties at the conceptual level. Nevertheless, none of them considers all multidimensional properties at both the structural and dynamic levels. In this paper, we present an object-oriented (OO) approach to accomplish MD modeling at the conceptual level. This approach defines an extension by means of stereotypes to the Unified Modeling Language (UML) for MD modeling. The extension uses the Object Constraint Language (OCL) for expressing well-formedness rules of the newly defined elements. The advantages of our proposal are twofold: on the one hand, this extension allows us to represent MD models with the UML, allowing us to specify the whole system in a uniform way; on the other hand, an OO approach can elegantly consider the main MD properties at the conceptual level. Finally, we show how to use these stereotypes in Rational Rose 2000 for MD modeling.

Advances in Object-Oriented Data Modeling

Advances in Object-Oriented Data Modeling, 2000

Object-oriented modeling has become the prime methodology for modern software design. Not since the conception of Structured Programming has a new software technology had a similar impact. Today many textbooks, professional guides, and CASE tools support object-oriented software design. However, object-oriented data modeling has not kept pace, and the papers in this volume illustrate a range of issues that are still being dealt with. Object-orientation in software creation is simpler than object-oriented data modeling, because a specific program represents one approach to a solution, and hence one point of view. Data are commonly shared, and participants can hence approach the modeling from multiple points of view. For instance, early relational systems implicitly supported multiple points of view, since they only provided the simple semantics of isolated tables (3). The relational model complements the simple storage structure with algebraic manipulation of these structures. Moving to a calculus allowed automation in processing of "what" queries rather than following programmatic "how" instructions. Having an algebra also enabled the optimizations that were required. Alternate expressions over the tables define alternate views, which are mutually independent. Even now, relational processing capabilities remain weak. The relational SQL language has mainly one verb: "SELECT". UPDATES are severely restricted to the full database, since views, essential to understand subsets of complex data structures, cannot be updated in general. To assure consistency among views there has to be more, namely a shared model. Entity-Relationship models provided quantitative structural semantics (2), but, until recently, this information remained in the design phase, and at most provided documentation for subsequent program creation.
A formalization of the Entity-Relationship model, allowing matching of the relational transfers, the Structural Model (5), did not have a significant impact, since data modeling remained informal until objects started to emerge as first-class data structures (1). Subsequent additions to relational systems provide the specification of integrity constraints, and these will limit the structural choices. For instance, combining uniqueness and a reference constraint will assure conformance to a 1:n relationship among two tables. Providing constraints is important for consistency and sharability. Still, the methods used to manage conformance remain outside of this model, so that ...
... recent findings in the topic covered, as well as directions for future research and development. This book is unique in that it takes a unified view of different techniques and developments in the area of object-oriented data modeling and reports on recent work that can only be found scattered throughout the literature. This book is useful for researchers, software professionals, and advanced students who are working, or intending to work, in the area of object-oriented modeling. Some familiarity with object-oriented programming languages and database systems is required. The reader will learn a variety of ways of applying the object-oriented paradigm in the context of data modeling. This book has a dual purpose. It can be used in advanced courses on object-oriented data modeling or object-oriented software development focused around database systems. Furthermore, it represents a valuable source of information for software engineers, developers, and project managers who wish to familiarize themselves with object-oriented data modeling techniques and methodologies and put some of the material covered in this book into practice.
Modeling of Reverse Engineering Applications
Although the interest in object-oriented databases is growing, a major limitation on their acceptance in the corporate world is the amount of time and money invested in existing databases using the older data models ("legacy systems"). Obviously, the huge undertaking needed to convert from one database paradigm to another is a major expense that few corporations are willing to readily accept. What is needed are tools that allow corporations to generate the conceptual schemata and reveal the hidden semantics of current database applications efficiently and with limited user involvement. This process is known as database "reverse engineering" (or reengineering). Reverse engineering can be defined as a process of discovering how a database system works. Whether using reverse engineering to migrate between different database paradigms (from hierarchical to relational, relational to object-oriented), elucidating undocumented systems, or using it to forward engineer existing systems, reverse engineering involves a wide collection of tasks. The pivotal feature of all these tasks is the need to identify all the components of existing database systems and the relationships between them. This part of the book describes advanced modeling techniques for reengineering legacy database applications. The contribution of these techniques relies not only on proposed (reengineering) methodologies but also on their use in real environments. Two main approaches for reverse engineering are described. The first approach, by Missaoui, Goding, and Gagnon, presents a complete methodology for mapping conceptual schemata into structurally object-oriented schemata. The main advantage of such a methodology is the use of an adapted clustering technique allowing recursive grouping of objects (e.g., entities and relationships) from an extended entity-relationship schema.
... any time until the car is picked up. This process requires coordination between the relevant Branch Manager and the Depot Manager, to ensure the Service Diary and the Car Bookings file are in step. When a service has been completed, a description of the work done and the parts and labor cost are added to the car service history, and the parts and labor cost to the service diary.
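
The foreword's remark that combining uniqueness and a reference constraint assures conformance to a 1:n relationship can be illustrated with a minimal sketch. The table and column names (`branch`, `car`, `branch_id`) are hypothetical, loosely borrowed from the car-rental fragment above, and SQLite via Python's standard `sqlite3` module is an assumed stand-in for any relational system; the book itself does not prescribe this schema.

```python
import sqlite3

def build_schema(conn):
    # SQLite only enforces reference constraints when this pragma is on.
    conn.execute("PRAGMA foreign_keys = ON")
    # Uniqueness: each branch_id identifies exactly one branch (the "1" side).
    conn.execute(
        "CREATE TABLE branch (branch_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    # Reference: every car points at exactly one existing branch (the "n" side).
    conn.execute(
        """CREATE TABLE car (
               car_id INTEGER PRIMARY KEY,
               branch_id INTEGER NOT NULL REFERENCES branch(branch_id)
           )"""
    )

def try_insert(conn, sql, params):
    # Returns True if the row was accepted, False if a constraint rejected it.
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
build_schema(conn)

ok_branch = try_insert(conn, "INSERT INTO branch VALUES (?, ?)", (1, "Central"))
ok_car_a = try_insert(conn, "INSERT INTO car VALUES (?, ?)", (10, 1))
ok_car_b = try_insert(conn, "INSERT INTO car VALUES (?, ?)", (11, 1))  # many cars, one branch
dup_branch = try_insert(conn, "INSERT INTO branch VALUES (?, ?)", (1, "Duplicate"))
orphan_car = try_insert(conn, "INSERT INTO car VALUES (?, ?)", (12, 99))  # no such branch
```

In this sketch the two cars sharing one branch are accepted, while the duplicate branch key and the car referencing a non-existent branch are both rejected, which is what keeps the relationship 1:n.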

Understanding analysis dimensions in a multidimensional object-oriented model

2001

OLAP defines a set of data warehousing query tools characterized by providing a multidimensional view of data. Information can be shown at different aggregation levels (often called granularities) for each dimension. In this paper, we try to outline the benefits of understanding the relationships between those aggregation levels as Part-Whole relationships, and how it helps to address some semantic problems. Moreover, we propose the usage of other Object-Oriented constructs to keep as much semantics as possible in analysis dimensions.
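
The idea of treating aggregation levels as Part-Whole relationships can be sketched roughly as follows. This is not the paper's model: the `PART_OF` mapping, the `roll_up` helper, and the sales figures are all hypothetical, and the sketch assumes a simple additive measure and a strict hierarchy in which every member belongs to exactly one whole.

```python
from collections import defaultdict

# Hypothetical time dimension: each day is part of a month,
# and each month is part of a year.
PART_OF = {
    "2001-01-15": "2001-01",
    "2001-01-20": "2001-01",
    "2001-02-03": "2001-02",
    "2001-01": "2001",
    "2001-02": "2001",
}

def roll_up(measures, part_of):
    """Aggregate an additive measure from the parts to their wholes."""
    totals = defaultdict(float)
    for member, value in measures.items():
        totals[part_of[member]] += value
    return dict(totals)

# Illustrative data only.
daily_sales = {"2001-01-15": 100.0, "2001-01-20": 50.0, "2001-02-03": 25.0}
monthly_sales = roll_up(daily_sales, PART_OF)    # day level -> month level
yearly_sales = roll_up(monthly_sales, PART_OF)   # month level -> year level
```

Because each level is reached only through the part-whole mapping, the same `roll_up` step serves every granularity; non-additive measures or members with several parents would need extra care that this sketch does not attempt.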

A software Infrastructure for Multidimensional data Analysis: A Data Modelling Aspect

Rapid changes in technology lead to an increased variety of data sources. These varied data sources generate data in large volumes and at extremely high speed. Accommodating and using this data in decision-making systems is a big challenge. To make the fullest use of the valuable data generated by different systems, the number of target users of analysis systems needs to be increased. In general, the knowledge discovery process using the available tools requires considerable expertise in the domain as well as in the technology. The project ITDA (Integrated Tool for Data Analysis) aims to provide a complete platform for multidimensional data analysis to enhance the decision-making process in every domain. This project provides all the techniques required to perform multidimensional data analysis and avoids the overheads incurred by the traditional cube architecture followed by most analytics systems. Modelling the available data in multidimensional form is the basis and crucial step for multidimensional analysis. This work describes the multidimensional modelling aspect and its implementation using the ITDA project.

A Comprehensive Framework on Multidimensional Modeling

Lecture Notes in Computer Science, 2011

In this paper we discuss what current multidimensional design approaches provide and what their major flaws are. Our contribution lies in a comprehensive framework that does not focus on how these approaches work but on what they provide for usage in real data warehouse projects. Thus, we do not aim at comparing current approaches but set up a framework (based on four criteria: the role played by end-user requirements and data sources, the degree of automation achieved and the quality of the output produced) highlighting their drawbacks and the need for further research in this area.

Design of a Multidimensional Model Using Object Oriented Features in UML

IARS International Research Journal

A data warehouse is a single repository of data which includes data generated from various operational systems. Conceptual modeling is an important concept in the successful design of a data warehouse. The Unified Modeling Language (UML) has become a standard for object modeling during the analysis and design steps of software system development. The paper proposes an object-oriented approach to model the process of data warehouse design. The hierarchies of each data element can be explicitly defined, thus highlighting the data granularity. We propose a UML multidimensional model using various data sources based on UML schemas. We present a conceptual-level integration framework over diverse UML data sources on which OLAP operations can be performed. Our integration framework takes into account the benefits of UML (its concepts, relationships and extended features), which is closer to the real world and can model even complex problems easily and accurately. Two steps are involved i...

Extending the Multidimensional Data Model to Handle Complex Data

Journal of Computing Science and Engineering

Data Warehousing and OLAP (On-Line Analytical Processing) have turned into the key technology for comprehensive data analysis. Originally developed for the needs of decision support in business, data warehouses have proven to be an adequate solution for a variety of non-business applications and domains, such as government, research, and medicine. The analytical power of the OLAP technology comes from its underlying multidimensional data model, which allows users to see data from different perspectives. However, this model displays a number of deficiencies when applied to non-conventional scenarios and analysis tasks. This paper presents an attempt to systematically summarize various extensions of the original multidimensional data model that have been proposed by researchers and practitioners in recent years. Presented concepts are arranged into a formal classification consisting of fact types, factual and fact-dimensional relationships, and dimension types, supplied with explanatory examples from real-world usage scenarios. The classification covers both the static elements of the model, such as types of fact and dimension hierarchy schemes, and dynamic features, such as support for advanced operators and derived elements. We also propose a semantically rich graphical notation called X-DFM that extends the popular Dimensional Fact Model by refining and modifying the set of constructs so as to make it coherent with the formal model. An evaluation of our framework against a set of common modeling requirements summarizes the contribution.