Datawarehouse Research Papers - Academia.edu
Developing a data warehouse is costly in terms of both resources and investment. The cost can be
recurring or non-recurring, depending on the requirements of the organization as a whole. Because a fixed
cost is associated with the warehouse, it is expected to deliver benefits to the business. This paper discusses
the need for a warehouse and its potential benefits by studying the data warehouse process, since each data
warehouse process changes as the organizational need changes. We identify the success factors of a warehouse and quantify them to define ROI.
- by Thomas Djotio
- Semantic Web, Datawarehouse, Sid, GIS
Given the high cost and high failure rate of data warehousing projects, it is imperative to study the software processes followed in data warehousing. In this paper we present a literature survey on data warehouse requirement gathering and testing. We analyze the drawbacks of traditional techniques for requirement gathering and testing of data warehouses and report areas where more research is needed. Using a text analytics technique called "word cloud", we analyze the main areas being researched and show the areas that need more focus. This paper can give direction to future research on data warehouse requirement gathering and testing.
The book "Information Management" introduces the basics of information, database design and modelling, and addresses issues in information governance and integration. It provides a first study of core relational database design and modelling. It also gives broad coverage of creating, maintaining and evaluating the performance of big data environments such as master data management and data warehouses.
- by P Krishna Sankar and +1
- Information Management, Databases, NoSQL, Hadoop
- by Edwin Cardoso
- Ecommerce, Mis, TIC, ERP
The objective of this paper is to identify a warehouse model for building an analytical framework and to analyze the important parameters that directly affect changes in the share market. We identify parameters that represent different viewing windows and perspectives on stock market performance and movement trends. We categorize and define many intrinsic as well as external factors that may affect the stock market as a whole. Sensex and Nifty are used as the pulse of the Indian stock market. We focus on defining a suitable OLAP model that can accommodate all the parameters that affect the share market. We also identify different applications of this analytical
model for forecasting information to support decision making.
A data warehouse is an integrated set of data, derived mainly from operational data, for use in decision-making strategy and business intelligence through OLAP techniques. Multidimensional data warehouses are mostly created manually, but this is a very complex and time-consuming process that carries a risk of failure. In addition, a set of complex mappings must be performed. Despite this, no notable effort has been made to find a practical, structured solution to the issue. To overcome this, the proposed method presents a new strategy to automate the multidimensional design of data warehouses; the algorithm uses the enterprise schema of an operational database as the starting point for designing the data warehouse schema. As a result, the user only has to choose the candidate schema that meets the system requirements.
- by Reyan M. Zein and +1
- Datawarehouse
The Indonesian Red Cross (Palang Merah Indonesia) is a government-recognized institution engaged in social and humanitarian activities, with a specific duty to provide blood transfusion services. The North Sulawesi regional chapter, which is part of the Indonesian Red Cross association, also faces various problems in its blood transfusion activities, among them problems in the decision-making process for determining strategic plans. In this study the authors design a Business Intelligence and Data Warehouse system intended to support the North Sulawesi chapter in decision making. The result is a system whose displays present information as tables and charts of blood donation transactions and blood requests in North Sulawesi Province. The system is meant to help management analyze and prepare a strategic plan that can improve the quality of work in blood donation activities throughout the province. Business Intelligence itself is a series of activities for representing and analyzing information that has been collected and processed in a data warehouse so that it can be used in decision making as well as possible.
- by Mar Gar
- Datawarehouse, Purchasing, Logistics
This research arises from the existing collaboration between the Universidad de las Ciencias Informáticas and the Oficina Nacional de Estadísticas (National Statistics Office). The latter is the governing body for statistics in Cuba and is responsible for managing the main indicators in various areas of the country, including agriculture, livestock and forestry. At present, the tools used to collect and manage information at the Oficina Nacional de Estadísticas have deficiencies that negatively affect the statistical data. The main objective of this diploma thesis is to develop a data mart for the agriculture, livestock and forestry area of the Sistema de Información de Gobierno (Government Information System) that contributes to homogeneous storage of the data and enables adequate analysis of the information. To this end, the methodologies, tools and technologies to be used in developing data warehouses were characterized, and a data mart for the area was analyzed and designed. Based on the analysis and design elements, the implementation was carried out, resulting in a data mart that meets the information requirements requested by the specialists of the Oficina Nacional de Estadísticas. Finally, the implemented data mart was validated by applying the V-Model, checklists and the designed test cases.
This project implements a data mart, a technology that is part of business intelligence, applying the agile BEAM* (Business Event Analysis & Modeling) methodology to the data mart design. The source data will be obtained from the database of the MOODLE virtual academic platform of the Universidad Tecnológica de Panamá (UTP), with the aim of optimizing the extraction and transformation of data into relevant and useful information that supports strategic decision making by UTP's management.
- by Horacio Kuna
- OLAP, Datawarehouse
This article deals with bibliomining, which is data mining applied to the large volumes of data available in libraries as a result of operating the main transactional systems, such as loans, reference and acquisitions, among others. Bibliomining is thus the process whose purpose is to discover, extract and store relevant information from the large databases that exist in libraries, using programs that search for and identify patterns and global relationships, trends, deviations and other indicators that can be extracted with various data mining techniques. It is important to note that bibliometric analysis requires the participation of interdisciplinary teams made up of systems engineers, statisticians and librarians.
Information technology is advancing rapidly, in step with an ever-growing need for information. IT consultants help many businesses meet the needs of particular business processes and make them easier in every respect: networking, record keeping or bookkeeping, data management, reporting, infrastructure and so on. This study aims to offer a multidimensional data warehouse model that can ultimately serve as a decision-support tool. The modeling identifies business processes from both the project side and the human resources side, and then builds the dimensions and the types of fact calculations to be used for each subject. The study describes techniques for building a multidimensional data model, a data warehouse and business intelligence, starting with identifying the business processes in focus, continuing with multidimensional modeling, and ending with the design of the visualization. The whole process succeeds in turning the visualization of the relationships between dimensions and facts into a tool that supports decision making in an IT consulting company and that can be viewed across various dimensions and categories.
The library website library.umn.ac.id is owned by Universitas Multimedia Nusantara. The usability of the website, especially of its user interface, has never been examined. The evaluation applies usability testing methods: users are observed while using the website's interface, and data is then collected and analyzed. In addition, data is collected with a questionnaire based on the System Usability Scale (SUS) to measure users' satisfaction with the system. The data is analyzed using a usability test based on the theory of Jakob Nielsen (2003), which comprises five components: learnability, efficiency, memorability, errors and satisfaction. Based on the results, some new user interfaces are recommended to improve on the old version of the library website. According to the students' responses, the user interface prototype is easier to use and easier to understand than the old version.
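As an illustration of how a SUS questionnaire is usually scored, the sketch below applies the standard scoring rule (odd-numbered items contribute the answer minus 1, even-numbered items contribute 5 minus the answer, and the sum is multiplied by 2.5). The function name and the sample answers are hypothetical and are not taken from the study.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for one respondent's ten answers (each 1-5)."""
    if len(responses) != 10 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("SUS needs exactly ten answers on a 1-5 scale")
    total = 0
    for item_number, answer in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd-numbered items are positively worded: higher is better.
            total += answer - 1
        else:
            # Even-numbered items are negatively worded: lower is better.
            total += 5 - answer
    return total * 2.5


# Hypothetical respondent, for illustration only.
answers = [4, 2, 5, 1, 4, 2, 4, 2, 5, 1]
print(sus_score(answers))  # 85.0
```

Scores from several respondents are typically averaged to obtain one figure per interface version.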
The volume and complexity of data collected in data warehouse systems are growing rapidly, which poses challenges to traditional data warehouse platforms. At the same time, the Hadoop ecosystem has opened new avenues for implementing data warehouse systems on Hadoop and overcoming these challenges. In this paper we survey previous studies on the limitations of traditional data warehouse platforms and discuss the opportunities Hadoop offers for data warehouse implementation. This paper can give direction to future research on data warehouse implementation on the Hadoop platform.
Data warehouses (DW) play a decisive role in providing analytical information for decision making. Multidimensional modeling is a special approach to modeling data and is considered the foundation for building data warehouses. With the explosive growth in recent years in the amount of heterogeneous data (most of it external to the organization), the DW has been affected by the need to interoperate and to deal with the complexity of new kinds of information sources, such as big data, data lakes and cognitive computing platforms, and the need to improve the semantic expressiveness of the DW has become evident. Research has shown that ontological theories can play a fundamental role in improving the quality of conceptual models, reinforcing their potential to support semantic interoperability in its various manifestations. In this paper we propose applying ontological patterns, grounded in the Unified Foundational Ontology (UFO), to conceptual modeling in multidimensional models, in order to improve the semantic expressiveness of the models used to represent analytical data in a DW.
Databases emerged in the 1960s to ease the difficulties of
designing, building and maintaining information systems.
Since then, databases have been used to store data and
information and have solved the problem of saving them
safely. The dramatic increase in government and company
transactions has been matched by an increase in their
databases, data storage and the queries used to retrieve
data from the database. These organizations use
information processing systems to store their everyday
activities. However, such systems rely on online
transaction processing (OLTP) in the database, which is
not easily accessible to government and company users.
Moreover, relational databases were not designed to
support a multidimensional view. The need for a
multidimensional view, Online Analytical Processing
(OLAP) and less time-consuming report generation leads
to the concept of a data warehouse. This study converts a
database into a data warehouse based on a star schema
structure, using several software and hardware tools and
techniques. We investigate how the star schema gives fast
query responses and better performance. Data in a star
schema structure can be viewed and analyzed in a
multidimensional view and can be used for Online
Analytical Processing.
- by Mohammed Mohammed and +1
- Databases, Datawarehouse
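To make the star schema idea in the abstract above concrete, here is a minimal sketch using Python's built-in sqlite3 module. One fact table references two dimension tables, and a single join-and-group query rolls the facts up along both dimensions. The table names, column names and sample rows (fact_sales, dim_date, dim_product) are illustrative assumptions and are not taken from the study.

```python
import sqlite3

# In-memory database holding a tiny, hypothetical star schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE fact_sales  (
    date_id    INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    amount     REAL
);
INSERT INTO dim_date    VALUES (1, 2023, 1), (2, 2023, 2), (3, 2024, 1);
INSERT INTO dim_product VALUES (10, 'books'), (20, 'games');
INSERT INTO fact_sales  VALUES (1, 10, 100.0), (1, 20, 50.0),
                               (2, 10, 70.0),  (3, 20, 30.0);
""")


def sales_by_year_and_category(connection):
    """Roll the fact table up to (year, category) through the two dimensions."""
    return connection.execute("""
        SELECT d.year, p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_date    AS d ON d.date_id = f.date_id
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY d.year, p.category
        ORDER BY d.year, p.category
    """).fetchall()


for row in sales_by_year_and_category(conn):
    print(row)
```

Each dimension is one join away from the fact table, which is what keeps analytical queries of this shape short.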
As permanent storage for business process transactions, a database is a crucial and necessary part of the system. The way a database is used often does not match its abilities and functionality. Even if, as theory says, a transactional database can also be used for analysis, and leaving aside the advantages and disadvantages, separating the transactional database from the database for decision making makes the most of the database's abilities and power. In addition, daily transactions increase the database size month by month and year by year and degrade performance, especially for daily customer services. Separating the transactional database from the decision-making database reduces the number of connections to the daily transactional database, increases the throughput of daily transactions run by the application, and in turn increases customer satisfaction. Moreover, producing strategic reports for decision making no longer becomes a nightmare or a neglected task. The difference in storage efficiency (bytes of data stored) and in the query speed of the SQL statements used to produce decision-making reports is used as the approach for justification.
Keywords: Data Warehouse (DW), Materialized Views (MV), Relational Database Management System (RDBMS), Unified Modeling Language (UML), On-Line Analytical Processing (OLAP). ABSTRACT: OLAP operations are very useful for applications that query different datasets; the operators aggregate data at various levels, which can be used efficiently for data presentation in a data warehouse environment. Materialized views are another important issue for DW and OLAP operations, as they are useful for fast query processing. Updating a materialized view in response to changes is a great challenge in data warehousing. Efficient maintenance of materialized views with OLAP operators also remains a challenge in the field of RDBMS as well as data warehousing. In this paper, we discuss the capabilities of OLAP operators, materialized view maintenance, achieving consistency between the MV and the DW, and view maintenance actions with OLAP operators. The proposed system is implemented in the C#.NET programming language with Microsoft Visual Studio 2010 and embedded SQL Server Management Studio 2008 R2, and all test results were close to what was expected. The results confirm the correctness of the system design and of its considerations and choices.
- by bilal adil
- Datawarehouse
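The materialized view maintenance problem mentioned in the abstract above can be sketched in a few lines. The example below is not the paper's system (which is built with C#.NET and SQL Server); it is a generic Python illustration of incremental maintenance, in which a stored SUM ... GROUP BY result is adjusted for each inserted or deleted base row instead of being recomputed. The class name SumView, the helper recompute and the sample rows are hypothetical.

```python
from collections import defaultdict


class SumView:
    """A stored SUM(amount) GROUP BY region result, kept current incrementally."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)  # needed to know when a group disappears

    def insert(self, region, amount):
        # Apply the delta of one inserted base row.
        self.totals[region] += amount
        self.counts[region] += 1

    def delete(self, region, amount):
        # Apply the delta of one deleted base row.
        self.totals[region] -= amount
        self.counts[region] -= 1
        if self.counts[region] == 0:
            del self.totals[region]
            del self.counts[region]

    def rows(self):
        return dict(self.totals)


def recompute(base_rows):
    """Full recomputation from the base table, used here as the reference."""
    totals = defaultdict(float)
    for region, amount in base_rows:
        totals[region] += amount
    return dict(totals)


base = [("north", 10.0), ("south", 5.0), ("north", 2.5)]
view = SumView()
for region, amount in base:
    view.insert(region, amount)
print(view.rows() == recompute(base))  # True: the view matches the base table
```

Keeping a row count per group is what lets the view drop a group once its last base row is deleted; the consistency check against a full recomputation mirrors the MV-versus-DW consistency goal the abstract describes.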
The industry trend towards self-service business intelligence is impeded by the absence, in
commercially-available information systems, of automated identification of potential issues
with summarization operations. Research on statistical databases and research on data warehouses
have both produced widely-accepted categorisations of measure attributes, the former based
on general summarizability properties and the latter based on measures' additivity properties. We
demonstrate that neither of these categorisations is an appropriate basis for precise identification
of measure types since they are incomplete, ambiguous and insufficiently refined.
Using a new categorisation of dimension types and multidimensional structures, we derive a
measure categorisation which is a synthesis and a refinement of the two aforementioned
categorisations. We give formal definitions for our summarizability types, based on the
relational model of data, and then construct rules for correct summarization by using these
definitions. We also give a method to detect whether a given MDX OLAP query conforms to those
rules.
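A small example shows why summarization needs rules of this kind. The sketch below is not the paper's categorisation or detection method; it only illustrates one well-known case, a non-additive measure (an average), where rolling up already-aggregated values gives a wrong answer. The figures and function names are hypothetical.

```python
# Hypothetical monthly facts: (month, number of sales, total revenue).
facts = [("Jan", 10, 1000.0), ("Feb", 30, 1500.0)]


def naive_period_average(rows):
    """Incorrect roll-up: averages the monthly averages, ignoring group sizes."""
    monthly_averages = [revenue / count for _, count, revenue in rows]
    return sum(monthly_averages) / len(monthly_averages)


def correct_period_average(rows):
    """Correct roll-up: sums the additive parts first, then divides once."""
    total_count = sum(count for _, count, _ in rows)
    total_revenue = sum(revenue for _, _, revenue in rows)
    return total_revenue / total_count


print(naive_period_average(facts))    # 75.0
print(correct_period_average(facts))  # 62.5
```

Revenue and sale counts are additive across months, so they can be summed safely; the average has to be derived from those sums rather than aggregated directly.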
The problem discussed in this study is that university management has difficulty obtaining reports on which locations most student enrollments come from, at which times enrollment is highest, and which marketing promotions attract the most students. This study was designed to develop a marketing dashboard application to solve this problem. The methodology is experimental and consists of three stages: collecting data through interviews and a requirement elicitation form, designing with use case diagrams, and implementing the application with PHP, MySQL and FusionCharts. As a result, with the marketing dashboard system the university has its own student enrollment database in which it can trace and evaluate three aspects of marketing intelligence: location, time and information channel. The marketing division can segment its market, while university management can view the results on the dashboard.
In recent years, data volume has grown tremendously. In addition, many sources beyond traditional structured data, e.g. log files, web data, stream data and sensor data, need to be stored in the OLTP system. An organization cannot afford to neglect the valuable information from these sources. ETL tools extract meaningful information from various data sources, apply various transformations in the transformation phase and then load the result into the data warehouse. Traditionally, commercial ETL (extract-transform-load) tools, e.g. Informatica, MicroStrategy and Pentaho, have been used to transfer OLTP data to another database known as the data warehouse. MapReduce technology is becoming popular among ETL specialists, compelling organizations to benefit from it. Hadoop, an open source MapReduce framework, can handle massive data, provides cheap storage, processes structured as well as unstructured data and scales massively. It can be seen as a viable alternative for migrating ETL jobs. Although Hadoop is beneficial for large-scale industries, many small organizations with small amounts of data are also looking to leverage it for their business intelligence. In this paper, we explore the new opportunities of using Hadoop for business intelligence, specifically in the ETL phase of the data warehouse.
- by IAEME Publication
- Hadoop, Datawarehouse, Mapreduce, ETL
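The map and reduce steps that the abstract above proposes for ETL can be imitated in plain Python to show the data flow. This is only a single-process sketch of the programming model, not Hadoop code, and it uses no Hadoop API; the log format, store names and function names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical raw source records: "date,store,amount".
RAW_LOG = [
    "2024-01-01,store_a,19.99",
    "2024-01-01,store_b,5.00",
    "bad line",
    "2024-01-02,store_a,10.01",
]


def map_phase(line):
    """Extract and transform: parse one raw line into (store, amount) pairs."""
    parts = line.split(",")
    if len(parts) != 3:
        return []  # drop malformed records
    _, store, amount = parts
    try:
        return [(store, float(amount))]
    except ValueError:
        return []  # drop records whose amount is not numeric


def shuffle(pairs):
    """Group the mapped values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(store, amounts):
    """Aggregate all values of one key into a single loadable row."""
    return store, round(sum(amounts), 2)


def run_etl(lines):
    """Run map, shuffle and reduce, and return the rows to load."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(store, amounts) for store, amounts in grouped.items())


print(run_etl(RAW_LOG))  # {'store_a': 30.0, 'store_b': 5.0}
```

In a real cluster the map and reduce functions would run in parallel over partitions of the input, which is where the scalability described in the abstract comes from.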
Poor performance can turn a successful data warehousing project into a failure. Consequently, researchers have made several attempts to deal with the problem of scheduling the Extract-Transform-Load (ETL) process. In this paper we present several approaches to enhancing the extract, transform and load stages of data warehousing. We focus on improving the performance of the extract and transform phases by proposing two algorithms that reduce the time needed in each phase by exploiting the hidden semantic information in the data. Using this semantic information, a large volume of useless data can be pruned at an early design stage. We also address the problem of scheduling the execution of ETL activities, with the goal of minimizing ETL execution time, and explore this area by choosing three scheduling techniques for ETL. Finally, we show experimentally how they behave in terms of execution time in the sales domain, in order to understand the impact of implementing each of them and to choose the one that leads to the maximum performance enhancement.
ABSTRACT: Data mining is the process of extracting useful information from large collections of data. One of its most common applications is the use of classification algorithms that estimate future events from past experience. In this context, in order to predict future events, a data warehouse is created from students' background data, which includes their demographic, personal, school and course information. Using classification algorithms on this data warehouse, new applications that make inferences about the future can be developed. The aims of this study are to create a student data warehouse that can be used with data mining algorithms, to develop an early warning system that estimates students' future academic success for the students and their families, and to identify the primary factors affecting their academic success.