Ricardo J Santos | Universidade de Coimbra
Papers by Ricardo J Santos
ACM SIGMOD Record, 2014
Databases often support enterprise business and store its secrets. This means that securing them from data damage and information leakage is critical. To deal with intrusions against database systems, Database Intrusion Detection Systems (DIDS) are frequently used. This paper presents a survey of the main database intrusion detection techniques currently available and discusses the issues concerning their application at the database server layer. The identified weak spots show that most DIDS deal inadequately with many characteristics of specific database systems, such as ad hoc workloads and alert management issues in data warehousing environments. Based on this analysis, research challenges are presented, and requirements and guidelines for the design of new or improved DIDS are proposed. The main finding is that the development and benchmarking of DIDS specifically tailored to the context in which they operate is a relevant issue, and remains a challenge. W...
Proceedings of the 15th Symposium on International Database Engineering & Applications - IDEAS '11, 2011
Data Warehouses (DWs) are the enterprise's most valuable asset with respect to critical business information, making them an appealing target for attackers. Packaged database encryption solutions are considered the best way to protect sensitive data. However, given the volume of data typically processed by DW queries, the existing encryption solutions heavily increase storage space and introduce very large overheads in
Lecture Notes in Computer Science, 2013
The development of nuclear power plant simulators is a hard task in which many different factors have to be considered. Software components can be a main factor in improving the development cycle of complex simulators. This paper presents a new generation of simulators based on two different component technologies. On the one hand, RT-CORBA is a communication middleware that allows the development of predictable real-time applications while taking advantage of CORBA features. On the other hand, CCA is a new component model for scientific components that allows the high-performance computing needed for these simulators.
2012 IEEE 36th Annual Computer Software and Applications Conference, 2012
Real-time Data Warehouses (DWs) must be able to deal with continuous updates while ensuring 24/7 availability. To improve their performance, data is normally distributed using round-robin algorithms on clusters of shared-nothing machines. This paper proposes a solution for distributed DW databases that ensures their continuous availability and deals with frequent data loading requirements, while adding little performance overhead. We use a data striping and replication architecture to distribute portions of each fact table among pairs of slave nodes, where each slave node is an exact replica of its partner. This allows balancing query execution and replacing any defective node, ensuring the system's continuous availability. The size of each portion in a given node depends on the node's individual features, namely performance benchmark measures and dedicated database RAM. The estimated cost of executing each query workload in each slave node is also used for balancing query performance. We include experiments using the TPC-H decision support benchmark to evaluate the scalability of the proposed solution and show that it outperforms standard round-robin distributed DW setups.
2011 IEEE 35th Annual Computer Software and Applications Conference, 2011
Technological evolution has redefined many business models. Many decision makers are now required to act in near real time, instead of periodically, given the latest transactional information. Decision-making occurs much more frequently and considers the latest business data. Since data warehouses (DWs) are the core of business intelligence, decision support systems need to deal with 24/7 real-time requirements. Thus, the ability to
Proceedings of the 2008 international symposium on Database engineering & applications - IDEAS '08, 2008
A data warehouse provides information for analytical processing, decision making and data mining tools. As the concept of real-time enterprise evolves, the synchronism between transactional data and data warehouses, statically implemented, has been redefined. ...
International Conference on Enterprise Information Systems, 2009
Performance optimization of decision support queries has always been a major issue in data warehousing. A wide range of techniques has been used in research to overcome this problem. Bit-based techniques such as bitmap indexes and bitmap join indexes have been used and are generally accepted as standard practice for optimizing data warehouses. These techniques are very promising due to their relatively low overhead and fast bitwise operations. In this paper, we propose a new technique that performs optimized row selection for decision support queries by introducing a bit-based attribute into the fact table. This attribute's value for each row is set according to the row's relevance for processing each decision support query, using bitwise operations. Simply inserting a new column in the fact table's structure and using bitwise operations for row selection makes it a simple and practical technique, which is easy to implement in any Database Management System. The experimental results, using the TPC-H benchmark, demonstrate that it is an efficient optimization method which significantly improves query performance.
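The row selection idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the actual encoding, so everything below is an assumption made for illustration: the two query names, the predicates, the column names, and the choice of one bit position per query.

```python
# Hypothetical workload: each decision support query owns one bit position.
QUERY_BITS = {"q_large_orders": 0, "q_discounted": 1}


def relevance_mask(row):
    """Build the bit-based attribute for one fact row (assumed predicates)."""
    mask = 0
    if row["quantity"] >= 30:
        mask |= 1 << QUERY_BITS["q_large_orders"]
    if row["discount"] > 0.05:
        mask |= 1 << QUERY_BITS["q_discounted"]
    return mask


def select_rows(fact_table, query_name):
    """Keep only the rows whose bit for the given query is set."""
    bit = 1 << QUERY_BITS[query_name]
    return [row for row in fact_table if row["bit_attr"] & bit]


# Toy fact table; the extra column is filled in once, at load time.
fact_table = [
    {"id": 1, "quantity": 40, "discount": 0.00},
    {"id": 2, "quantity": 10, "discount": 0.10},
    {"id": 3, "quantity": 35, "discount": 0.08},
]
for row in fact_table:
    row["bit_attr"] = relevance_mask(row)

large_ids = [r["id"] for r in select_rows(fact_table, "q_large_orders")]
discounted_ids = [r["id"] for r in select_rows(fact_table, "q_discounted")]
```

In a DBMS the same test would presumably be a bitwise AND predicate on the added column, which is why the abstract describes the approach as easy to implement.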
EUROCON 2011 - International Conference on Computer as a Tool - Joint with Conftele 2011, 2011
The sudden fall of blood pressure (hypotension, HT) is a common complication in medical care. In critical care patients, HT may cause serious neurological, heart, or endocrine disorders, inducing severe or even lethal events. Recent studies report an increase in mortality in HT-prone hemodialysis patients in need of critical care. Predicting HT episodes in advance is crucial to enable medical staff to minimize their effects or even avoid their occurrence. Most medical systems have focused on monitoring and detecting current patient status, rather than determining biosignal trends or predicting the patient's future status. Therefore, predicting HT episodes in advance remains a challenge. In this paper, we present a solution for continuous monitoring and efficient prediction of HT episodes. We propose an architecture for an HT Predictor (HTP) Tool, capable of continuously storing and monitoring in real time all patients' heart rate and blood pressure biosignal data, and of alerting staff to probable occurrences of each patient's HT episodes in the following 60 minutes, based on non-invasive hemodynamic variables. Our system also promotes medical staff mobility by taking advantage of mobile personal devices such as cell phones and PDAs. An experimental evaluation on real-life data from the well-known Physionet database shows the tool's efficiency, outperforming the winning proposal of the Physionet 2009 Challenge.
EUROCON 2011 - International Conference on Computer as a Tool - Joint with Conftele 2011, 2011
Data Warehouses (DWs) are the enterprise's most valuable assets with respect to critical business information, making them an appealing target for malicious inside and outside attackers. Given the volume of data and the nature of DW queries, most of the existing data security solutions for databases are inefficient, consuming too many resources and introducing too much overhead in query response time, or resulting in too many false positive alarms (i.e., incorrect detection of attacks) to be checked. In this paper, we present a survey of currently available data security techniques, focusing on specific issues and requirements concerning their use in data warehousing environments. We also point out challenges and opportunities for future research work in this field.
2006 10th International Database Engineering and Applications Symposium (IDEAS'06), 2006
Diseases such as avian influenza, severe acute respiratory syndrome (SARS) and Creutzfeldt-Jakob syndrome represent a new era of biological threats. Nowadays, these hazards breed, mutate and evolve at tremendous speed. Furthermore, they may spread at the same speed at which we travel. This reveals an urgent need for an agent capable of dealing with such threats. Data warehouses are databases which provide decision support by on-line analytical processing (OLAP) techniques. We present the architecture for an effective information system infrastructure enabling the prediction and near real-time detection of disease outbreaks, using knowledge extraction algorithms to explore a symptoms/diseases data warehouse in a continuous and active form. To collect such data, we take advantage of the Internet and features existing in today's common communication devices such as personal computers, portable digital assistants and cellular phones. We present a case simulation based on a small country, showing that the system can detect an outbreak within hours or even minutes after its physical occurrence, alerting health decision makers and providing quick interaction and feedback between all users. The architecture is also functionally independent of its geographical dimension.
Handbook of Research on Computational Intelligence for Engineering, Science, and Business, 2013
This chapter presents the fundamentals of a hardware-based memory network that can perform complex cognitive tasks. The network is designed to provide space dimensionality reduction, which enables desired functionality in a random environment. Complex network functionality is achieved by simple network cells that minimize the chip area needed for hardware implementation. Functionality of this network is demonstrated by automatic character recognition with various input deformations. In the character recognition example, the network is trained to recognize characters deformed by random noise, rotation, scaling, and shifting. This example demonstrates how cognitive functionality of a hardware network can be achieved through an evolutionary process, as distinct from design based on mathematical formalism.
Lecture Notes in Computer Science, 2009
On-line analytical processing against data warehouse databases is a common way of obtaining decision making information for almost every business field. Decision support information often concerns periodic values based on regular attributes, such as sales amounts, percentages, most transacted items, etc. This means that many similar OLAP instructions are repeated periodically, and simultaneously, among the several decision makers. Our Query Cache Tool takes advantage of previously executed queries, storing their results and the current state of the data which was accessed. Future queries only need to execute against the new data, inserted since the queries were last executed, and join these results with the previous ones. This makes query execution much faster, because we only need to process the most recent data. Our tool also minimizes the execution time and resource consumption for similar queries simultaneously executed by different users, putting the most recent ones on hold until the first finishes and returns the results for all of them. The stored query results are held until they are considered outdated, then automatically erased. We present an experimental evaluation of our tool using a data warehouse based on a real-world business dataset and use a set of typical decision support queries to discuss the results, showing a very high gain in query execution time.
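The incremental idea described in this abstract (run a repeated query only against data inserted since its last execution, then combine with the stored result) can be sketched as follows. This is not the authors' tool: the class name `QueryCache`, the append-only list standing in for a fact table, and the use of a row count as the "current state of the data" are all assumptions made for illustration, and only a SUM aggregate is handled.

```python
class QueryCache:
    """Toy cache for a repeated SUM query over an append-only table."""

    def __init__(self):
        # query key -> (number of rows already processed, running total)
        self._cache = {}

    def total_sales(self, key, sales):
        rows_seen, total = self._cache.get(key, (0, 0))
        new_rows = sales[rows_seen:]      # only data inserted since last run
        total += sum(new_rows)            # combine with the stored result
        self._cache[key] = (len(sales), total)
        return total, len(new_rows)       # result and rows actually scanned


sales = [100, 250, 50]
cache = QueryCache()
first = cache.total_sales("total_sales", sales)

sales.extend([25, 75])                    # new data arrives
second = cache.total_sales("total_sales", sales)
```

The second call scans only the two newly appended rows. Aggregates that cannot be combined this simply, as well as the expiry of outdated results mentioned in the abstract, are outside this sketch.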
2011 IEEE 10th International Conference on Trust, Security and Privacy in Computing and Communications, 2011
Data Warehouses (DWs) store the golden nuggets of the business, which makes them an appealing target. To ensure data privacy, encryption solutions have been used and have proven efficient in their security purpose. However, they introduce massive storage space and performance overheads, making them infeasible for DWs. We propose a data masking technique for protecting sensitive business data in DWs that balances security strength with database performance, using a formula based on the mathematical modulus operator. Our solution manages apparent randomness and distribution of the masked values, while introducing small storage space and query execution time overheads. It also enables a false data injection method for misleading attackers and increasing the overall security strength. It can be easily implemented in any DataBase Management System (DBMS) and used transparently, without changes to application source code. Experimental evaluations using a real-world DW and the TPC-H decision support benchmark, implemented in the leading commercial DBMS Oracle 11g and Microsoft SQL Server 2008, demonstrate its overall effectiveness. Results show substantial savings in its implementation costs when compared with the state-of-the-art data privacy solutions provided by those DBMS, and that it outperforms those solutions in both data querying and insertion of new data.
Proceedings of the 2009 International Database Engineering & Applications Symposium on - IDEAS '09, 2009
The purpose of a data warehouse is to aid decision making. As the real-time enterprise evolves, synchronism between transactional data and data warehouses is redefined. To cope with real-time requirements, data warehouses must enable continuous data integration in order to deal with the most recent business data. Traditional data warehouses are unable to support any dynamics in structure and content while they are available for OLAP. Their data is periodically updated because they are unprepared for continuous data integration. For real-time enterprises that need decision support while transactions are occurring, (near) real-time data warehousing seems very promising. In this paper we present a survey testing today's most used loading techniques and analyze which data loading methods are best, presenting a methodology for efficiently supporting continuous data integration for data warehouses. To accomplish this, we use techniques such as table structure replication with minimum content and query predicate restrictions for selecting data, to enable loading data into the data warehouse continuously, with minimum impact on query execution time. We demonstrate the efficiency of the method using the TPC-H benchmark and executing query workloads while simultaneously performing continuous data integration.
Lecture Notes in Computer Science, 2012
Data Warehouses (DWs) are the core of enterprise sensitive data, which makes protecting confidentiality in DWs a critical task. Published research and best practice guides state that encryption is the best way to achieve this and maintain high performance. However, although encryption algorithms strongly fulfill their security purpose, we demonstrate that they introduce massive storage space and response time overheads, which mostly result in unacceptable security-performance tradeoffs, compromising their feasibility in DW environments. In this paper, we enumerate state-of-the-art data masking and encryption solutions and discuss the issues involving their use from a data warehousing perspective. Experimental evaluations using the TPC-H decision support benchmark and a real-world sales DW, implemented in Oracle 11g and Microsoft SQL Server 2008, support our remarks. We conclude that the development of alternative solutions specifically tailored for DWs that are able to balance security with performance remains a challenge and an open research issue.
Proceedings of the Fourteenth International Database Engineering & Applications Symposium on - IDEAS '10, 2010
The sudden fall of blood pressure (hypotension) is a common complication in medical care. In critical care patients, hypotension (HT) may cause serious heart, endocrine or neurological disorders, inducing severe or even lethal events. Moreover, recent studies report an increase in mortality in HT-prone hemodialysis patients in need of critical care. If HT could be predicted in advance, medical staff could take action to minimize its effects, or even avoid its occurrence. Typically, most medical systems have focused on monitoring and detecting current patient status, rather than determining biosignal trends or predicting a patient's future status. Therefore, predicting HT episodes in advance remains a challenge. Furthermore, since critical care actions such as hemodialysis are often inconvenient and uncomfortable procedures, HT prediction or detection methods should be non-invasive whenever possible. In this paper, we present a solution for continuous monitoring and prediction of HT episodes, using non-invasively measured heart rate (HR) and mean blood pressure (BP) biosignals. We propose an architecture for an HT Predictor (HTP) Tool, presenting a set of tools and a real-time database capable of continuously storing and monitoring in real time all patients' historical HR and BP biosignal data, and of efficiently alerting staff to both probable and detected occurrences of HT episodes for each patient in the following 60 minutes. Additionally, the system promotes medical staff mobility by taking advantage of mobile personal devices such as mobile phones and PDAs, optimizing human resources. Finally, an experimental evaluation on real-life data from the well-known Physionet database shows the efficiency of the tool, outperforming the winning proposal of the Physionet 2009 Challenge.
Building Sustainable Information Systems, 2013
Data Warehouses (DWs) are used for producing business knowledge and aiding decision support. Since they store the secrets of the business, securing their data is critical. To accomplish this, several Database Intrusion Detection Systems (DIDS) have been proposed. However, when using DIDS in DWs, most solutions produce either too many false positives (i.e., false alarms) that must be verified or too many false negatives (i.e., true intrusions that go undetected). Moreover, many approaches detect intrusions a posteriori which, given the sensitivity of DW data, may result in irreparable cost. To the best of our knowledge, no DIDS specifically tailored for DWs has been proposed. This paper examines intrusion detection from a data warehousing perspective and the reasons why traditional database security methods are not sufficient to avoid intrusions. We define the specific requirements for a DW DIDS and propose a conceptual approach for a real-time DIDS for DWs at the SQL command level that works transparently as an extension of the DataBase Management System (DBMS) between the user applications and the database server itself. A preliminary experimental evaluation using the TPC-H decision support benchmark is included to demonstrate the DIDS' efficiency.
ACM SIGMOD Record, 2014
Databases often support enterprise business and store its secrets. This means that securing them ... more Databases often support enterprise business and store its secrets. This means that securing them from data damage and information leakage is critical. In order to deal with intrusions against database systems, Database Intrusion Detection Systems (DIDS) are frequently used. This paper presents a survey on the main database intrusion detection techniques currently available and discusses the issues concerning their application at the database server layer. The identified weak spots show that most DIDS inadequately deal with many characteristics of specific database systems, such as ad hoc workloads and alert management issues in data warehousing environments, for example. Based on this analysis, research challenges are presented, and requirements and guidelines for the design of new or improved DIDS are proposed. The main finding is that the development and benchmarking of specifically tailored DIDS for the context in which they operate is a relevant issue, and remains a challenge. W...
Proceedings of the 15th Symposium on International Database Engineering & Applications - IDEAS '11, 2011
Data Warehouses (DWs) are the enterprise's most valuable asset in what concerns critical busi... more Data Warehouses (DWs) are the enterprise's most valuable asset in what concerns critical business information, making them an appealing target for attackers. Packaged database encryption solutions are considered the best solution to protect sensitive data. However, given the volume of data typically processed by DW queries, the existing encryption solutions heavily increase storage space and introduce very large overheads in
Lecture Notes in Computer Science, 2013
The development of nuclear power plant simulators is a hard task where many different factors hav... more The development of nuclear power plant simulators is a hard task where many different factors have to be considered. Software components can be a main factor in the improvement of the development cycle of complex simulators. This paper presents a new generation of simulators based on two different component technologies. On the one hand, RT-CORBA is a communication middleware, which allows the development of predictable real-time applications taking benefit from CORBA features. On the other hand, CCA is a new component model for scientific components, which allows high performance computing needed for these simulators.
2012 IEEE 36th Annual Computer Software and Applications Conference, 2012
Real-time Data Warehouses (DWs) must be able to deal with continuous updates while ensuring 24/7 ... more Real-time Data Warehouses (DWs) must be able to deal with continuous updates while ensuring 24/7 availability. To improve their performance, distributing data using round-robin algorithms on clusters of shared-nothing machines is normally used. This paper proposes a solution for distributed DW databases that ensures its continuous availability and deals with frequent data loading requirements, while adding small performance overhead. We use a data striping and replication architecture to distribute portions of each fact table among pairs of slave nodes, where each slave node is an exact replica of its partner. This allows balancing query execution and replacing any defective node, ensuring the system's continuous availability. The size of each portion in a given node depends on its individual features, namely performance benchmark measures and dedicated database RAM. The estimated cost for executing each query workload in each slave node is also used for balancing query performance. We include experiments using the TPC-H decision support benchmark to evaluate the scalability of the proposed solution and show that it outperforms standard round-robin distributed DW setups.
2011 IEEE 35th Annual Computer Software and Applications Conference, 2011
Technological evolution has redefined many business models. Many decision makers are now required... more Technological evolution has redefined many business models. Many decision makers are now required to act near real-time, instead of periodically, given the latest transactional information. Decision-making occurs much more frequently and considers the latest business data. Since data warehouses (DWs) are the core of business intelligence, decision support systems need to deal with 24/7 real-time requirements. Thus, the ability to
Proceedings of the 2008 international symposium on Database engineering & applications - IDEAS '08, 2008
ABSTRACT A data warehouse provides information for analytical processing, decision making and dat... more ABSTRACT A data warehouse provides information for analytical processing, decision making and data mining tools. As the concept of real-time enterprise evolves, the synchronism between transactional data and data warehouses, statically implemented, has been redefined. ...
International Conference on Enterprise Information Systems, 2009
Performance optimization of decision support queries has always been a major issue in data wareho... more Performance optimization of decision support queries has always been a major issue in data warehousing. A large amount of wide-ranging techniques have been used in research to overcome this problem. Bit-based techniques such as bitmap indexes and bitmap join indexes have been used and are generally accepted as standard common practice for optimizing data warehouses. These techniques are very promising due to their relatively low overhead and fast bitwise operations. In this paper, we propose a new technique which performs optimized row selection for decision support queries, introducing a bit-based attribute into the fact table. This attribute's value for each row is set according to its relevance for processing each decision support query by using bitwise operations. Simply inserting a new column in the fact table's structure and using bitwise operations for performing row selection makes it a simple and practical technique, which is easy to implement in any Database Management System. The experimental results, using benchmark TPC-H, demonstrates that it is an efficient optimization method which significantly improves query performance.
EUROCON 2011 - International Conference on Computer as a Tool - Joint with Conftele 2011, 2011
The sudden fall of blood pressure (hypotension - HT) is a common complication in medical care. In... more The sudden fall of blood pressure (hypotension - HT) is a common complication in medical care. In critical care patients, HT may cause serious neurological, heart, or endocrine disorders, inducing severe or even lethal events. Recent studies report an increase of mortality in HT prone hemodialysis patients in need of critical care. Predicting HT episodes in advance is crucial to enable medical staff to minimize its effects or even avoid its occurrence. Most medical systems have focused on monitoring and detecting current patient status, rather than determining biosignal trends or predicting the patient's future status. Therefore, predicting HT episodes in advance remains a challenge. In this paper, we present a solution for continuous monitoring and efficient prediction of HT episodes. We propose an architecture for a HT Predictor (HTP) Tool, capable of continuously storing and real-time monitoring all patient's heart rate and blood pressure biosignal data, alerting probable occurrences of each patient's HT episodes for the following 60 minutes, based on non-invasive hemodynamic variables. Our system also promotes medical staff mobility, taking advantage of using mobile personal devices such as cell phones and PDA's. An experimental evaluation on real-life data from the well-known Physionet database shows the tool's efficiency, outperforming the winning proposal of the Physionet 2009 Challenge.
EUROCON 2011 - International Conference on Computer as a Tool - Joint with Conftele 2011, 2011
Data Warehouses (DWs) are the enterprise's most valuable assets in what concerns critical busines... more Data Warehouses (DWs) are the enterprise's most valuable assets in what concerns critical business information, making them an appealing target for malicious inside and outside attackers. Given the volume of data and the nature of DW queries, most of the existing data security solutions for databases are inefficient, consuming too many resources and introducing too much overhead in query response time, or resulting in too many false positive alarms (i.e., incorrect detection of attacks) to be checked. In this paper, we present a survey on currently available data security techniques, focusing on specific issues and requirements concerning their use in data warehousing environments. We also point out challenges and opportunities for future research work in this field.
2006 10th International Database Engineering and Applications Symposium (IDEAS'06), 2006
Diseases such as avian influenza, severe acute respiratory syndrome (SARS) and Creutzfeldt-Jacob ... more Diseases such as avian influenza, severe acute respiratory syndrome (SARS) and Creutzfeldt-Jacob syndrome represent a new era of biological threats. Nowadays, these hazards breed, mutate and evolve at tremendous speed. Furthermore, they may spread out at the same speed as which we travel. This reveals an urgent need for an agent capable of dealing with such threats. Data warehouses are databases which provide decision support by on-line analytical processing (OLAP) techniques. We present the architecture for an effective information system infrastructure enabling the prediction and near real-time detection of disease outbreaks, using knowledge extraction algorithms to explore a symptoms/diseases data warehouse in a continuous and active form. To collect such data, we take advantage of the Internet and features existing in today's common communication devices such as personal computers, portable digital assistants and cellular phones. We present a case-simulation based on a small country, showing the system can detect an outbreak within hours or even minutes after its physical occurrence, alerting health decision makers and providing quick interaction and feedback between all users. The architecture is also functionally independent from its geographical dimension.
Handbook of Research on Computational Intelligence for Engineering, Science, and Business, 2013
This chapter presents the fundamentals of a hardware based memory network that can perform comple... more This chapter presents the fundamentals of a hardware based memory network that can perform complex cognitive tasks. The network is designed to provide space dimensionality reduction, which enables desired functionality in a random environment. Complex network functionality is achieved by simple network cells that minimize the needed chip area for hardware implementation. Functionality of this network is demonstrated by automatic character recognition with various input deformations. In the character recognition, the network is trained to recognize characters deformed by random noise, rotation, scaling, and shifting. This example demonstrates how cognitive functionality of a hardware network can be achieved through an evolutionary process, as distinct from design based on mathematical formalism.
Lecture Notes in Computer Science, 2009
On-line analytical processing against data warehouse databases is a common way of obtaining decision-making information in almost every business field. Decision support information often concerns periodic values based on regular attributes, such as sales amounts, percentages, most transacted items, etc. This means that many similar OLAP instructions are repeated periodically, and often simultaneously, by several decision makers. Our Query Cache Tool takes advantage of previously executed queries, storing their results and the current state of the data which was accessed. Future queries only need to execute against the new data, inserted since the queries were last executed, and join these results with the previous ones. This makes query execution much faster, because we only need to process the most recent data. Our tool also minimizes the execution time and resource consumption for similar queries executed simultaneously by different users, putting the most recent ones on hold until the first one finishes and returns the results for all of them. The stored query results are held until they are considered outdated, and are then automatically erased. We present an experimental evaluation of our tool using a data warehouse based on a real-world business dataset and use a set of typical decision support queries to discuss the results, showing a very high gain in query execution time.
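The incremental idea in this abstract (reuse a stored result and process only the rows inserted since the query last ran) can be illustrated with a minimal Python sketch. It is not the authors' Query Cache Tool: the class `QueryCache`, the method `sum_by`, and the append-only list standing in for a fact table are all hypothetical names chosen for illustration, and the sketch leaves out the eviction of outdated results and the handling of simultaneous queries.

```python
class QueryCache:
    """Hypothetical sketch: caches a grouped SUM over an append-only fact table."""

    def __init__(self):
        # Maps (group_col, value_col) -> (cached totals, number of rows already aggregated)
        self._cache = {}
        # Counts rows actually scanned, to show how much work the cache saves
        self.rows_scanned = 0

    def sum_by(self, fact_rows, group_col, value_col):
        key = (group_col, value_col)
        cached_totals, rows_seen = self._cache.get(key, ({}, 0))
        totals = dict(cached_totals)  # copy so earlier returned results stay unchanged

        # Assumption: rows are only ever appended, so everything past
        # `rows_seen` is "new data inserted since the query was last executed".
        for row in fact_rows[rows_seen:]:
            totals[row[group_col]] = totals.get(row[group_col], 0) + row[value_col]
            self.rows_scanned += 1

        # Store the merged result together with the current state of the data
        self._cache[key] = (totals, len(fact_rows))
        return totals


sales = [
    {"item": "a", "amount": 10},
    {"item": "b", "amount": 5},
]
cache = QueryCache()
first = cache.sum_by(sales, "item", "amount")   # scans both rows

sales.append({"item": "a", "amount": 7})
second = cache.sum_by(sales, "item", "amount")  # scans only the new row
```

This merge step works directly for distributive aggregates such as SUM and COUNT; an aggregate like AVG would need its sum and count cached separately before the results could be combined.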
2011 IEEE 10th International Conference on Trust, Security and Privacy in Computing and Communications, 2011
Data Warehouses (DWs) store the golden nuggets of the business, which makes them an appealing target. To ensure data privacy, encryption solutions have been used and proven efficient in their security purpose. However, they introduce massive storage space and performance overheads, making them infeasible for DWs. We propose a data masking technique for protecting sensitive business data in DWs that balances security strength with database performance, using a formula based on the mathematical modulo operator. Our solution manages apparent randomness and distribution of the masked values, while introducing small storage space and query execution time overheads. It also enables a false data injection method for misleading attackers and increasing the overall security strength. It can be easily implemented in any DataBase Management System (DBMS) and used transparently, without changes to application source code. Experimental evaluations using a real-world DW and the TPC-H decision support benchmark, implemented in the leading commercial DBMSs Oracle 11g and Microsoft SQL Server 2008, demonstrate its overall effectiveness. Results show substantial savings in implementation costs when compared with the state-of-the-art data privacy solutions provided by those DBMSs, and show that it outperforms those solutions in both data querying and insertion of new data.
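The abstract does not give the masking formula itself, so the following Python sketch only illustrates the general idea of a reversible mask built on the modulo operator. Everything in it is an assumption made for illustration: the constant `MODULUS`, the functions `mask` and `unmask`, and the use of a per-row key combined with a secret. It is not the paper's formula and should not be read as a secure scheme.

```python
# Hypothetical modulus: must be larger than any value that will be masked.
MODULUS = 1_000_003


def mask(value: int, row_key: int, secret: int) -> int:
    """Shift the value by a row-dependent offset, wrapping around MODULUS."""
    if not 0 <= value < MODULUS:
        raise ValueError("value must lie in [0, MODULUS) to be recoverable")
    return (value + secret * row_key) % MODULUS


def unmask(masked: int, row_key: int, secret: int) -> int:
    """Invert mask(): subtract the same offset, again modulo MODULUS."""
    # Python's % yields a non-negative result for a positive modulus,
    # so no extra correction is needed when the difference is negative.
    return (masked - secret * row_key) % MODULUS


original = 1234
masked = mask(original, row_key=7, secret=99)
restored = unmask(masked, row_key=7, secret=99)
```

Because masking and unmasking are each a single integer operation, a scheme of this shape stores the masked value in the same numeric column type and adds little per-row work, which is the kind of tradeoff the abstract contrasts with encryption.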
Proceedings of the 2009 International Database Engineering & Applications Symposium on - IDEAS '09, 2009
The purpose of a data warehouse is to aid decision making. As the real-time enterprise evolves, synchronism between transactional data and data warehouses is redefined. To cope with real-time requirements, data warehouses must enable continuous data integration in order to deal with the most recent business data. Traditional data warehouses are unable to support any dynamics in structure and content while they are available for OLAP. Their data is periodically updated because they are unprepared for continuous data integration. For real-time enterprises that need decision support while transactions are occurring, (near) real-time data warehousing seems very promising. In this paper we survey and test today's most used loading techniques, analyze which data loading methods perform best, and present a methodology for efficiently supporting continuous data integration for data warehouses. To accomplish this, we use techniques such as table structure replication with minimum content and query predicate restrictions for selecting data, to enable loading data into the data warehouse continuously, with minimum impact on query execution time. We demonstrate the efficiency of the method using the TPC-H benchmark and executing query workloads while simultaneously performing continuous data integration.
Lecture Notes in Computer Science, 2012
Data Warehouses (DWs) are the core of enterprise sensitive data, which makes protecting confidentiality in DWs a critical task. Published research and best practice guides state that encryption is the best way to achieve this and maintain high performance. However, although encryption algorithms strongly fulfill their security purpose, we demonstrate that they introduce massive storage space and response time overheads, which mostly result in unacceptable security-performance tradeoffs, compromising their feasibility in DW environments. In this paper, we enumerate state-of-the-art data masking and encryption solutions and discuss the issues involving their use from a data warehousing perspective. Experimental evaluations using the TPC-H decision support benchmark and a real-world sales DW, implemented in Oracle 11g and Microsoft SQL Server 2008, support our remarks. We conclude that the development of alternative solutions specifically tailored for DWs that are able to balance security with performance remains a challenge and an open research issue.
Proceedings of the Fourteenth International Database Engineering & Applications Symposium on - IDEAS '10, 2010
The sudden fall of blood pressure (hypotension) is a common complication in medical care. In critical care patients, hypotension (HT) may cause serious heart, endocrine or neurological disorders, inducing severe or even lethal events. Moreover, recent studies report an increase of mortality in HT-prone hemodialysis patients in need of critical care. If HT could be predicted in advance, medical staff could take action to minimize its effects, or even avoid its occurrence. Typically, most medical systems have focused on monitoring and detecting current patient status, rather than determining biosignal trends or predicting a patient's future status. Therefore, predicting HT episodes in advance remains a challenge. Furthermore, since critical care actions such as hemodialysis are often inconvenient and uncomfortable procedures, HT prediction or detection methods should be non-invasive whenever possible. In this paper, we present a solution for continuous monitoring and prediction of HT episodes, using non-invasively measured heart rate (HR) and mean blood pressure (BP) biosignals. We propose an architecture for an HT Predictor (HTP) Tool, presenting a set of tools and a real-time database capable of continuously storing and monitoring in real time all patients' historical HR and BP biosignal data, and efficiently alerting staff to both probable and detected occurrences of HT episodes for each patient for the following 60 minutes. Additionally, the system promotes medical staff mobility by taking advantage of mobile personal devices such as mobile phones and PDAs, optimizing human resources. Finally, an experimental evaluation on real-life data from the well-known PhysioNet database shows the efficiency of the tool, outperforming the winning proposal of the PhysioNet 2009 Challenge.
Building Sustainable Information Systems, 2013
Data Warehouses (DWs) are used for producing business knowledge and aiding decision support. Since they store the secrets of the business, securing their data is critical. To accomplish this, several Database Intrusion Detection Systems (DIDS) have been proposed. However, when using DIDS in DWs, most solutions either produce too many false positives (i.e. false alarms) that must be verified or too many false negatives (i.e. true intrusions that pass undetected). Moreover, many approaches detect intrusions a posteriori which, given the sensitivity of DW data, may result in irreparable cost. To the best of our knowledge, no DIDS specifically tailored for DWs has been proposed. This paper examines intrusion detection from a data warehousing perspective and the reasons why traditional database security methods are not sufficient to avoid intrusions. We define the specific requirements for a DW DIDS and propose a conceptual approach for a real-time DIDS for DWs at the SQL command level that works transparently as an extension of the DataBase Management System (DBMS) between the user applications and the database server itself. A preliminary experimental evaluation using the TPC-H decision support benchmark is included to demonstrate the DIDS' efficiency.