Ricardo Jimenez-Peris - Academia.edu
Papers by Ricardo Jimenez-Peris
The Mathematica Journal
The need to combine different solvers into a system that can process constraints that no single solver can handle is widely recognized. Of particular interest is the design of a distributed system based on a flexible architecture that supports easy integration and cooperation of new constraint solvers. CFLP (Constraint Functional Logic Programming system) is a distributed software system consisting of a functional logic interpreter running on one machine and various specialized constraint solvers that may ...
Network Computing …, Jul 27, 2005
Peer-to-peer (P2P) systems have become a popular technique for architecting decentralized systems. However, despite their popularity, most P2P systems are simple applications such as file sharing or chat systems. The main reason is that more complex applications require levels of consistency that P2P systems do not currently offer. In this paper, we explore how to provide consistency based on distributed mutual exclusion via quorum systems in DHT-based P2P networks. Our results show that quorum systems ...
Encyclopedia of Database Systems
2016 IEEE International Conference on Big Data (Big Data), 2016
The CloudMdsQL polystore provides integrated access to multiple heterogeneous data stores, such as RDBMS, NoSQL or even HDFS, through a big data analytics framework such as MapReduce or Spark. The CloudMdsQL language is a functional SQL-like query language with a flexible nested data model. A major capability is to exploit the full power of each of the underlying data stores by allowing native queries to be expressed as functions and used in SQL statements. The CloudMdsQL polystore has been validated with a good number of different data stores: HDFS, key-value, document, graph, RDBMS and OLAP engine. In this paper, we introduce the benchmarking of the CloudMdsQL polystore and evaluate the performance benefits of important features enabled by the query language and engine.
Recent proposals have shown that middleware-based database replication is able to provide 1-copy-serializability in LAN environments with excellent performance. This can be achieved by using powerful and fast multicast primitives that deliver messages to all sites in the same order in an all-or-nothing fashion. The question is whether a similar approach is feasible in WAN environments, considering the increased message latency. Some of the approaches used in LANs might also work in WANs, but others will be prohibitive. We have performed extensive tests of different protocols in a WAN testbed in order to identify the most crucial bottlenecks and the most promising optimizations. The results show that performance remains acceptable even for medium-sized systems consisting of up to eight sites. As such, we believe that data replication guaranteeing 1-copy-serializability is a serious alternative to weaker approaches even in WAN environments.
Distributed and Parallel Databases, 2015
The blooming of different cloud data management infrastructures, specialized for different kinds of data and tasks, has led to a wide diversification of DBMS interfaces and the loss of a common programming paradigm. In this paper, we present the design of a Cloud Multidatastore Query Language (CloudMdsQL) and its query engine. CloudMdsQL is a functional SQL-like language, capable of querying multiple heterogeneous data stores (relational and NoSQL) within a single query that may contain embedded invocations to each data store's native query interface. The query engine has a fully distributed architecture, which provides important opportunities for optimization. The major innovation is that a CloudMdsQL query can exploit the full power of local data stores by allowing some local data store native queries (e.g. a breadth-first search query against a graph database) to be called as functions, and at the same time be optimized, e.g. by pushing down select predicates, using bind join, performing join ordering, or planning intermediate data shipping. Our experimental validation, with three data stores (graph, document and relational) and representative queries, shows that CloudMdsQL satisfies the five important requirements for a cloud multidatastore query language.
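The bind join mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not CloudMdsQL syntax or the engine's implementation: the two in-memory "stores", the function names (`scan_orders`, `fetch_customers`, `bind_join`) and the data are all hypothetical, and the inner side is assumed to have unique join keys. The idea shown is only the general one: collect the join keys from the outer result first, then ask the inner store for just those keys instead of shipping its whole contents.

```python
# Hypothetical in-memory stand-ins for two data stores (illustrative data only).
CUSTOMER_DOCS = {
    10: {"customer_id": 10, "name": "Ana"},
    20: {"customer_id": 20, "name": "Bo"},
    30: {"customer_id": 30, "name": "Cy"},
}


def scan_orders():
    """Outer side: a small relational result, e.g. after a pushed-down selection."""
    return [
        {"order_id": 1, "customer_id": 10},
        {"order_id": 2, "customer_id": 30},
        {"order_id": 3, "customer_id": 10},
    ]


def fetch_customers(ids):
    """Inner side: stands in for a native query restricted to the given keys."""
    return [CUSTOMER_DOCS[i] for i in sorted(ids) if i in CUSTOMER_DOCS]


def bind_join(outer_rows, fetch_inner, key):
    """Join outer_rows with the inner store, shipping only the needed keys."""
    # 1. Collect the distinct join keys present in the outer result.
    keys = {row[key] for row in outer_rows}
    # 2. Retrieve only the matching inner rows (assumes unique inner keys).
    inner_by_key = {row[key]: row for row in fetch_inner(keys)}
    # 3. Combine matching rows locally.
    return [
        {**row, **inner_by_key[row[key]]}
        for row in outer_rows
        if row[key] in inner_by_key
    ]


result = bind_join(scan_orders(), fetch_customers, "customer_id")
```

In this toy run the inner store is asked for customers 10 and 30 only; customer 20 is never retrieved, which is the saving a bind join aims for when the outer side is selective.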
Middleware platforms are becoming very popular among system developers. Due to their popularity, there is an increasing demand for dependable middleware support. In the past few years, several research efforts have concentrated on augmenting the dependability of middleware infrastructures, which has led to the definition of the FT-CORBA specification. Active replication is one of the main techniques that have been used to achieve some of the required dependability attributes, such as high availability. This kind of replication requires deterministic replicas that behave as a state machine, which has traditionally been achieved by restricting replicas to be single-threaded. Unfortunately, single-threading is too restrictive for middleware servers, especially transactional ones, where it is not admissible to process requests sequentially. In this paper, we show how it is possible to remove this restriction. We present a deterministic scheduling algorithm for multithreaded replicas in a transactional framework. Determinism of multithreaded replicas is achieved with a combination of reliable total order multicast and a deterministic scheduler. The former guarantees that all the replicas see the external events in the same order. The latter ensures that all threads are scheduled in the same way at all replicas. One of the novelties of the approach is that determinism is achieved without resorting to inter-replica communication. Additionally, the paper addresses how to perform online recovery while maintaining replica determinism in order to keep a high level of availability.
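As a rough illustration of why a total order on requests plus a deterministic scheduler yields identical replicas without any inter-replica communication, here is a minimal Python sketch. It is not the algorithm from the paper: the round-robin policy, the `transfer` request type, the account names and the balances are all assumptions made for the example. Threads are modeled as generators, and each `yield` is a point where the scheduler may switch threads.

```python
from collections import deque


def transfer(state, src, dst, amount):
    """A request modeled as a generator; each yield is a scheduling point."""
    if state[src] < amount:
        return  # insufficient funds: the request ends without changes
    state[src] -= amount
    yield  # the scheduler may interleave other threads here
    state[dst] += amount


def run_replica(ordered_requests):
    """Run all requests with a deterministic round-robin scheduler."""
    state = {"a": 100, "b": 0, "c": 0}  # hypothetical initial state
    # Threads enter the ready queue in total-order delivery order.
    ready = deque(transfer(state, *req) for req in ordered_requests)
    while ready:
        thread = ready.popleft()
        try:
            next(thread)  # run the thread up to its next scheduling point
        except StopIteration:
            continue  # thread finished; do not requeue it
        ready.append(thread)  # requeue at the tail: fixed round-robin order
    return state


# The same totally ordered request sequence is delivered to every replica.
requests = [("a", "b", 70), ("a", "c", 50)]
replica_1 = run_replica(requests)
replica_2 = run_replica(requests)
```

Because both replicas start from the same state, receive the requests in the same order, and make every scheduling decision from local information only, they interleave the two threads identically and end in the same state.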
Dynamically adaptive systems sense their environment and adjust themselves to accommodate changes in order to maximize performance. Depending on the type of change (e.g., modifications of the load, the type of workload, the available resources, the client distribution, etc.), different adjustments have to be made. Coordinating them is already difficult in a centralized system. Doing so in the currently prevalent component-based distributed systems is even more challenging. In this paper, we present an adaptive distributed middleware for data replication that is able to adjust to changes in the amount of load submitted to the different replicas and to the type of workload submitted. Its novelty lies in combining load-balancing techniques with feedback-driven adjustments of multiprogramming levels (the number of transactions that are allowed to execute concurrently). An extensive performance analysis shows that the proposed adaptive replication solution can provide high throughput, goo...