Philip Shao - Academia.edu
Papers by Philip Shao
ACM Transactions on Database Systems, May 1, 2014
In database systems where data resides in both RAM and disk, it is imperative that as few data requests as possible are made to disk, since disk access is orders of magnitude slower than RAM access. In typical mixed storage database systems, since the final state of the database after a set of concurrent transactions is guaranteed only to be equivalent to the state that would have been reached if the transactions had executed in some serial order, it is often profitable to reschedule transactions touching data not pegged in RAM until the data is available after it has been read from disk. This work considers the issue of data availability in the context of a deterministic database system where the final state of the database is equivalent to the state that would have been reached if the transactions had executed in a specific, predetermined serial order. In particular, this paper discusses the constraints imposed by the inability to arbitrarily reorder concurrent tra...
Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, 2012
ACM Transactions on Database Systems, 2014
As more data management software is designed for deployment in public and private clouds, or on a cluster of commodity servers, new distributed storage systems increasingly achieve high data access throughput via partitioning and replication. In order to achieve high scalability, however, today's systems generally reduce transactional support, disallowing single transactions from spanning multiple partitions. This article describes Calvin, a practical transaction scheduling and data replication layer that uses a deterministic ordering guarantee to significantly reduce the normally prohibitive contention costs associated with distributed transactions. This allows near-linear scalability on a cluster of commodity machines, without eliminating traditional transactional guarantees, introducing a single point of failure, or requiring application developers to reason about data partitioning. By replicating transaction inputs instead of transactional actions, Calvin is able to support ...