Self-Stabilizing Balancing Algorithm for Containment-Based Trees

Enhancing Fault Tolerance of Distributed R-Tree

2011 5th Latin-American Symposium on Dependable Computing, 2011

Distributed R-trees (DR-trees) are appealing infrastructures for implementing range queries, content-based filtering, or k-NN structures, since they inherit the features of R-trees such as logarithmic height, a bounded number of neighbors, and a balanced shape. However, they are crash-sensitive, since a single crash can potentially break the connectivity of the tree structure. In this article, we present a fault-tolerant approach that exploits replication of non-leaf nodes to ensure tree connectivity in the presence of crashes. Our contribution is twofold. First, we enhance connectivity without modifying the R-tree structure. Second, via extensive simulations we show that our approach drastically reduces both message traffic and stabilization time compared to the original approach proposed in [3], where all nodes of a disconnected subtree are reinserted. Finally, our approach can be easily extended to other crash-sensitive structures.

A practical scalable distributed B-tree

Proceedings of the VLDB Endowment, 2008

Internet applications increasingly rely on scalable data structures that must support high throughput and store huge amounts of data. These data structures can be hard to implement efficiently. Recent proposals have overcome this problem by giving up on generality and implementing specialized interfaces and functionality (e.g., Dynamo [4]). We present the design of a more general and flexible solution: a fault-tolerant and scalable distributed B-tree. In addition to the usual B-tree operations, our B-tree provides some important practical features: transactions for atomically executing several operations in one or more B-trees, online migration of B-tree nodes between servers for load-balancing, and dynamic addition and removal of servers for supporting incremental growth of the system. Our design is conceptually simple. Rather than using complex concurrency and locking protocols, we use distributed transactions to make changes to B-tree nodes. We show how to extend the B-tree and keep additional information so that these transactions execute quickly and efficiently. Our design relies on an underlying distributed data sharing service, Sinfonia [1], which provides fault tolerance and a lightweight distributed atomic primitive. We use this primitive to commit our transactions. We implemented our B-tree and show that it performs comparably to an existing open-source B-tree and that it scales to hundreds of machines. We believe that our approach is general and can be used to implement other distributed data structures easily.
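The idea of replacing locking protocols with transactions on B-tree nodes can be illustrated with a small sketch. Everything here is an assumption made for illustration: `Store`, `commit`, and `insert_key` are hypothetical names, the store is a single-process stand-in rather than Sinfonia's actual interface, and node splits are ignored. It only shows the optimistic pattern of reading a node, preparing a change, and committing it atomically if the node's version is unchanged.

```python
class Store:
    """Hypothetical in-memory stand-in for a shared data service."""

    def __init__(self):
        self.data = {}  # address -> (version, value)

    def read(self, addr):
        # Unknown addresses read as version 0 with no value.
        return self.data.get(addr, (0, None))

    def commit(self, compares, writes):
        """Apply all writes only if every compared address still has
        the expected version; otherwise change nothing."""
        for addr, expected_version in compares.items():
            if self.read(addr)[0] != expected_version:
                return False
        for addr, value in writes.items():
            version, _ = self.read(addr)
            self.data[addr] = (version + 1, value)
        return True


def insert_key(store, leaf_addr, key, max_retries=10):
    """Insert a key into a leaf node, retrying if another writer got in first."""
    for _ in range(max_retries):
        version, keys = store.read(leaf_addr)
        new_keys = sorted(set(keys or []) | {key})
        if store.commit({leaf_addr: version}, {leaf_addr: new_keys}):
            return True
    return False
```

A writer holding a stale version simply fails its commit and retries, so no lock is held across the read and the write.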

Replication control in distributed B-trees

B-trees are a commonly used data structure to associate symbols with related information, as in a symbol table or file index. The behavior and performance of B-tree algorithms are well understood for sequential processing and even concurrent processing on small-scale shared-memory multiprocessors. Few algorithms, however, have been proposed or carefully studied for the implementation of concurrent B-trees on networks of message-passing multicomputers. The distribution of memory across the several processors of such networks creates a challenge for building an efficient B-tree that does not exist when all memory is centralized: distributing the pieces of the B-tree data structure. In this work we explore the use and control of replication of parts of a distributed data structure to create efficient distributed B-trees. Prior work has shown that replicating parts of the B-tree structure on more than one processor does increase throughput. But while the one original copy of each tree node may be too few, copying the entire B-tree wastes space and requires work to keep the copies consistent. In this work we develop answers to questions not faced by the centralized shared-memory model: which B-tree nodes should be copied, and how many copies of each node should be made. The answer for a particular tree can change over time. We explore the characteristics of optimal replication for a tree given a static pattern of accesses and techniques for dynamically creating near-optimal replication from observed access patterns. Our work makes three significant extensions to prior knowledge. It introduces a queueing network model using Mean Value Analysis of distributed B-trees to describe the tradeoff between replication and performance. It develops, through analysis and simulation, rules for the use of replication that maximize performance for a fixed amount of space, updating the intuitive rules of prior work. It presents a description and analysis of an algorithm for dynamic control of replication in response to changing access patterns.

Abstract unordered and ordered trees CRDT

Trees are a fundamental data structure in many areas of computer science and system engineering. In this report, we show how to ensure eventual consistency of optimistically replicated trees. In optimistic replication, the different replicas of a distributed system are allowed to diverge but should eventually reach the same value if no more mutations occur. A new method to ensure eventual consistency is to design Conflict-free Replicated Data Types (CRDTs). In this report, we design a collection of tree CRDTs using existing set CRDTs. The remaining concurrency problems particular to the tree data structure are resolved using one or two layers of correction algorithms. For each of these layers, we propose different and independent policies. Any combination of set CRDT and policies can be constructed, giving the distributed application programmer full control over the behavior of the shared data in the face of concurrent mutations. We also propose to order these trees by adding a positionin...
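The layering described in this abstract (a set CRDT underneath, a correction policy on top) can be sketched as follows. This is a minimal illustration under stated assumptions, not the report's construction: the set CRDT is a two-phase set of (node, parent) pairs, each node is assumed to be added once with a single parent, and the two policy names, "drop" and "reattach", are hypothetical labels chosen here.

```python
from dataclasses import dataclass, field

ROOT = "root"


@dataclass
class TwoPhaseSet:
    """Set CRDT: an element can be added, then removed, but never re-added."""
    added: set = field(default_factory=set)
    removed: set = field(default_factory=set)

    def add(self, item):
        self.added.add(item)

    def remove(self, item):
        if item in self.added:
            self.removed.add(item)

    def merge(self, other):
        # Union of both components is commutative, associative, idempotent.
        return TwoPhaseSet(self.added | other.added,
                           self.removed | other.removed)

    def value(self):
        return self.added - self.removed


def view(edges, policy="drop"):
    """Correction layer: turn a set of (node, parent) pairs into a tree.

    'drop' hides every node whose ancestor chain no longer reaches ROOT;
    'reattach' moves nodes whose parent is missing directly under ROOT.
    """
    parent_of = dict(sorted(edges))
    if policy == "reattach":
        for node, parent in list(parent_of.items()):
            if parent != ROOT and parent not in parent_of:
                parent_of[node] = ROOT

    def reachable(node):
        seen = set()
        while node != ROOT:
            if node in seen or node not in parent_of:
                return False
            seen.add(node)
            node = parent_of[node]
        return True

    return {n: p for n, p in parent_of.items() if reachable(n)}
```

If one replica removes a node while another concurrently adds children under it, the merged set contains orphaned pairs. The same merged state then yields an empty tree under "drop" and a tree with the orphan attached to the root under "reattach", which is the sense in which the policy is independent of the underlying set.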

Fault tolerance and storage reduction in binary search trees

Information and Control, 1984

We introduce a method of representing a broad class of binary search trees in an environment in which pointers and other structural information may be "lost" or "maliciously altered." The fault-tolerant representation permits any 2 field changes to be detected and any 1 to be corrected without significantly increasing the storage requirements of the binary tree. The detection and correction procedures applied to the entire tree require O(n) time. This discipline is also used to represent binary search trees with a single pointer per datum without altering the cost of searching or updating, if applied in conjunction with any underlying tree balancing scheme (bounded balance, etc.). If no balancing scheme is employed, the trees we form will have significantly shorter search paths than those formed using the straightforward algorithm.

Fully Dynamic Balanced and Distributed Search Trees with Logarithmic Costs

1999

In this paper we consider the dictionary problem in a message-passing distributed environment. We introduce a new version of an order-preserving distributed search tree, called BDST for Balanced and Distributed Search Tree, capable of both growing and shrinking as keys are inserted and deleted. This is the first distributed data structure to explicitly support both insertion and deletion with logarithmic costs, i.e., a key can be searched, inserted, and deleted in O(log n) messages, where n is the number of servers.

Available and stabilizing 2-3 trees

2000

Transient faults corrupt the content and organization of data structures. A recovery technique dealing with such faults is stabilization, which guarantees, following some number of operations on the data structure, that the content of the data structure is legitimate. Another notion of fault tolerance is availability, which is the property that operations continue to be applied during the period of recovery after a fault, and successful updates are not lost while the data structure stabilizes to a legitimate state. The available, stabilizing 2-3 tree supports find, insert, and delete operations, each with O(lg n) complexity when the tree's state is legitimate and contains n items. For an illegitimate state, these operations have O(lg K) complexity, where K is the maximum capacity of the tree. Within O(t) operations, the state of the tree is guaranteed to be legitimate, where t is the number of nodes accessible via some path from the tree's root at the initial state. This paper resolves, for the first time, issues of dynamic allocation and pointer organization in a stabilizing data structure.
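To make the notion of a "legitimate" state concrete, the sketch below checks the textbook 2-3 tree invariants on an in-memory tree. It is an illustration under assumptions, not the paper's legitimacy predicate: keys are stored in every node, and the names `Node`, `legit_height`, and `is_legitimate` are hypothetical. It detects a corrupted state but performs no repair.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    keys: List[int]
    children: List["Node"] = field(default_factory=list)


def legit_height(node, lo=None, hi=None, seen=None):
    """Return the subtree height if all invariants hold, else None."""
    seen = set() if seen is None else seen
    if id(node) in seen:  # shared or cyclic pointer: not a tree
        return None
    seen.add(id(node))

    keys = node.keys
    if not 1 <= len(keys) <= 2:
        return None
    if any(a >= b for a, b in zip(keys, keys[1:])):
        return None  # keys must be strictly increasing
    if lo is not None and keys[0] <= lo:
        return None  # key falls below the range allowed by ancestors
    if hi is not None and keys[-1] >= hi:
        return None  # key falls above the range allowed by ancestors
    if not node.children:
        return 0  # leaf
    if len(node.children) != len(keys) + 1:
        return None  # an internal node has 2 or 3 children

    bounds = [lo, *keys, hi]
    heights = set()
    for child, c_lo, c_hi in zip(node.children, bounds, bounds[1:]):
        h = legit_height(child, c_lo, c_hi, seen)
        if h is None:
            return None
        heights.add(h)
    if len(heights) != 1:
        return None  # all leaves must sit at the same depth
    return heights.pop() + 1


def is_legitimate(root):
    return legit_height(root) is not None
```

A single altered key or pointer is enough to make the check fail, for example a key placed in the wrong subtree or a child pointer that loops back to an ancestor.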

A Repair Mechanism for Fault-Tolerance for Tree-Structured Peer-to-Peer Systems

As traditional resource management tools for computational grids reach their limits (related to the scale, dynamicity, etc. of the newly considered platforms), new approaches based on peer-to-peer technologies are emerging. Resource discovery, and in particular service discovery, is affected by this evolution. Among the solutions, a promising one is the indexing of resources using trie structures, and more particularly prefix trees. The major advantage of trie-structured approaches is the ability to support search queries on ranges of values with a latency growing logarithmically in the number of nodes in the trie. Those techniques are easy to extend to multicriteria searches. One drawback of tries is their inherently poor robustness in a dynamic environment, where nodes join and leave the network, leading to the split of the tree into a forest, which makes it impossible to route requests. In most recent approaches, fault tolerance is a prevention mechanism, often replication-based. Replication can be costly in terms of the resources required. In this paper, we propose a fault-tolerance protocol that reconnects subtrees a posteriori, after crashes, to restore a connected graph, and then reorders the nodes to rebuild a consistent tree.

Snap-Stabilizing Prefix Tree for Peer-to-Peer Systems

Parallel Processing Letters, 2010

Resource discovery is a crucial issue in the deployment of computational grids over large-scale peer-to-peer platforms. Because they efficiently allow range queries, tries (a.k.a. prefix trees) appear to be among the promising options for designing distributed data structures that index resources. Self-stabilization is an efficient approach to the design of reliable solutions for dynamic systems. A snap-stabilizing algorithm guarantees that it always behaves according to its specification. In other words, a snap-stabilizing algorithm is also a self-stabilizing algorithm which stabilizes in 0 steps.
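The reason tries suit range queries can be seen in a small centralized sketch: whole subtrees whose prefix lies outside the requested range can be skipped. This is an illustration only, with hypothetical names (`Trie`, `range_query`) and made-up resource keys; it is a plain in-memory trie and does not model the distributed or snap-stabilizing aspects of the paper.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # next character -> TrieNode
        self.is_key = False  # True if a stored key ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, key):
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.is_key = True

    def range_query(self, lo, hi):
        """Return stored keys k with lo <= k <= hi, in lexicographic order."""
        out = []

        def visit(node, prefix):
            # Every key below this node starts with `prefix`.
            # If prefix > hi, all of them exceed hi.
            # If prefix < lo[:len(prefix)], all of them precede lo.
            if prefix > hi or prefix < lo[:len(prefix)]:
                return
            if node.is_key and lo <= prefix <= hi:
                out.append(prefix)
            for ch in sorted(node.children):
                visit(node.children[ch], prefix + ch)

        visit(self.root, "")
        return out
```

For example, after inserting "cpu", "cpu-arm", "cpu-x86", "disk", "gpu", and "ram", the call `range_query("cpu-b", "gz")` visits only the branches that can contain matching keys and never descends below "cpu-a" or "r".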