On Optimal Replication Group Splits in P2P Data Stores Based on the Hypercube

Policies for Efficient Data Replication in P2P Systems

2013 International Conference on Parallel and Distributed Systems, 2013

This paper addresses the problem of maintaining replicated data in large-scale P2P systems. Although this topic has been extensively studied in the literature, maintaining replicated data efficiently in this setting remains a significant challenge. This paper proposes novel policies to address this problem and evaluates their performance against different criteria, such as monitoring costs, data transfer costs, and load imbalance costs. We show that one of these new policies significantly outperforms previous work. Interestingly, this policy is based on a somewhat counter-intuitive approach that uses less reliable nodes to store the most accessed data items. The insights behind this policy were obtained from an in-depth analysis of existing solutions, which is also presented in the paper.

An Efficient Replicated Data Management Approach for Peer-to-Peer Systems

Lecture Notes in Computer Science, 2005

The availability of critical services and their data can be significantly increased, even in the face of system and network failures, by replicating them on multiple interconnected systems. In some platforms, such as peer-to-peer (P2P) systems, the inherent characteristics of the platform mandate some form of replication to provide acceptable service to users. However, how best to replicate data to build highly available peer-to-peer systems is still an open problem. In this paper, we propose an approach to address the data replication problem in P2P systems. The proposed scheme is compared with other techniques and is shown to require lower communication cost per operation as well as to provide a higher degree of data availability.

Performance of Lookup Operations in a Hypercube-based P2P Data Store: Theoretical Model and Performance Evaluation

2005

One way for peer-to-peer data stores to achieve high data availability is to replicate their data. This is necessary to counter the effects of peer population dynamics, also known as churn. A consequence of churn is that locating a data item may require a peer to resend search messages, thus introducing additional communication. This paper introduces a formal model of the communication involved in data item lookups and evaluates it using simulation. The results hold for hypercube-based P2P data stores.

Replica Placement Algorithm for Highly Available Peer-to-Peer Storage Systems

2009

Peer-to-peer (P2P) technology is an emerging approach to overcoming the limitations of the traditional client-server architecture. However, building a highly available P2P system, particularly a P2P storage system, is quite challenging. The reason lies in the fundamental nature of P2P systems: peers can join and leave at any time without notice. Replication is one of the strategies for coping with the unpredictable behavior of peers. A good replication algorithm should use the minimum number of replicas to provide the desired availability of data. The popular approach in previous studies is random placement of replicas, but it ignores the wide differences in availability among peers. In this paper, we develop a replica placement algorithm which exploits the availability pattern of each individual peer. By comparing our algorithm with a random placement scheme, we show that our algorithm dramatically improves data availability with moderate overhead in terms of memory consumption and processing time in both ideal and practical conditions.
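
The idea of using the minimum number of replicas for a desired availability can be illustrated with a small sketch. This is not the algorithm from the paper above. It is a simple greedy heuristic under two assumptions that are ours: peers fail independently, and each peer has a single known availability value. The function names `group_availability` and `replicas_needed` are hypothetical.

```python
from math import prod


def group_availability(peer_availabilities):
    """Probability that at least one replica holder is online.

    Assumes peers are online independently of each other (an assumption
    made for illustration, not taken from the paper).
    """
    return 1.0 - prod(1.0 - a for a in peer_availabilities)


def replicas_needed(candidates, target):
    """Greedily pick the most available peers until the target is met.

    Returns the list of chosen availabilities, or None if even all
    candidates together cannot reach the target.
    """
    chosen = []
    for a in sorted(candidates, reverse=True):
        chosen.append(a)
        if group_availability(chosen) >= target:
            return chosen
    return None
```

For example, `replicas_needed([0.9, 0.5, 0.3, 0.2], 0.94)` selects the two most available peers, whereas choosing peers at random from the same pool may need more replicas to reach the same target.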

Storage allocation in unreliable peer-to-peer systems

Proceedings of the International Conference on Dependable Systems and Networks, 2006

Peer-to-peer systems provide the opportunity to pool large amounts of distributed resources to enable internet-scale applications. However, the participant nodes are highly dynamic and unreliable. Thus, any shared resource such as file objects must incorporate redundancy to be useful. While many studies have proposed heuristics to determine redundancy levels based on object popularity, there has been little work in determining optimal or near-optimal resource allocation based on node reliability. In this paper, we present a strategy for the allocation of objects in the presence of dynamic and unreliable peers. We have built an availability model of peer-to-peer storage systems based on the bimodal and time-dependent availability characteristics of a P2P node. Using this model, we can select the size of a candidate node set for storage allocation and assign storage objects to maximize availability while still maintaining a balanced distribution of objects.

Availability in global peer-to-peer storage systems

2004

Peer-to-peer file sharing applications have become increasingly popular. Measurements of P2P systems indicate large heterogeneity in the availability of individual nodes. Many have cyclic behavior, whereas others are always available. This paper proposes a cooperative storage technique which employs erasure coding schemes on a collection of data objects and provides various levels of data redundancy. Based on this technique, we study a history-based hill climbing scheme that takes advantage of varied time zones in a global P2P system. Our simulation results show that this scheme improves data availability. We also investigate several climbing strategies, including the choice of coding schemes and the laziness of data movement.
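
To make the effect of erasure coding on availability concrete, the following sketch computes the availability of an object split into `n` fragments of which any `m` suffice to reconstruct it. It uses the standard binomial formula and assumes, for illustration only, that every fragment holder is online independently with the same probability `p`. The paper above studies heterogeneous, cyclic node behavior, which this simplified model does not capture. The function name `erasure_availability` is hypothetical.

```python
from math import comb


def erasure_availability(n, m, p):
    """Probability that at least m of n fragments are reachable.

    n: total number of fragments stored (one per peer)
    m: number of fragments needed to reconstruct the object
    p: availability of each peer, assumed identical and independent
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(m, n + 1))
```

With `m = 1` this reduces to plain replication: `erasure_availability(2, 1, 0.5)` gives 0.75, the same as two full replicas on peers that are each online half the time.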

Redundancy Management for P2P Storage

Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07), 2007

P2P storage systems must protect data against temporary unavailability and the effects of churn in order to become platforms for safe storage. This paper evaluates and compares redundancy techniques for P2P storage according to availability, accessibility, and maintainability using traces and theoretical results.