Storage allocation in unreliable peer-to-peer systems
Related papers
Replica Placement Algorithm for Highly Available Peer-to-Peer Storage Systems
2009
Peer-to-peer (P2P) technology is an emerging approach to overcoming the limitations of the traditional client-server architecture. However, building a highly available P2P system, and a P2P storage system in particular, is challenging. The difficulty stems from the fundamental nature of P2P systems: peers can join and leave at any time without notice. Replication is one strategy for coping with this unpredictable behavior. A good replication algorithm should use the minimum number of replicas needed to provide the desired data availability. Previous studies commonly place replicas at random, which ignores the wide differences in availability among peers. In this paper, we develop a replica placement algorithm that exploits the availability pattern of each individual peer. By comparing our algorithm with a random placement scheme, we show that it dramatically improves data availability with moderate overhead in memory consumption and processing time, under both ideal and practical conditions.
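The gap between random and availability-aware placement can be illustrated with a small sketch. This is not the paper's algorithm, which exploits per-peer availability patterns over time. It is a simplified stand-in that assumes each peer is online independently with a fixed probability, and it greedily places replicas on the most available peers first. The peer availabilities, the target, and all function names are hypothetical.

```python
import random


def data_availability(peer_avail):
    """P(at least one replica is online), assuming independent peers."""
    p_all_down = 1.0
    for p in peer_avail:
        p_all_down *= (1.0 - p)
    return 1.0 - p_all_down


def replicas_needed(peers_in_order, target):
    """Place replicas in the given order until the target is met.

    Returns the number of replicas used, or None if the target is
    never reached even after using every peer.
    """
    placed = []
    for p in peers_in_order:
        placed.append(p)
        if data_availability(placed) >= target:
            return len(placed)
    return None


# Hypothetical per-peer availabilities (fraction of time online).
peers = [0.9, 0.2, 0.5, 0.95, 0.1, 0.3, 0.8, 0.15]
target = 0.99

# Availability-aware (greedy): most available peers first.
greedy_count = replicas_needed(sorted(peers, reverse=True), target)

# Random placement: peers in a shuffled order (seeded for repeatability).
shuffled = peers[:]
random.Random(0).shuffle(shuffled)
random_count = replicas_needed(shuffled, target)

print("greedy replicas:", greedy_count)
print("random replicas:", random_count)
```

Under the independence assumption, the greedy order never needs more replicas than any other order, because the k most available peers maximize availability among all sets of size k.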
Redundancy Management for P2P Storage
Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07), 2007
P2P storage systems must protect data against temporary unavailability and the effects of churn in order to become platforms for safe storage. This paper evaluates and compares redundancy techniques for P2P storage according to availability, accessibility, and maintainability using traces and theoretical results.
Replica placement algorithm based on peer availability for p2p storage systems
Peer-to-peer (P2P) technology is an emerging approach to overcoming the limitations of the traditional client-server architecture. However, building a highly available P2P system, and a P2P storage system in particular, is challenging. The difficulty stems from the fundamental nature of P2P systems: peers can join and leave at any time without notice. Replication is one strategy for coping with this unpredictable behavior. A good replication algorithm should use the minimum number of replicas needed to provide the desired data availability. Previous studies commonly place replicas at random, which ignores the wide differences in availability among peers. In this paper, we propose PAT (Peer Availability in order to analyze and predict the state of nodes and develop a replica placement algorithm that exploits the availability pattern of each individual peer. By comparing our algorithm with a random placement scheme, we show that it dramatically improves data availability with moderate overhead in memory consumption and processing time, under both ideal and practical conditions. Additionally, we demonstrate the application of PAT as an analysis tool for various P2P systems.
On the interplay between data redundancy and retrieval times in P2P storage systems
Computer Networks, 2014
Peer-to-peer (P2P) storage systems aggregate spare storage resources from end users to build a large collaborative online storage solution. In these systems, however, high levels of user churn (peers failing or leaving temporarily or permanently) affect the quality of the storage service and may put data reliability at risk. Indeed, one of the main challenges of P2P storage systems has traditionally been how to guarantee that stored data can always be retrieved within some time frame. To meet this challenge, existing systems store objects with large amounts of data redundancy, yielding data availability close to 100%, which in turn ensures optimal retrieval times (constrained only by network limits). Unfortunately, this redundancy reduces the overall net capacity of the system and increases data maintenance costs. To alleviate these problems, data redundancy can be reduced at the expense of longer retrieval times. The problem is that neither the rewards nor the disadvantages of doing so are well understood. In this paper we present a novel analytical framework that allows us to model retrieval times in P2P storage systems and to describe the interplay between data redundancy and retrieval times for different churn patterns. Using availability traces from real P2P applications, we show that our framework provides accurate estimates of retrieval times in realistic environments.
Availability in global peer-to-peer storage systems
2004
Peer-to-peer file sharing applications have become increasingly popular. Measurements of P2P systems indicate large heterogeneity in the availability of individual nodes. Many have cyclic behavior, whereas others are always available. This paper proposes a cooperative storage technique which employs erasure coding schemes on a collection of data objects and provides various levels of data redundancy. Based on this technique, we study a history-based hill climbing scheme that takes advantage of varied time zones in a global P2P system. Our simulation results show that this scheme improves data availability. We also investigate several climbing strategies, including the choice of coding schemes and the laziness of data movement.
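As a rough illustration of how erasure coding trades redundancy for availability, the sketch below computes the probability that an object can be reconstructed when it is split into n fragments of which any m suffice. It assumes every fragment sits on a distinct peer that is online independently with the same probability p. That is a simplification: the abstract stresses that real peers are heterogeneous and often cyclic. The function names and the example parameters are hypothetical.

```python
from math import comb


def erasure_availability(n, m, p):
    """P(at least m of n fragments are online), independent peers with availability p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(m, n + 1))


def replication_availability(r, p):
    """P(at least one of r full replicas is online)."""
    return 1 - (1 - p) ** r


# Same 4x storage overhead, two ways of spending it (hypothetical numbers).
p = 0.5
print("4 replicas:      ", replication_availability(4, p))
print("(32, 8) erasure: ", erasure_availability(32, 8, p))
```

Replication is the special case m = 1, so `erasure_availability(r, 1, p)` matches `replication_availability(r, p)` up to floating-point error.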
An Adaptive Replication Algorithm in P2P File Systems with Unreliable Nodes
PDPTA, 2009
The paper focuses on distributed file systems in P2P networks. We introduce a novel file replication scheme which is adaptive, reacting to changes in the patterns of access to the file system by dynamically creating or deleting replicas. Replication is used to increase data availability in the presence of site or communication failures and to decrease retrieval costs through local access where possible. Our system is completely decentralized, and nodes can be added or removed dynamically. We also propose an overlay architecture for file search. This architecture is structured, but also based on random walk. Our system has a mobile agent which performs dynamic load balancing. This agent is event-driven and circulates in the network to find and "destroy" the least important files, thus limiting the proliferation of superfluous replicas. We have implemented our method at the TCP/IP socket level.
High Availability, Scalable Storage, Dynamic Peer Networks: Pick Two
2003
Peer-to-peer storage aims to build large-scale, reliable and available storage from many small-scale, unreliable, low-availability distributed hosts. Data redundancy is the key to any data guarantees. However, preserving redundancy in the face of highly dynamic membership is costly. We apply a simple resource usage model to measured behavior from the Gnutella file-sharing network to argue that large-scale cooperative storage is limited by likely dynamics and cross-system bandwidth, not by local disk space. We examine some bandwidth optimization strategies, such as delayed response to failures, admission control, and load-shifting, and find that they do not alter the basic problem. We conclude that when redundancy, data scale, and dynamics are all high, the needed cross-system bandwidth is unreasonable.
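The bandwidth argument can be made concrete with a back-of-the-envelope sketch. The model below is an assumption made for illustration and is not taken from the paper: each node stores a fixed amount of redundant data, nodes leave permanently after a mean membership lifetime, and every byte lost to a departure must be re-created elsewhere. Averaged over the system, each node then downloads its stored volume once per lifetime and uploads the same amount to repair others. The function name and the example figures are hypothetical.

```python
def repair_bandwidth_per_node(stored_bytes_per_node, mean_lifetime_s):
    """Average repair traffic per node in bytes/second (upload plus download).

    Assumed model: a departing node's data is fully re-created, so each
    node receives its stored volume once per lifetime and sends an equal
    volume to help repair other nodes' losses.
    """
    if mean_lifetime_s <= 0:
        raise ValueError("mean_lifetime_s must be positive")
    return 2.0 * stored_bytes_per_node / mean_lifetime_s


# Hypothetical example: 100 GB stored per node, nodes stay one day on average.
one_day_s = 24 * 60 * 60
bw = repair_bandwidth_per_node(100e9, one_day_s)
print(f"{bw / 1e6:.2f} MB/s of repair traffic per node")
```

In this toy model, halving the membership lifetime or doubling the stored data doubles the required bandwidth, which is the kind of scaling behind the conclusion that dynamics and bandwidth, rather than disk space, are the binding constraints.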