Evaluation of P2P resource discovery architectures using real-life multi-attribute resource and query characteristics

Characteristics of Multi-Attribute Resources/Queries and Implications on P2P Resource Discovery

2011

Though resource discovery is a fundamental requirement in collaborative peer-to-peer, grid, and cloud computing, very little is known about resource/query characteristics and their impact on resource discovery. Fundamental design choices for distributed resource advertising and querying are evaluated in the context of existing practical systems. First, a generic model for the cost of resource discovery is presented. Second, multi-attribute resource and query characteristics from PlanetLab and SETI@home are presented. We observe that the attributes of both resources and queries are highly skewed and correlated, that queries are less specific, and that the Generalized Pareto distribution is suitable for capturing the distribution of most dynamic attributes and their rate of change. Based on these observations, different design choices are evaluated for resource discovery in terms of their cost of advertising/querying, latency, load balancing, and routing table size. The findings indicate that superpeer-based architectures have the potential to support large-scale resource aggregation as they simultaneously balance the cost and load.
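As a minimal sketch of how the Generalized Pareto distribution mentioned above could be used to generate synthetic attribute values, the following Python draws samples by inverse-transform sampling. The function names and every parameter value are illustrative assumptions; the abstract does not report fitted shape, scale, or location parameters.

```python
import math
import random


def gpd_quantile(u, shape, scale, loc=0.0):
    """Inverse CDF of the Generalized Pareto distribution for 0 <= u < 1."""
    if not 0.0 <= u < 1.0:
        raise ValueError("u must be in [0, 1)")
    if scale <= 0.0:
        raise ValueError("scale must be positive")
    if shape == 0.0:
        # The shape -> 0 limit is a shifted exponential distribution.
        return loc - scale * math.log(1.0 - u)
    return loc + scale * ((1.0 - u) ** (-shape) - 1.0) / shape


def sample_gpd(n, shape, scale, loc=0.0, seed=None):
    """Draw n values by feeding uniform random numbers through the inverse CDF."""
    rng = random.Random(seed)
    return [gpd_quantile(rng.random(), shape, scale, loc) for _ in range(n)]


# Hypothetical parameters, chosen only to show a heavy-tailed attribute.
values = sample_gpd(1000, shape=0.5, scale=2.0, seed=1)
```

A positive shape parameter gives a heavy tail, which is one way to reproduce the kind of skew the abstract describes.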

Resource and Query Aware, Multi-Attribute Resource Discovery for P2P Systems

Distributed, multi-attribute Resource Discovery (RD) is a fundamental requirement in collaborative Peer-to-Peer (P2P), grid, and cloud computing. We present an efficient and load-balanced, P2P-based multi-attribute RD solution that consists of five heuristics, which can be executed independently and in a distributed manner. The first heuristic maintains a minimum number of nodes in a ring-like overlay while pruning nodes that do not significantly contribute to range query resolution. Removing nonproductive nodes reduces the cost (e.g., hops and latency) of advertising resources and resolving queries. The second and third heuristics dynamically balance the key and query load distribution by transferring some of a node's keys to its predecessor/successor and by adding new predecessors/successors to handle transferred keys when existing nodes are insufficient, respectively. The last two heuristics form cliques of nodes (placed orthogonal to the overlay ring) to dynamically balance the highly skewed key and query loads. By applying these heuristics in the presented order, an RD solution that better responds to real-world resource and query characteristics is developed. Its efficacy is demonstrated using a simulation-based analysis under a variety of single- and multi-attribute resource and query distributions derived from real workloads.
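The key-transfer idea behind the second heuristic can be illustrated with a small sketch. This is not the authors' algorithm: the threshold, the choice of the lighter neighbor, and the "move at most half the difference" rule are all assumptions made for illustration, and the sketch omits adding new nodes and forming cliques.

```python
def rebalance_once(ring, threshold):
    """One pass over a ring of nodes, each holding its keys in ring order.

    ring[i] is the key list of node i; node (i + 1) % n is its successor.
    An overloaded node hands its lowest keys to its predecessor or its
    highest keys to its successor, whichever neighbor is lighter.
    Returns the number of keys moved.
    """
    n = len(ring)
    moved = 0
    for i in range(n):
        surplus = len(ring[i]) - threshold
        if surplus <= 0:
            continue
        pred, succ = (i - 1) % n, (i + 1) % n
        target = pred if len(ring[pred]) <= len(ring[succ]) else succ
        # Assumed rule: never move more than half the load difference.
        count = min(surplus, (len(ring[i]) - len(ring[target])) // 2)
        if count <= 0:
            continue
        if target == pred:
            ring[pred].extend(ring[i][:count])   # lowest keys go backward
            del ring[i][:count]
        else:
            ring[succ][:0] = ring[i][-count:]    # highest keys go forward
            del ring[i][-count:]
        moved += count
    return moved


ring = [[1, 2], [3, 4, 5, 6, 7, 8, 9, 10], [11, 12]]
moved = rebalance_once(ring, threshold=4)
```

Because keys only move between adjacent nodes, each node still owns a contiguous arc of the key space, which is what keeps range queries resolvable along the ring.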

P2P-Based, Multi-Attribute Resource Discovery under Real-World Resources and Queries

ACM Transactions on Internet Technology, 2015

Peer-to-Peer (P2P), grid, and cloud computing rely on Resource Discovery (RD) solutions to aggregate groups of multi-attribute, dynamic, and distributed resources. However, specific characteristics of real-world resources and queries, and their impact on P2P-based RD, are largely unknown. We analyze the characteristics of resources and queries using data from four real-world systems. Those characteristics are then used to qualitatively and quantitatively evaluate the fundamental design choices for P2P-based multi-attribute RD. The datasets exhibit several noteworthy features that affect performance. For example, compared to uniform queries, real-world queries are relatively easier to resolve using unstructured, superpeer, and single-attribute-dominated-query-based structured P2P solutions, as queries mostly specify only a small subset of the available attributes and large ranges of attribute values. However, all the solutions are prone to significant load balancing issues, as the resources and queries are highly skewed and correlated. The implications of our findings for improving RD solutions are also discussed.
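To make the notion of a multi-attribute range query concrete, here is a minimal sketch of matching a query that names only a subset of attributes against a resource description. The attribute names and values are hypothetical and are not taken from the datasets analyzed in the paper.

```python
def matches(resource, query):
    """Return True if the resource satisfies every range in the query.

    resource: dict mapping attribute name -> numeric value.
    query: dict mapping attribute name -> (low, high), inclusive.
    Attributes absent from the query are unconstrained.
    """
    return all(
        attr in resource and low <= resource[attr] <= high
        for attr, (low, high) in query.items()
    )


# Hypothetical resources with three attributes each.
resources = [
    {"cpu_ghz": 2.4, "mem_gb": 8, "free_disk_gb": 120},
    {"cpu_ghz": 3.2, "mem_gb": 32, "free_disk_gb": 40},
]
# The query constrains one attribute with a wide range.
query = {"mem_gb": (4, 64)}
hits = [r for r in resources if matches(r, query)]
```

A query like this one, with few attributes and wide ranges, matches many resources, which is consistent with the abstract's point that such queries are comparatively easy to resolve.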

GChord: indexing for multi-attribute query in P2P system with low maintenance cost

2007

Providing complex query processing in peer-to-peer systems has attracted much attention in both the academic and industrial communities. We present GChord, a scalable technique for evaluating multi-attribute queries. GChord handles both exact-match and range queries. Its advantage over existing methods is that each tuple needs to be indexed only once, while query efficiency is still guaranteed. Thus, index maintenance cost and search efficiency are balanced.

Enabling flexible queries with guarantees in P2P systems

2004

The Squid peer-to-peer information discovery system supports flexible queries using partial keywords, wildcards, and ranges. It is built on a structured overlay and uses data lookup protocols to guarantee that all existing data elements that match a query are found efficiently. Its main innovation is a dimension-reducing indexing scheme that effectively maps multidimensional information space to physical peers.
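The general idea of a dimension-reducing index can be sketched as follows. The abstract does not name the specific scheme, so this example uses a Z-order (Morton) curve as a simple stand-in, together with a successor lookup on a ring of peer identifiers. The function names, bit widths, and peer identifiers are all assumptions for illustration.

```python
import bisect


def morton_index(coords, bits):
    """Interleave the low `bits` bits of each coordinate into one integer."""
    index = 0
    dims = len(coords)
    for bit in range(bits):
        for d, c in enumerate(coords):
            index |= ((c >> bit) & 1) << (bit * dims + d)
    return index


def responsible_peer(index, peer_ids):
    """Return the first peer id >= index, wrapping around the ring.

    peer_ids must be sorted in ascending order.
    """
    pos = bisect.bisect_left(peer_ids, index)
    return peer_ids[pos % len(peer_ids)]


# A 2-D point with 3 bits per dimension maps to a 6-bit index.
idx = morton_index((3, 5), bits=3)
peer = responsible_peer(idx, [16, 32, 48])
```

Points that are close in the multidimensional space tend to receive nearby one-dimensional indices, so a range or wildcard query translates into a limited set of index intervals on the overlay.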

Survey of Various Search Mechanisms in Unstructured Peer-to-Peer Networks

International Journal of Computer Applications, 2013

Peer-to-Peer (P2P) networks [1] are widely used for file sharing. This usage provides a decentralized alternative to complex centralized architectures. Peer-to-Peer networks are gaining attention from both the scientific community and the wider Internet community. Popular applications built on this technology offer many attractive features to a growing number of users. P2P is an architecture for an altogether different class of applications, which use distributed resources to perform a critical function in a decentralized manner. The popularity and bandwidth consumption of current Peer-to-Peer file-sharing applications make the operation of these distributed systems very important for the Internet community. Efficiently discovering the queried resource is the first and most important step in establishing efficient peer-to-peer communication. Here, we describe and analyze the performance of some existing search mechanisms deployed for peer discovery and content lookup.

Dynamic querying in structured peer-to-peer networks

2008

Dynamic Querying (DQ) is a technique adopted in unstructured Peer-to-Peer (P2P) networks to minimize the number of peers that must be visited to obtain the desired number of results. In this paper we introduce the use of the DQ technique in structured P2P networks. In particular, we present a P2P search algorithm, named DQ-DHT (Dynamic Querying over a Distributed Hash Table), to perform DQ-like searches over DHT-based overlays.
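The core loop of a dynamic query can be sketched as below: probe a few peers, estimate how common the results are, then contact only as many additional peers as that estimate suggests. This is a simplified illustration of the general idea, not the DQ-DHT algorithm; the function name, the probe size, and the doubling rule used when nothing has been found are assumptions.

```python
import math


def dynamic_query(peers, count_matches, desired, probe_size=2):
    """Visit peers in batches until `desired` results are found.

    peers: list of peer ids in visiting order.
    count_matches: callable returning the number of results a peer holds.
    Returns (peers_visited, results_found).
    """
    visited, found = 0, 0
    batch = probe_size
    while visited < len(peers) and found < desired:
        chunk = peers[visited:visited + batch]
        found += sum(count_matches(p) for p in chunk)
        visited += len(chunk)
        if found >= desired:
            break
        rate = found / visited  # estimated results per peer so far
        if rate > 0:
            batch = math.ceil((desired - found) / rate)
        else:
            batch *= 2  # nothing found yet: widen the search
    return visited, found


peers = list(range(20))
# Hypothetical popularity: every even-numbered peer holds one result.
result = dynamic_query(peers, lambda p: 1 if p % 2 == 0 else 0, desired=3)
```

Popular items end the search after a handful of peers, while rare items cause the batch size to grow until the overlay is exhausted.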

Comparative Study of Peer-to-Peer Architectures for Scalable Resource Discovery

2009 First International Conference on Advances in P2P Systems, 2009

Resource discovery is an important aspect of many modern large-scale distributed systems. In the past, this problem has been solved using many different approaches, such as a central registry server, flooding-based protocols, and distributed hash tables. In this paper, these three widely used architectures are compared, using measurement results obtained from real implementations run on an Emulab emulation environment. This allows us to study the advantages and disadvantages of the architectures and determine their usefulness.

Query Routing and Processing in Peer-To-Peer Data Sharing Systems

2010

Sharing music files via the Internet was the essential motivation of early P2P systems. Despite the great success of P2P file sharing systems, these systems support only "simple" queries. The focus in such systems is on efficient query routing to find the nodes storing a desired file. Recently, several research efforts have extended P2P systems to share data at a fine granularity (i.e., atomic attributes) and to process queries written in a highly expressive language (e.g., SQL). These works have led to the emergence of P2P data sharing systems, which represent, on the one hand, a new generation of P2P systems and, on the other hand, a next stage in the long history of database research. The characteristics of P2P systems (e.g., large scale, node autonomy, and instability) make it impractical to have a global catalog, which is often an essential component in traditional database systems. Usually, such a catalog stores informat...