Dynamic querying in structured peer-to-peer networks

A local search mechanism for peer-to-peer networks

Proceedings of the eleventh international conference on Information and knowledge management - CIKM '02, 2002

One important problem in peer-to-peer (P2P) networks is searching and retrieving the correct information. However, existing searching mechanisms in pure peer-to-peer networks are inefficient due to the decentralized nature of such networks. We propose two mechanisms for information retrieval in pure peer-to-peer networks. The first, the modified Breadth-First-Search (BFS) mechanism, is an extension of the current Gnutella protocol, allows searching with keywords, and is designed to minimize the number of messages that are needed to search the network. The second, the Intelligent Search mechanism, uses the past behavior of the P2P network to further improve the scalability of the search procedure. In this algorithm, each peer autonomously decides which of its peers are most likely to answer a given query. The algorithm is entirely distributed, and therefore scales well with the size of the network. We implemented our mechanisms as middleware platforms. To show the advantages of our mechanisms we present experimental results using the middleware implementation.
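The abstract does not say how a peer scores its neighbors, so the following is only a minimal sketch of the general idea of ranking neighbors by past behavior. The `Peer` class, the `record_hit` and `pick_neighbors` methods, and the use of Jaccard similarity between keyword sets are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity of two keyword sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class Peer:
    """Hypothetical peer that remembers which neighbor answered which query."""

    def __init__(self, neighbors):
        self.neighbors = list(neighbors)
        # neighbor -> list of keyword sets that this neighbor answered before
        self.answered = defaultdict(list)

    def record_hit(self, neighbor, query):
        """Store a past query that `neighbor` answered successfully."""
        self.answered[neighbor].append(frozenset(w.lower() for w in query))

    def score(self, neighbor, query):
        """Assumed relevance: summed similarity to the neighbor's past hits."""
        q = frozenset(w.lower() for w in query)
        return sum(jaccard(q, past) for past in self.answered[neighbor])

    def pick_neighbors(self, query, k=2):
        """Forward only to the k highest-scoring neighbors instead of all."""
        ranked = sorted(self.neighbors,
                        key=lambda n: self.score(n, query),
                        reverse=True)
        return ranked[:k]


peer = Peer(["A", "B", "C"])
peer.record_hit("B", ["jazz", "mp3"])
peer.record_hit("C", ["linux", "iso"])
best = peer.pick_neighbors(["jazz", "live"], k=1)
```

Each peer decides locally from its own history, which is why no global index is needed in a scheme of this kind.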

Proof: A DHT-Based Peer-to-Peer Search Engine

2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2006 Main Conference Proceedings)(WI'06), 2006

In this paper we focus on building a large scale keyword search service over structured Peer-to-Peer (P2P) networks. Current state-of-the-art keyword search approaches for structured P2P systems are based on inverted list intersection. However, the biggest challenge in those approaches is that when the indices are distributed over peers, a simple query may cause a large amount of data to be transmitted over the network. We propose a new P2P keyword search scheme, called "Proof", to reduce network traffic for queries. The key idea is storing a content summary for each web page in the inverted list, so that a query can be processed by transmitting only a small set of candidate results. Our simulation results showed that, compared with previous DHT-based P2P systems, Proof can dramatically reduce network traffic and computation time. It provides 100% precision and 90.09% recall of search results, at an acceptable cost of storage overhead, even when the number of peers and documents increases continually.

Model of Complex Searching Over Structured P2P Overlay under Dynamic Environment

International Journal of Information and Electronics Engineering, 2012

This paper aims to present a model to achieve complex searching (e.g. wildcard searching, blind searching, full text searching, metadata searching, etc.) over a structured P2P overlay in a dynamic environment. The lack of complex searching over structured overlay networks and their poor resilience to peer dynamics make them unsuitable for implementing P2P systems and thus very unpopular. To implement complex searching in a structured overlay, the query must be processed locally by each node (peer). This requires the query to be broadcast efficiently to all nodes. However, broadcast-based algorithms tend to perform badly in highly dynamic environments and also create a large amount of maintenance traffic. This paper proposes a model that combines a broadcast-based algorithm with an efficient maintenance algorithm, which will reduce the maintenance traffic and increase the search efficiency.

Proof: A Novel DHT-Based Peer-to-Peer Search Engine

IEICE Transactions on Communications, 2007

In this paper we focus on building a large scale keyword search service over structured Peer-to-Peer (P2P) networks. Current state-of-the-art keyword search approaches for structured P2P systems are based on inverted list intersection. However, the biggest challenge in those approaches is that when the indices are distributed over peers, a simple query may cause a large amount of data to be transmitted over the network. We propose in this paper a new P2P keyword search scheme, called "Proof," which aims to reduce the network traffic generated during the intersection process. We applied three main ideas in Proof to reduce network traffic, including (1) using a sorted query flow, (2) storing content summaries in the inverted lists, and (3) setting a stop condition for the checking of content summaries. We also discuss the advantages and limitations of Proof, and conducted extensive experiments to evaluate the search performance and the quality of search results. Our simulation results showed that, compared with previous solutions, Proof can dramatically reduce network traffic while providing 100% precision and high recall of search results, at some additional storage overhead.
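The three ideas listed in this abstract can be illustrated with a small single-process sketch. Everything concrete here is an assumption, since the abstract gives no details: the summary is modelled as the first `summary_size` distinct terms of a page, "sorted query flow" is taken to mean starting from the shortest inverted list, and the stop condition is a hypothetical `max_results` cap. The function names `build_index` and `search` are likewise illustrative only.

```python
def build_index(docs, summary_size=5):
    """Build keyword -> [(doc_id, summary)] with a summary in every posting."""
    index = {}
    for doc_id, text in docs.items():
        # Distinct terms in first-seen order
        terms = list(dict.fromkeys(text.lower().split()))
        # Assumed summary: a truncated set of the page's terms
        summary = frozenset(terms[:summary_size])
        for term in terms:
            index.setdefault(term, []).append((doc_id, summary))
    return index


def search(index, query, max_results=10):
    """Answer a multi-keyword query from a single inverted list."""
    keywords = [k.lower() for k in query]
    if not keywords or any(k not in index for k in keywords):
        return []
    # (1) Assumed sorted query flow: begin with the shortest inverted list
    keywords.sort(key=lambda k: len(index[k]))
    first, rest = keywords[0], set(keywords[1:])
    results = []
    for doc_id, summary in index[first]:
        # (2) Check the other keywords against the locally stored summary,
        #     so the other inverted lists never have to be transferred
        if rest <= summary:
            results.append(doc_id)
            # (3) Assumed stop condition: enough candidates collected
            if len(results) >= max_results:
                break
    return results


docs = {
    "a": "peer to peer search engine",
    "b": "peer search",
    "c": "web search engine index",
}
index = build_index(docs)
hits = search(index, ["search", "engine"])
```

In this toy model a truncated summary can only miss matching pages, never report a non-matching one, so precision stays perfect while recall may drop. Whether that is the mechanism behind the figures reported for Proof is not stated in the abstract.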

A Hybrid Query Scheme to Speed Up Queries in Unstructured Peer-to-Peer Networks

Advances in Multimedia, 2007

Unstructured peer-to-peer networks have gained a lot of popularity due to their resilience to network dynamics. The core operation in such networks is to efficiently locate resources. However, existing query schemes, for example, flooding, random walks, and interest-based shortcuts, suffer from various problems in reducing communication overhead and in shortening response time. In this paper, we study the possible problems in the existing approaches and propose a new hybrid query scheme, which mixes intercluster queries and intracluster queries. Specifically, the proposed scheme works by efficiently locating the clusters that share similar interests with intercluster queries, and then exhaustively searching the nodes in the found clusters with intracluster queries. To facilitate the scheme, we propose a clustering algorithm to cluster nodes that share similar interests, and a labeling algorithm to explicitly capture the clusters in the underlying overlays. As demonstrated by extensive simul...

Tree-based Indexing for DHT-based P2P Systems

International Journal of Computer Applications, 2013

Nowadays, DHT-based P2P technology is used as a basis for many widespread applications because of its scalability, robustness, and load balance. Many applications, including file sharing, communication, and live video streaming, run in large distributed network environments. For efficient and effective search in large data repositories, complex query processing becomes a major issue for DHTs. Towards the goal of supporting complex queries in DHT-based P2P systems, this paper focuses on the use of a k-dimensional tree (kd-tree) to build a tree-based index. The proposed index is built without modifying the structure of the overlay network. Load balancing among peers is also considered in relation to the use of the kd-tree. The performance of the kd-tree is therefore studied to show how it affects the proposed index over the P2P network. The PlanetSim simulator is used to implement the proposed index and to evaluate its performance using various metrics.

ISE02-2: Dynamic Search Algorithm in Unstructured Peer-to-Peer Networks

Globecom, 2006

Flooding and random walk (RW) are the two typical search algorithms in unstructured peer-to-peer networks. The flooding algorithm searches the network aggressively. It covers the most nodes but generates a large number of query messages. Hence it is not considered scalable. This cost issue is especially serious when the queried resource is located far from the query source. In contrast, RW searches the network conservatively. It generates only a fixed number of query messages at each hop, but it may take considerably longer to find the queried resource. We propose the dynamic search algorithm (DS), which is a generalization of flooding, modified breadth first search (MBFS), and RW. This search algorithm takes advantage of the different contexts in which each previous search algorithm performs well. The operation of DS resembles flooding or MBFS for the short-term search, and RW for the long-term search. We analyze the performance of DS based on the power-law random graph model and adopt several performance metrics, including the guaranteed search time, query hits, query messages, success rate, and a unified metric, search efficiency. The main objective is to determine the effects of the parameters of DS. Numerical results show that a proper setting of the parameters of DS achieves a short guaranteed search time and provides a good tradeoff between search performance and cost.
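The switch from flooding in the short term to random walks in the long term can be sketched as follows. This is a simplified illustration and not the DS algorithm as specified by the authors: the hop `threshold`, the `ttl` limit, the adjacency-list graph, and the choice to skip already-visited nodes during the flooding phase are all assumptions.

```python
import random


def dynamic_search(graph, source, target, threshold=2, ttl=10, seed=0):
    """Flood for the first `threshold` hops, then continue as random walks.

    `graph` maps each node to a list of neighbors.
    Returns (found, messages_sent).
    """
    rng = random.Random(seed)
    if source == target:
        return True, 0
    visited = {source}
    frontier = [source]
    messages = 0
    for hop in range(1, ttl + 1):
        next_frontier = []
        for node in frontier:
            neighbors = graph[node]
            if hop <= threshold:
                # Flooding phase: forward to every neighbor not seen so far
                chosen = [v for v in neighbors if v not in visited]
            else:
                # Random walk phase: each walker forwards to one neighbor
                chosen = [rng.choice(neighbors)] if neighbors else []
            for v in chosen:
                messages += 1
                if v == target:
                    return True, messages
                visited.add(v)
                next_frontier.append(v)
        frontier = next_frontier
        if not frontier:
            break
    return False, messages


line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
result = dynamic_search(line, 0, 2, threshold=2)
```

With `threshold` equal to `ttl` the sketch behaves like plain flooding, and with `threshold=0` it degenerates into a single random walk from the source, which mirrors the claim that DS generalizes both.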

A Keyword Search Algorithm for Structured Peer-to-Peer Networks

… Symposium on Symbolic …, 2010

Peer-to-Peer (P2P) networks are largely used for file-sharing and hence must provide efficient mechanisms for searching the files stored at various nodes. The existing structured P2P overlays support only "exact-match" look-up, which is hardly sufficient in a file-sharing network. This paper addresses the problem of keyword-based search in structured P2P networks. We propose a new keyword-based searching algorithm which can be implemented on top of any structured P2P overlay. We demonstrate that the proposed algorithm achieves very good searching results as it requires the minimum number of messages to be sent in order to find all the references to files containing at least the given set of keywords.

Efficient and scalable query routing for unstructured peer-to-peer networks

IEEE INFOCOM, 2005

Searching for content in peer-to-peer networks is an interesting and challenging problem. Queries in Gnutella-like unstructured systems that use flooding or random walk to search must visit O(n) nodes in a network of size n, thus consuming significant amounts of bandwidth. In this paper, we propose a query routing protocol that allows low bandwidth consumption during query forwarding using a low cost mechanism to create and maintain information about nearby objects. To achieve this, our protocol maintains a lightweight probabilistic routing table at each node that suggests the location of each object in the network. Following the corresponding routing table entries, a query can reach the destination in a small number of hops with high probability. However, maintaining routing tables in a large and highly dynamic network requires non-traditional mechanisms. We design a novel data structure called an exponentially decaying Bloom filter (EDBF) that encodes such probabilistic routing tables in a highly compressed manner, and allows for efficient aggregation and propagation. The search primitives provided by our system can be used to search for single keys or multiple keywords with equal ease. Analytical modeling of our design predicts significant improvements in search efficiency, verified through extensive simulations in which we observed an order of magnitude reduction in query path length over previous proposals.