Probabilistic Voronoi Diagrams for Probabilistic Moving Nearest Neighbor Queries

Efficient processing of probabilistic reverse nearest neighbor queries over uncertain data

VLDB JOURNAL, 2009

Reverse nearest neighbor (RNN) search is crucial in many real applications. In particular, given a database and a query object, an RNN query retrieves all the data objects in the database that have the query object as their nearest neighbor. Often, due to limitations of measurement devices, environmental disturbance, or characteristics of applications (for example, monitoring moving objects), data obtained from the real world are uncertain (imprecise). Therefore, previous approaches proposed for answering an RNN query over exact (precise) databases cannot be directly applied to the uncertain scenario. In this paper, we re-define the RNN query in the context of uncertain databases, namely the probabilistic reverse nearest neighbor (PRNN) query, which obtains data objects with probabilities of being RNNs greater than or equal to a user-specified threshold. Since the retrieval of a PRNN query requires accessing all the objects in the database, which is quite costly, we also propose an effective pruning method, called geometric pruning (GP), that significantly reduces the PRNN search space without introducing any false dismissals. Furthermore, we present an efficient PRNN query procedure that seamlessly integrates our pruning method. Extensive experiments have demonstrated the efficiency and effectiveness of our proposed GP-based PRNN query processing approach under various experimental settings.
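As a point of reference for the RNN definition in this abstract, the following is a minimal brute-force sketch over exact (certain) points. It is not the geometric pruning method of the paper; the function name `rnn_query`, the Euclidean metric, and the strict tie handling are assumptions made for illustration.

```python
import math

def rnn_query(data, q):
    """Return every point in `data` whose nearest neighbor is q.

    Assumption: Euclidean distance, and q must be strictly closer
    than every other data point (ties are excluded).
    """
    result = []
    for i, p in enumerate(data):
        others = data[:i] + data[i + 1:]
        d_q = math.dist(p, q)  # distance from p to the query object
        if all(d_q < math.dist(p, o) for o in others):
            result.append(p)
    return result

data = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
q = (0.4, 0.0)
print(rnn_query(data, q))  # the two points near the origin
```

The scan costs quadratic time in the number of points, which is the kind of cost that pruning methods aim to avoid.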

Voronoi-based Nearest Neighbor Search for Multi-Dimensional Uncertain Databases

In Voronoi-based nearest neighbor search, the Voronoi cell of every point p in a database can be used to check whether p is the closest to some query point q. We extend the notion of Voronoi cells to support uncertain objects, whose attribute values are inexact. Particularly, we propose the Possible Voronoi cell (or PV-cell). A PV-cell of a multi-dimensional uncertain object o is a region R, such that for any point p ∈ R, o may be the nearest neighbor of p. If the PV-cells of all objects in a database S are known, they can be used to identify objects that have a chance to be the nearest neighbor of q. However, there is no efficient algorithm for computing an exact PV-cell. We hence study how to derive an axis-parallel hyper-rectangle (called the Uncertain Bounding Rectangle, or UBR) that tightly contains a PV-cell. We further develop the PV-index, a structure that stores UBRs, to evaluate probabilistic nearest neighbor queries over uncertain data. An advantage of the PV-index is that upon updates on S, it can be incrementally updated. Extensive experiments on both synthetic and real datasets are carried out to validate the performance of the PV-index.
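The idea that an uncertain object "may be" the nearest neighbor of a query point can be illustrated with a simple linear-scan filter. This is only a sketch under assumptions, not the PV-cell, UBR, or PV-index construction: each object is assumed to lie somewhere inside an axis-parallel rectangle `(lo, hi)`, and the helper names `min_dist`, `max_dist`, and `possible_nn` are hypothetical.

```python
import math

def min_dist(q, rect):
    """Smallest Euclidean distance from point q to the rectangle (lo, hi)."""
    lo, hi = rect
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2
                         for x, l, h in zip(q, lo, hi)))

def max_dist(q, rect):
    """Largest Euclidean distance from point q to the rectangle (lo, hi)."""
    lo, hi = rect
    return math.sqrt(sum(max(abs(x - l), abs(x - h)) ** 2
                         for x, l, h in zip(q, lo, hi)))

def possible_nn(objects, q):
    """Names of objects that cannot be ruled out as the NN of q.

    An object is kept when its minimum distance to q does not exceed
    the smallest maximum distance of any object to q.
    """
    bound = min(max_dist(q, r) for r in objects.values())
    return [name for name, r in objects.items() if min_dist(q, r) <= bound]

objects = {
    "a": ((0.0, 0.0), (1.0, 1.0)),
    "b": ((2.0, 0.0), (3.0, 1.0)),
    "c": ((10.0, 10.0), (11.0, 11.0)),
}
print(possible_nn(objects, (1.5, 0.5)))  # "c" is too far away to qualify
```

This filter scans every object for every query; an index over precomputed regions is meant to avoid exactly that scan.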

Efficient probabilistic reverse nearest neighbor query processing on uncertain data

Proceedings of the VLDB Endowment, 2011

Given a query object q, a reverse nearest neighbor (RNN) query in a common certain database returns the objects having q as their nearest neighbor. A new challenge for databases is dealing with uncertain objects. In this paper we consider probabilistic reverse nearest neighbor (PRNN) queries, which return the uncertain objects having the query object as nearest neighbor with a sufficiently high probability. We propose an algorithm for efficiently answering PRNN queries using new pruning mechanisms taking distance dependencies into account. We compare our algorithm to state-of-the-art approaches recently proposed. Our experimental evaluation shows that our approach is able to significantly outperform previous approaches. In addition, we show how our approach can easily be extended to PRkNN (where k > 1) query processing for which there is currently no efficient solution.
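To make the PRNN probability concrete, here is a small exhaustive sketch. It assumes a discrete uncertainty model that the abstract does not specify: each object is a list of `(location, probability)` instances whose probabilities sum to 1, objects are mutually independent, and the query `q` is a certain point. The function name `prnn_query` and the `>=` threshold test are illustrative choices, and no pruning is performed.

```python
import math

def prnn_query(objects, q, threshold):
    """Return {name: probability} for objects whose probability of
    having q as nearest neighbor is at least `threshold`."""
    result = {}
    for name, instances in objects.items():
        prob = 0.0
        for loc, p in instances:
            d_q = math.dist(loc, q)
            # Conditioned on this instance, every other object must be
            # strictly farther from it than q is.
            q_is_closest = 1.0
            for other, other_instances in objects.items():
                if other != name:
                    q_is_closest *= sum(po for lo, po in other_instances
                                        if math.dist(loc, lo) > d_q)
            prob += p * q_is_closest
        if prob >= threshold:
            result[name] = prob
    return result

objects = {
    "a": [((1.0, 0.0), 0.5), ((4.0, 0.0), 0.5)],
    "b": [((3.0, 0.0), 1.0)],
}
print(prnn_query(objects, (0.0, 0.0), 0.5))
```

Conditioning on one instance of the candidate at a time is what keeps the distances to `q` and to the other objects consistent with each other, since both depend on where the candidate actually is.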

Voronoi-based reverse nearest neighbor query processing on spatial networks

Multimedia Systems, 2009

Voronoi diagrams have traditionally been applied to computational geometry and multimedia problems. In this paper, we show how Voronoi diagrams can be applied to spatial query processing, and in particular to Reverse Nearest Neighbor (RNN) queries. Spatial and geographical query processing, in general, and RNN in particular, are becoming more important, as online maps are now

Probabilistic Nearest-Neighbor Query on Uncertain Objects

2007

Nearest-neighbor queries are an important query type for commonly used feature databases. In many different application areas, e.g. sensor databases, location based services or face recognition systems, distances between objects have to be computed based on vague and uncertain data. A successful approach is to express the distance between two uncertain objects by probability density functions which assign a probability value to each possible distance value. By integrating the complete probabilistic distance function as a whole directly into the query algorithm, the full information provided by these functions is exploited. The result of such a probabilistic query algorithm consists of tuples containing the result object and a probability value indicating the likelihood that the object satisfies the query predicate. In this paper we introduce an efficient strategy for processing probabilistic nearest-neighbor queries, as the computation of these probability values is very expensive. In a detailed experimental evaluation, we demonstrate the benefits of our probabilistic query approach. The experiments show that we can achieve high quality query results with rather low computational cost.
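The result format described in this abstract, tuples of an object and its probability of satisfying the query, can be sketched as follows. The abstract works with probability density functions; this sketch substitutes a discrete model as an assumption, where each object is a list of `(location, probability)` instances summing to 1, objects are independent, and ties in distance are ignored. The name `nn_probabilities` is hypothetical.

```python
import math

def nn_probabilities(objects, q):
    """Return {name: probability that the object is the NN of q}."""
    probs = {}
    for name, instances in objects.items():
        total = 0.0
        for loc, p in instances:
            d = math.dist(loc, q)
            # Probability that every other object is strictly farther from q.
            others_farther = 1.0
            for other, other_instances in objects.items():
                if other != name:
                    others_farther *= sum(po for lo, po in other_instances
                                          if math.dist(lo, q) > d)
            total += p * others_farther
        probs[name] = total
    return probs

objects = {
    "a": [((1.0, 0.0), 0.5), ((3.0, 0.0), 0.5)],
    "b": [((2.0, 0.0), 1.0)],
}
print(nn_probabilities(objects, (0.0, 0.0)))
```

Every instance of every object is compared against all other objects, which illustrates why computing these probability values is expensive without a dedicated strategy.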

Probabilistic Reverse Nearest Neighbor Queries on Uncertain Data

IEEE Transactions on Knowledge and Data Engineering, 2010

Uncertain data is inherent in various important applications and reverse nearest neighbor (RNN) query is an important query type for many applications. While many different types of queries have been studied on uncertain data, there is no previous work on answering RNN queries on uncertain data. In this paper, we formalize probabilistic reverse nearest neighbor query that is to retrieve the objects from the uncertain data that have higher probability than a given threshold to be the RNN of an uncertain query object. We develop an efficient algorithm based on various novel pruning approaches that solves the probabilistic RNN queries on multidimensional uncertain data. The experimental results demonstrate that our algorithm is even more efficient than a sampling-based approximate algorithm for most of the cases and is highly scalable.

Indexing probabilistic nearest-neighbor threshold queries

2008

Data uncertainty is inherent in many applications, including sensor networks, scientific data management, data integration, location-based applications, etc. One of the common queries for uncertain data is the probabilistic nearest neighbor (PNN) query, which returns all uncertain objects with a non-zero probability of being the NN. In this paper we study the PNN query with a probability threshold (PNNT), which returns all objects with an NN probability greater than the threshold. Our PNNT query removes the assumption in all previous papers that the probability of an uncertain object always adds up to 1, i.e., we consider missing probabilities. We propose an augmented R-tree index with additional probabilistic information to facilitate pruning as well as global data structures for maintaining the current pruning status. We present our algorithm for efficiently answering PNNT queries and perform experiments to show that our algorithm significantly reduces the number of objects that need to be further evaluated as NN candidates.
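A threshold query with missing probabilities can be sketched with a plain linear scan. This is not the augmented R-tree algorithm of the paper. The assumptions are: each object is a list of `(location, probability)` instances whose probabilities may sum to less than 1 (the remainder is treated as the object being absent), objects are independent, and the name `pnnt_query` is hypothetical.

```python
import math

def pnnt_query(objects, q, threshold):
    """Return {name: NN probability} for objects above `threshold`."""
    result = {}
    for name, instances in objects.items():
        prob = 0.0
        for loc, p in instances:
            d = math.dist(loc, q)
            not_closer = 1.0
            for other, other_instances in objects.items():
                if other != name:
                    # Probability that the other object exists and is closer.
                    closer = sum(po for lo, po in other_instances
                                 if math.dist(lo, q) < d)
                    not_closer *= 1.0 - closer
            prob += p * not_closer
        if prob > threshold:
            result[name] = prob
    return result

objects = {
    "a": [((1.0, 0.0), 0.6)],   # 0.4 of the probability mass is missing
    "b": [((2.0, 0.0), 1.0)],
}
print(pnnt_query(objects, (0.0, 0.0), 0.5))
```

Because object "a" may be absent, "b" still has a 0.4 chance of being the nearest neighbor even though its only instance is farther from `q`.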

Efficient search for the top-k probable nearest neighbors in uncertain databases

Proceedings of the VLDB Endowment, 2008

Uncertainty pervades many domains in our lives. Current real-life applications, e.g., location tracking using GPS devices or cell phones, multimedia feature extraction, and sensor data management, deal with different kinds of uncertainty. Finding the nearest neighbor objects to a given query point is an important query type in these applications.

SPHLU: An Efficient Algorithm for Processing PRkNN Queries on Uncertain Data

Chinese Journal of Electronics, 2016

Queries on uncertain data have received much attention in recent years, especially with the development of location-based services (LBS). Little research has focused on reverse k nearest neighbor queries on uncertain data. We study probabilistic reverse k nearest neighbor (PRkNN) queries on uncertain data. A PRkNN query retrieves all the points whose probability of being a reverse k-nearest neighbor (RkNN) of the query object Q is higher than a given threshold. Previous work on this topic mostly handles k = 1. Some algorithms allow k > 1, but they are inefficient, especially for large k. We propose an efficient pruning algorithm, Spatial Pruning Heuristic with Lower and Upper bound (SPHLU), for solving PRkNN queries for k > 1. The experimental results demonstrate that our algorithm is more efficient than the existing algorithms, especially for large values of k.

Nearest-neighbor searching under uncertainty

Proceedings of the 31st symposium on Principles of Database Systems - PODS '12, 2012

Nearest-neighbor queries, which ask for returning the nearest neighbor of a query point in a set of points, are important and widely studied in many fields because of a wide range of applications. In many of these applications, such as sensor databases, location based services, face recognition, and mobile data, the location of data is imprecise. We therefore study nearest neighbor queries in a probabilistic framework in which the location of each input point and/or query point is specified as a probability density function and the goal is to return the point that minimizes the expected distance, which we refer to as the expected nearest neighbor (ENN). We present methods for computing an exact ENN or an ε-approximate ENN, for a given error parameter 0 < ε < 1, under different distance functions. These methods build an index of near-linear size and answer ENN queries in polylogarithmic or sublinear time, depending on the underlying function. As far as we know, these are the first nontrivial methods for answering exact or ε-approximate ENN queries with provable performance guarantees.
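The expected nearest neighbor can be illustrated with a direct computation. This sketch assumes discrete distributions in place of density functions, a certain query point, and Euclidean distance; it performs a linear scan and has none of the indexing or performance guarantees described in the abstract. The names `expected_distance` and `expected_nn` are hypothetical.

```python
import math

def expected_distance(instances, q):
    """Expected Euclidean distance from q to an uncertain point given
    as a list of (location, probability) pairs summing to 1."""
    return sum(p * math.dist(loc, q) for loc, p in instances)

def expected_nn(objects, q):
    """Name of the object that minimizes the expected distance to q."""
    return min(objects, key=lambda name: expected_distance(objects[name], q))

objects = {
    "a": [((1.0, 0.0), 0.5), ((5.0, 0.0), 0.5)],  # expected distance 3.0
    "b": [((2.0, 0.0), 1.0)],                     # expected distance 2.0
}
print(expected_nn(objects, (0.0, 0.0)))
```

In this example "a" is the closest object in half of the possible outcomes, yet "b" is the expected nearest neighbor, which shows that minimizing expected distance and maximizing NN probability are different criteria.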