A Simple and Optimal Ancestry Labeling Scheme for Trees

Improved ancestry labeling scheme for trees

2014

We present a $\lg n + 2 \lg \lg n + 3$ ancestry labeling scheme for trees. The problem was first presented by Kannan et al. [STOC '88] along with a simple $2 \lg n$ solution. Motivated by applications to XML files, the label size was improved incrementally over the course of more than 20 years by a series of papers. The last, due to Fraigniaud and Korman [STOC '10], presented an asymptotically optimal $\lg n + 4 \lg \lg n + O(1)$ labeling scheme using non-trivial tree-decomposition techniques. By providing a framework generalizing interval-based labeling schemes, we obtain a simple, yet asymptotically optimal solution to the problem. Furthermore, our labeling scheme is attained by a small modification of the original $2 \lg n$ solution.
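
As a rough illustration of the interval idea behind the simple $2 \lg n$ solution mentioned above, the following sketch labels each node with the range of DFS preorder numbers covered by its subtree; ancestry then reduces to interval containment. This is the textbook interval scheme, not the improved scheme of the paper. The function names and the child-list tree representation are assumptions made for this example.

```python
from typing import Dict, List, Tuple

Label = Tuple[int, int]

def interval_labels(children: Dict[int, List[int]], root: int) -> Dict[int, Label]:
    """Label each node with (first, last): its DFS preorder number and the
    largest preorder number inside its subtree. Both values lie in [0, n-1],
    so a label fits in about 2 * lg(n) bits."""
    labels: Dict[int, Label] = {}
    first: Dict[int, int] = {}
    counter = 0
    stack = [(root, False)]  # (node, subtree_finished) pairs; iterative DFS
    while stack:
        node, finished = stack.pop()
        if finished:
            # All descendants are numbered; counter - 1 is the last one.
            labels[node] = (first[node], counter - 1)
            continue
        first[node] = counter
        counter += 1
        stack.append((node, True))
        for child in reversed(children.get(node, [])):
            stack.append((child, False))
    return labels

def is_ancestor(label_u: Label, label_v: Label) -> bool:
    """True if u is an ancestor of v (or u == v), using only the two labels."""
    return label_u[0] <= label_v[0] and label_v[1] <= label_u[1]

# Hypothetical example tree: 0 -> {1, 2}, 1 -> {3, 4}, 2 -> {5}
tree = {0: [1, 2], 1: [3, 4], 2: [5]}
labels = interval_labels(tree, 0)
print(labels[1], labels[4])                   # (1, 3) (3, 3)
print(is_ancestor(labels[1], labels[4]))      # True
print(is_ancestor(labels[2], labels[3]))      # False
```

The decoder never looks at the tree, only at the two labels, which is the defining property of a labeling scheme.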

Engineering tree labeling schemes: A case study on least common ancestors

2008

We address the problem of labeling the nodes of a tree such that one can determine the identifier of the least common ancestor of any two nodes by looking only at their labels. This problem has applications in routing and in distributed computing in peer-to-peer networks. A labeling scheme using Θ(log² n)-bit labels has been previously presented by Peleg. By engineering this scheme, we obtain a variety of data structures with the same asymptotic performance.

Informative Labeling Schemes for the Least Common Ancestor Problem

We address the problem of labeling the nodes of a tree such that one can determine the identifier of the least common ancestor of any two nodes by looking only at their labels. This problem has applications in routing and in distributed computing in peer-to-peer networks. A labeling scheme using Θ(log² n)-bit labels has been presented by Peleg. By engineering this scheme and a new one due to the authors, we obtain a variety of data structures with the same asymptotic performance.

Average case analysis for tree labelling schemes

Theoretical Computer Science, 2007

We study how to label the vertices of a tree in such a way that we can determine the distance between two vertices in the tree given only their labels. For trees, Gavoille et al. [7] proved that for any such distance labelling scheme, the maximum label length is at least (1/8) log² n − O(log n) bits. They also gave a separator-based labelling scheme that has the optimal label length Θ(log n · log(H_n(T))), where H_n(T) is the height of the tree. In this paper, we present two new distance labelling schemes that not only achieve the optimal label length Θ(log n · log(H_n(T))), but also have a much smaller expected label length under certain tree distributions. With these new schemes, we can also efficiently find the least common ancestor of any two vertices based on their labels only.

Algorithms of an optimal integer tree labeling

Suppose we label the vertices of a tree by positive integers. The weight of an edge is defined by a monotonically increasing function of the absolute value of the difference of the labels of its endpoints. We define the total cost of the labeling to be the sum of the weights of all the edges. The problem we consider is that of determining, for a given tree G and a given labeling of the leaves of G, the minimum total cost labelings of G. In this paper we present an algorithm that works for any cost function that satisfies the monotonicity condition mentioned above. For the case where the function is the absolute value of the difference of the labels, a fast algorithm is presented.
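
To make the cost definition above concrete, here is a minimal sketch that evaluates the total cost of a given labeling. It only computes the objective and does not implement the paper's optimization algorithm. The function name `total_cost`, the edge-list representation, and the example star tree are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

def total_cost(
    edges: List[Tuple[str, str]],
    labels: Dict[str, int],
    weight: Callable[[int], int] = lambda d: d,
) -> int:
    """Sum of weight(|label(u) - label(v)|) over all tree edges (u, v).
    `weight` is assumed to be monotonically increasing; the default is the
    absolute difference itself."""
    return sum(weight(abs(labels[u] - labels[v])) for u, v in edges)

# Hypothetical star: internal node "c" with three leaves labeled 1, 5 and 9.
edges = [("c", "x"), ("c", "y"), ("c", "z")]
leaf_labels = {"x": 1, "y": 5, "z": 9}

# Compare two candidate labels for the internal node.
print(total_cost(edges, {**leaf_labels, "c": 5}))                    # 8
print(total_cost(edges, {**leaf_labels, "c": 1}))                    # 12
print(total_cost(edges, {**leaf_labels, "c": 5}, lambda d: d * d))   # 32
```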

Labeling Schemes for Tree Representation

Algorithmica, 2009

This paper deals with compact label-based representations for trees. Consider an n-node undirected connected graph G with a predefined numbering on the ports of each node. The all-ports tree labeling ℒ_all gives each node v of G a label containing the port numbers of all the tree edges incident to v. The upward tree labeling ℒ_up labels each node v by the number of the port leading from v to its parent in the tree. Our measure of interest is the worst case and total length of the labels used by the scheme, denoted M_up(T) and S_up(T) for ℒ_up and M_all(T) and S_all(T) for ℒ_all. The problem studied in this paper is the following: Given a graph G and a predefined port labeling for it, with the ports of each node v numbered by 0, …, deg(v) − 1, select a rooted spanning tree for G minimizing (one of) these measures. We show that the problem is polynomial for M_up(T), S_up(T) and S_all(T) but NP-hard for M_all(T) (even for 3-regular planar graphs). We show that for every graph G and port labeling there exists a spanning tree T for which S_up(T) = O(n log log n). We give a tight bound of O(n) in the cases of complete graphs with arbitrary labeling and arbitrary graphs with symmetric port labeling. We conclude by discussing some applications for our tree representation schemes.

Dynamic and Multi-Functional Labeling Schemes

Lecture Notes in Computer Science, 2014

We investigate labeling schemes supporting adjacency, ancestry, sibling, and connectivity queries in forests. In the course of more than 20 years, the existence of log n + O(log log n) labeling schemes supporting each of these functions was proven, with the most recent being ancestry [Fraigniaud and Korman, STOC '10]. Several multi-functional labeling schemes also enjoy lower or upper bounds of log n + Ω(log log n) or log n + O(log log n) respectively. Notable among these are an upper bound of log n + 2 log log n for adjacency+siblings and a lower bound of log n + log log n for each of the functions siblings, ancestry, and connectivity [Alstrup et al., SODA '03]. We improve the constants hidden in the O-notation, where our main technical contribution is a log n + 2 log log n lower bound for connectivity+ancestry and connectivity+siblings. In the context of dynamic labeling schemes it is known that ancestry requires Ω(n) bits [Cohen et al., PODS '02]. In contrast, we show upper and lower bounds on the label size for adjacency, siblings, and connectivity of 2 log n bits, and 3 log n to support all three functions. We also show that there exist no efficient dynamic adjacency labeling schemes for planar, bounded treewidth, bounded arboricity and bounded degree graphs.

A Clustering-based Scheme for Labeling XML Trees

2006

Tree labeling plays a key role in XML query processing. In this paper, we propose a new labeling scheme, called Clustering-based Labeling. Unlike all previous labeling methods, in this labeling scheme elements are separated into various groups, and a label is assigned to a group of elements instead of a single element. Based on Clustering-based Labeling we design a new relational schema, similar to the OrdPath scheme, for storing XML documents in a relational database. Grouping sibling nodes into one record reduces the number of relational records needed for XML document storage. Our experimental results show that our storing scheme is significantly better than three well-known relational XML storing methods in terms of number of stored records, document reconstruction time and query processing performance.

A relevance comparison between interval and prefix labelling schemes

2017 International Conference on Engineering and Technology (ICET), 2017

Efforts to improve XML database management systems have led researchers to consider whether their indexing systems are equivalent to those of relational database management systems. The indexing system is based on labelling the nodes of the XML tree. Different types of labelling scheme have been proposed to label the document quickly and without consuming too much storage space. However, most studies have focused on evaluating the performance of new labelling schemes. The appropriateness of the various existing schemes to the particular structure of an XML document has not been addressed sufficiently. To investigate this aspect, two common XML labelling schemes, Prefix (Dewey Encoding) and Interval (Containment), were employed to label three different examples of XML documents with very different structures. The time and storage space requirements were investigated to compare the relevance of each scheme to the structures of the documents. A number of experiments were conducted and it was found that Dewey Encoding and Containment techniques are relatively fast when labelling shallow tree structures. Dewey required little storage space to save the labels of wide tree structures; however, Containment used less storage space when storing the labels of short trees.
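
The two schemes compared above can be sketched as follows, under the usual textbook definitions: a Dewey label is the dotted path of child positions from the root, and a containment label is a (start, end, level) triple from a depth-first traversal. This is an illustrative sketch using Python's standard `xml.etree.ElementTree`, not the experimental setup of the paper; all function names and the sample document are assumptions.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

def dewey_labels(root: ET.Element) -> List[Tuple[str, str]]:
    """Prefix (Dewey) labels: each element gets its parent's label plus its
    1-based position among its siblings, e.g. '1.2.1'."""
    out: List[Tuple[str, str]] = []
    def visit(elem: ET.Element, label: str) -> None:
        out.append((label, elem.tag))
        for i, child in enumerate(elem, start=1):
            visit(child, f"{label}.{i}")
    visit(root, "1")
    return out

def containment_labels(root: ET.Element) -> List[Tuple[int, int, int, str]]:
    """Interval (containment) labels: (start, end, level, tag), where start and
    end are counter values taken when entering and leaving the element."""
    out: list = []
    counter = 0
    def visit(elem: ET.Element, level: int) -> None:
        nonlocal counter
        counter += 1
        start = counter
        slot = len(out)
        out.append(None)  # reserve the position to keep document order
        for child in elem:
            visit(child, level + 1)
        counter += 1
        out[slot] = (start, counter, level, elem.tag)
    visit(root, 0)
    return out

def dewey_is_ancestor(u: str, v: str) -> bool:
    """u is a proper ancestor of v iff u's label is a dotted prefix of v's."""
    return v.startswith(u + ".")

def containment_is_ancestor(u: Tuple[int, int, int, str],
                            v: Tuple[int, int, int, str]) -> bool:
    """u is a proper ancestor of v iff v's interval lies strictly inside u's."""
    return u[0] < v[0] and v[1] < u[1]

doc = ET.fromstring("<a><b><c/></b><d/></a>")
print(dewey_labels(doc))        # [('1', 'a'), ('1.1', 'b'), ('1.1.1', 'c'), ('1.2', 'd')]
print(containment_labels(doc))  # [(1, 8, 0, 'a'), (2, 5, 1, 'b'), (3, 4, 2, 'c'), (6, 7, 1, 'd')]
```

The sketch also hints at the storage trade-off discussed above: Dewey labels grow with depth, while containment labels have a fixed number of components whose magnitude grows with document size.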

On coding labeled trees

2007

We consider the problem of coding labeled trees by means of strings of node labels. Different codes have been introduced in the literature by Prüfer, Neville, and Deo and Micikevičius. For all of them, we show that both coding and decoding can be reduced to integer (radix) sorting, closing several open problems within a unified framework that can be applied both in a sequential and in a parallel setting.
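
For readers unfamiliar with these codes, the following sketch encodes and decodes a Prüfer code for a labeled tree on nodes 0..n-1. It uses the textbook heap-based method, which runs in O(n log n) time, and is not the sorting-based reduction described in the abstract. The function names and the edge-list representation are assumptions made for this example.

```python
import heapq
from typing import List, Tuple

def prufer_encode(n: int, edges: List[Tuple[int, int]]) -> List[int]:
    """Prüfer code of a labeled tree on nodes 0..n-1 (n >= 2): repeatedly
    remove the smallest-numbered leaf and record its neighbor."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    leaves = [v for v in range(n) if len(adj[v]) == 1]
    heapq.heapify(leaves)
    code: List[int] = []
    for _ in range(n - 2):
        leaf = heapq.heappop(leaves)
        parent = next(iter(adj[leaf]))  # a leaf has exactly one neighbor
        code.append(parent)
        adj[parent].discard(leaf)
        if len(adj[parent]) == 1:       # the neighbor has become a leaf
            heapq.heappush(leaves, parent)
    return code

def prufer_decode(code: List[int]) -> List[Tuple[int, int]]:
    """Rebuild the edge list of the tree on n = len(code) + 2 nodes."""
    n = len(code) + 2
    degree = [1] * n
    for v in code:
        degree[v] += 1
    leaves = [v for v in range(n) if degree[v] == 1]
    heapq.heapify(leaves)
    edges: List[Tuple[int, int]] = []
    for v in code:
        leaf = heapq.heappop(leaves)
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    # Exactly two leaves remain; they form the final edge.
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges

# Hypothetical example: a path 0 - 1 - 2 - 3
path = [(0, 1), (1, 2), (2, 3)]
code = prufer_encode(4, path)
print(code)                 # [1, 2]
print(prufer_decode(code))  # [(0, 1), (1, 2), (2, 3)]
```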