Informative Labeling Schemes for the Least Common Ancestor Problem

Engineering tree labeling schemes: A case study on least common ancestors

2008

We address the problem of labeling the nodes of a tree such that one can determine the identifier of the least common ancestor of any two nodes by looking only at their labels. This problem has applications in routing and in distributed computing in peer-to-peer networks. A labeling scheme using Θ(log² n)-bit labels has been previously presented by Peleg. By engineering this scheme, we obtain a variety of data structures with the same asymptotic performance.
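To make the problem statement concrete, here is a deliberately naive sketch in Python. It is not Peleg's scheme and does not achieve the Θ(log² n) bound: each label simply stores the identifiers on the root-to-node path, so labels can grow to O(depth · log n) bits. The function names `path_labels` and `lca_from_labels` and the `parent` dictionary input are assumptions made for this illustration only.

```python
def path_labels(parent):
    """Label every node with the tuple of node ids on its root-to-node path.

    `parent` maps node id -> parent id; the root maps to None.
    (Hypothetical input format chosen for this sketch.)
    """
    labels = {}

    def label_of(v):
        # Build the label recursively from the parent's label (memoised).
        if v not in labels:
            p = parent[v]
            labels[v] = (v,) if p is None else label_of(p) + (v,)
        return labels[v]

    for v in parent:
        label_of(v)
    return labels


def lca_from_labels(label_u, label_v):
    """Return the id of the least common ancestor, using the two labels only."""
    lca = None
    for a, b in zip(label_u, label_v):
        if a != b:
            break
        lca = a  # last position where both root paths still agree
    return lca


# Small example tree: 0 is the root, 1 and 2 its children, and so on.
labels = path_labels({0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2})
print(lca_from_labels(labels[3], labels[4]))  # 1
print(lca_from_labels(labels[3], labels[5]))  # 0
```

The query never touches the tree itself, which is the defining property of a labeling scheme; the engineering question is how short the labels can be made while keeping that property.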

A Simple and Optimal Ancestry Labeling Scheme for Trees

Lecture Notes in Computer Science, 2015

We present a lg n + 2 lg lg n + 3 ancestry labeling scheme for trees. The problem was first presented by Kannan et al. [STOC '88] along with a simple 2 lg n solution. Motivated by applications to XML files, the label size was improved incrementally over the course of more than 20 years by a series of papers. The last, due to Fraigniaud and Korman [STOC '10], presented an asymptotically optimal lg n + 4 lg lg n + O(1) labeling scheme using non-trivial tree-decomposition techniques. By providing a framework generalizing interval-based labeling schemes, we obtain a simple, yet asymptotically optimal solution to the problem. Furthermore, our labeling scheme is attained by a small modification of the original 2 lg n solution.
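The interval-based idea behind a 2 lg n ancestry scheme can be sketched as follows. This is the textbook DFS-interval construction, given here as an illustration under our own assumptions; it is not the improved scheme of the abstract. Each node stores its preorder number and the largest preorder number in its subtree, both in the range 0..n-1, so a label needs about 2⌈lg n⌉ bits. The names `interval_labels` and `is_ancestor` and the `children` dictionary are hypothetical.

```python
def interval_labels(children, root):
    """Assign each node the pair (pre, last) from an iterative DFS.

    pre  = preorder number of the node
    last = largest preorder number inside the node's subtree
    `children` maps node -> list of child nodes (leaves may be omitted).
    """
    labels = {}
    counter = 0
    stack = [(root, False)]
    while stack:
        node, finished = stack.pop()
        if finished:
            # All descendants are numbered; close the interval.
            labels[node] = (labels[node][0], counter - 1)
            continue
        labels[node] = (counter, None)
        counter += 1
        stack.append((node, True))
        for child in reversed(children.get(node, [])):
            stack.append((child, False))
    return labels


def is_ancestor(label_u, label_v):
    """True if u is an ancestor of v (a node counts as its own ancestor)."""
    return label_u[0] <= label_v[0] <= label_u[1]


labels = interval_labels({0: [1, 2], 1: [3, 4], 2: [5]}, root=0)
print(labels[1])                          # (1, 3)
print(is_ancestor(labels[1], labels[4]))  # True
print(is_ancestor(labels[1], labels[5]))  # False
```

The ancestry test is a pure interval-containment check on two labels, which is why schemes of this family are attractive for XML indexing.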

Average case analysis for tree labelling schemes

Theoretical Computer Science, 2007

We study how to label the vertices of a tree in such a way that we can decide the distance of two vertices in the tree given only their labels. For trees, Gavoille et al. [7] proved that for any such distance labelling scheme, the maximum label length is at least (1/8) log² n − O(log n) bits. They also gave a separator-based labelling scheme that has the optimal label length Θ(log n · log(H_n(T))), where H_n(T) is the height of the tree. In this paper, we present two new distance labelling schemes that not only achieve the optimal label length Θ(log n · log(H_n(T))), but also have a much smaller expected label length under certain tree distributions. With these new schemes, we can also efficiently find the least common ancestor of any two vertices based on their labels only.
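A separator-based distance labelling can be illustrated with a centroid decomposition. The sketch below is a generic construction written for this note, not the scheme of Gavoille et al. nor the two new schemes of the abstract. Each vertex records its distance to every separator (centroid) chosen for a component containing it. This gives O(log n) entries of O(log n) bits each, not the finer Θ(log n · log(H_n(T))) bound. The function names and the adjacency-dictionary input are assumptions, and the tree is taken to be unweighted and non-empty.

```python
from collections import deque


def centroid_distance_labels(adj):
    """adj: dict vertex -> list of neighbours of an unweighted tree.

    Returns dict vertex -> {separator: distance to that separator}.
    """
    labels = {v: {} for v in adj}
    removed = set()

    def find_centroid(start):
        # Collect the current component in BFS order (the list grows while
        # we iterate over it), then compute subtree sizes bottom-up.
        order, parent = [start], {start: None}
        for v in order:
            for w in adj[v]:
                if w not in removed and w not in parent:
                    parent[w] = v
                    order.append(w)
        size = {v: 1 for v in order}
        for v in reversed(order):
            if parent[v] is not None:
                size[parent[v]] += size[v]
        total = len(order)
        for v in order:
            # Largest piece left if v were removed from the component.
            heaviest = total - size[v]
            for w in adj[v]:
                if w not in removed and parent.get(w) == v:
                    heaviest = max(heaviest, size[w])
            if 2 * heaviest <= total:
                return v

    pending = [next(iter(adj))]
    while pending:
        c = find_centroid(pending.pop())
        # BFS distances from the separator c inside its component.
        dist = {c: 0}
        queue = deque([c])
        while queue:
            v = queue.popleft()
            labels[v][c] = dist[v]
            for w in adj[v]:
                if w not in removed and w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        removed.add(c)
        pending.extend(w for w in adj[c] if w not in removed)
    return labels


def distance_from_labels(label_u, label_v):
    """Tree distance computed from the two labels only."""
    shared = label_u.keys() & label_v.keys()
    return min(label_u[c] + label_v[c] for c in shared)


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
labels = centroid_distance_labels(path)
print(distance_from_labels(labels[0], labels[4]))  # 4
print(distance_from_labels(labels[3], labels[4]))  # 1
```

The query is correct because the first separator that splits u from v lies on the u-v path, while every other shared separator can only give a sum that is at least the true distance.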

Improved ancestry labeling scheme for trees

2014

We present a lg n + 2 lg lg n + 3 ancestry labeling scheme for trees. The problem was first presented by Kannan et al. [STOC '88] along with a simple 2 lg n solution. Motivated by applications to XML files, the label size was improved incrementally over the course of more than 20 years by a series of papers. The last, due to Fraigniaud and Korman [STOC '10], presented an asymptotically optimal lg n + 4 lg lg n + O(1) labeling scheme using non-trivial tree-decomposition techniques. By providing a framework generalizing interval-based labeling schemes, we obtain a simple, yet asymptotically optimal solution to the problem. Furthermore, our labeling scheme is attained by a small modification of the original 2 lg n solution.

Labeling Schemes for Tree Representation

Algorithmica, 2009

This paper deals with compact label-based representations for trees. Consider an n-node undirected connected graph G with a predefined numbering on the ports of each node. The all-ports tree labeling ℒ_all gives each node v of G a label containing the port numbers of all the tree edges incident to v. The upward tree labeling ℒ_up labels each node v by the number of the port leading from v to its parent in the tree. Our measure of interest is the worst case and total length of the labels used by the scheme, denoted M_up(T) and S_up(T) for ℒ_up and M_all(T) and S_all(T) for ℒ_all. The problem studied in this paper is the following: Given a graph G and a predefined port labeling for it, with the ports of each node v numbered by 0,…,deg(v)−1, select a rooted spanning tree for G minimizing (one of) these measures. We show that the problem is polynomial for M_up(T), S_up(T) and S_all(T) but NP-hard for M_all(T) (even for 3-regular planar graphs). We show that for every graph G and port labeling there exists a spanning tree T for which S_up(T) = O(n log log n). We give a tight bound of O(n) in the cases of complete graphs with arbitrary labeling and arbitrary graphs with symmetric port labeling. We conclude by discussing some applications for our tree representation schemes.

Dynamic and Multi-Functional Labeling Schemes

Lecture Notes in Computer Science, 2014

We investigate labeling schemes supporting adjacency, ancestry, sibling, and connectivity queries in forests. In the course of more than 20 years, the existence of log n + O(log log n) labeling schemes supporting each of these functions was proven, with the most recent being ancestry [Fraigniaud and Korman, STOC '10]. Several multi-functional labeling schemes also enjoy lower or upper bounds of log n + Ω(log log n) or log n + O(log log n) respectively. Notable examples are an upper bound of log n + 2 log log n for adjacency+siblings and a lower bound of log n + log log n for each of the functions siblings, ancestry, and connectivity [Alstrup et al., SODA '03]. We improve the constants hidden in the O-notation, where our main technical contribution is a log n + 2 log log n lower bound for connectivity+ancestry and connectivity+siblings. In the context of dynamic labeling schemes it is known that ancestry requires Ω(n) bits [Cohen et al., PODS '02]. In contrast, we show upper and lower bounds on the label size for adjacency, siblings, and connectivity of 2 log n bits, and 3 log n to support all three functions. We also show that there exist no efficient dynamic adjacency labeling schemes for planar, bounded treewidth, bounded arboricity and bounded degree graphs.

New Ideas on Labeling Schemes

With the ever increasing size of graphs, many distributed graph systems have emerged to store, preprocess and analyze them. While such systems ease congestion on servers, they incur certain penalties compared to centralized data structures. First, the total storage required to store a graph in a distributed fashion increases. Second, attempting to answer queries on vertices of a graph stored in a distributed fashion can be significantly more complicated. In order to lay theoretical foundations for the first penalty mentioned, a large body of work has concentrated on labeling schemes. A labeling scheme is a method of distributing the information about the structure of a graph among its vertices by assigning short labels, such that a selected function on vertices can be computed using only their labels. Using labeling schemes, specific queries can be answered using little communication and good running times, effectively eliminating the second penalty mentioned. We continue this theoretical study in several ways. First, we dedicate a large part of the thesis to the graph family of trees, for which we provide an overview of labeling schemes supporting several important functions such as ancestry, routing and especially adjacency. The survey is complemented by novel contributions to this study, among which are the first asymptotically optimal adjacency labeling scheme for bounded degree trees, improved bounds on ancestry labeling schemes, dynamic multifunctional labeling schemes and an experimental evaluation of fully dynamic labeling schemes. Due to a connection between adjacency labeling schemes and the graph theoretical study of induced universal graphs, we study these in depth and show novel results for bounded degree graphs and power-law graphs. We also survey and make progress on the related implicit representation conjecture.
Finally, we extend the concept of labeling schemes to allow for a better understanding of the space cost incurred by information dissemination.

Algorithms of an optimal integer tree labeling

Suppose we label the vertices of a tree by positive integers. The weight of an edge is defined by a monotonically increasing function of the absolute value of the difference of the labels of its endpoints. We define the total cost of the labeling to be the sum of the weights of all the edges. The problem we consider is that of determining, for a given tree G and a given labeling of the leaves of G, the minimum total cost labelings of G. In this paper we present an algorithm that works for any cost function that satisfies the monotonicity condition mentioned above. For the case where the weight is the absolute value of the difference of the labels itself, a fast algorithm is presented.

On coding labeled trees

We consider the problem of coding labeled trees by means of strings of node labels. Different codes have been introduced in the literature by Prüfer, Neville, and Deo and Micikevičius. For all of them, we show that both coding and decoding can be reduced to integer (radix) sorting, closing several open problems within a unified framework that can be applied both in a sequential and in a parallel setting. Our sequential coding and decoding schemes require optimal O(n) time when applied to n-node trees, yielding the first linear time decoding algorithm for a code presented by Neville. These schemes can be parallelized on the EREW PRAM

On coding labeled trees

2007

We consider the problem of coding labeled trees by means of strings of node labels. Different codes have been introduced in the literature by Prüfer, Neville, and Deo and Micikevičius. For all of them, we show that both coding and decoding can be reduced to integer (radix) sorting, closing several open problems within a unified framework that can be applied both in a sequential and in a parallel setting.
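As a concrete instance of coding a labeled tree by a string of node labels, the following sketch encodes and decodes the Prüfer code. It uses the textbook priority-queue formulation, which runs in O(n log n) time; it is not the radix-sorting reduction of the paper and does not cover the Neville or Deo and Micikevičius codes. The function names and the convention that nodes are numbered 0..n-1 with n ≥ 2 are assumptions made for this illustration.

```python
import heapq


def prufer_encode(edges, n):
    """Prüfer code of a labeled tree on nodes 0..n-1 (n >= 2)."""
    neighbours = [set() for _ in range(n)]
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    leaves = [v for v in range(n) if len(neighbours[v]) == 1]
    heapq.heapify(leaves)
    code = []
    for _ in range(n - 2):
        leaf = heapq.heappop(leaves)      # smallest remaining leaf
        (nb,) = neighbours[leaf]          # its unique neighbour
        code.append(nb)
        neighbours[nb].discard(leaf)
        if len(neighbours[nb]) == 1:      # neighbour has become a leaf
            heapq.heappush(leaves, nb)
    return code


def prufer_decode(code):
    """Rebuild the edge list of the tree from its Prüfer code."""
    n = len(code) + 2
    degree = [1] * n
    for v in code:
        degree[v] += 1                    # degree = occurrences in code + 1
    leaves = [v for v in range(n) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in code:
        leaf = heapq.heappop(leaves)
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    # Exactly two leaves remain; they form the last edge.
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges


tree = [(0, 3), (1, 3), (2, 3), (3, 4), (4, 5)]
code = prufer_encode(tree, 6)
print(code)                 # [3, 3, 3, 4]
print(prufer_decode(code))  # [(0, 3), (1, 3), (2, 3), (3, 4), (4, 5)]
```

The heap is what costs the extra logarithmic factor here. Replacing it with sorting-based bookkeeping is the kind of step that a linear-time scheme has to take.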