Periodicity algorithms and a conjecture on overlaps in partial words

An Answer to a Conjecture on Overlaps in Partial Words Using Periodicity Algorithms

Lecture Notes in Computer Science, 2009

We propose an algorithm that, given as input a full word w of length n and positive integers p and d, outputs (if one exists) a maximal p-periodic partial word contained in w with the property that no two holes are within distance d. Our algorithm runs in O(nd) time and is used for the study of freeness of partial words. Furthermore, we construct an infinite word over a five-letter alphabet that is overlap-free even after the insertion of an arbitrary number of holes, answering affirmatively a conjecture from Blanchet-Sadri, Mercaş, and Scott.
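
As a small illustration of the overlap-freeness notion mentioned above, here is a minimal sketch for full words only (no holes). It is not the paper's O(nd) algorithm. It assumes the usual definition of an overlap as a factor axaxa, where a is a letter and x is a possibly empty word, which is the same as a factor of length 2k+1 with period k. The function names `has_overlap` and `thue_morse` are hypothetical.

```python
def has_overlap(w: str) -> bool:
    """Return True if the full word w contains a factor of the form a x a x a."""
    n = len(w)
    # k = |a x| is the period of the candidate overlap, whose length is 2k + 1.
    for k in range(1, (n - 1) // 2 + 1):
        run = 0  # number of consecutive positions i with w[i] == w[i + k]
        for i in range(n - k):
            run = run + 1 if w[i] == w[i + k] else 0
            # k + 1 consecutive matches give a factor of length 2k + 1 with period k.
            if run >= k + 1:
                return True
    return False


def thue_morse(n: int) -> str:
    """First n letters of the Thue-Morse word over {a, b} (a classic overlap-free word)."""
    return "".join("ab"[bin(i).count("1") % 2] for i in range(n))


if __name__ == "__main__":
    print(has_overlap("ababa"))          # a x a x a with a = "a", x = "b"
    print(has_overlap("abab"))           # a square, but not an overlap
    print(has_overlap(thue_morse(64)))
```

Extending this check to partial words would require replacing letter equality with compatibility, which is left out of this sketch.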

Periodicity on partial words

Computers & Mathematics with Applications, 2004

A partial word of length n over a finite alphabet A is a partial map from {0, … , n-1} into A. Elements of {0, … , n-1} without image are called holes (a word is just a partial word without holes). A fundamental periodicity result on words due to Fine and Wilf [1] intuitively determines how far two periodic events have to match in order to guarantee a common period. This result was extended to partial words with one hole by Berstel and Boasson [2] and to partial words with two or three holes by Blanchet-Sadri and Hegstrom [3]. In this paper, we give an extension to partial words with an arbitrary number of holes.

Periodicity properties on partial words

Information and Computation, 2008

The concept of periodicity has played over the years a central role in the development of combinatorics on words and has been a highly valuable tool for the design and analysis of algorithms. Fine and Wilf's famous periodicity result, which is one of the most used and known results on words, has extensions to partial words, or sequences that may have a number of "do not know" symbols. These extensions fall into two categories: the ones that relate to strong periodicity and the ones that relate to weak periodicity. In this paper, we obtain consequences by generalizing, in particular, the combinatorial property that "for any word u over {a, b}, ua or ub is primitive," which proves in some sense that there exist very many primitive partial words.

A periodicity result of partial words with one hole

Computers & Mathematics with Applications, 2003

The study of the combinatorial properties of strings of symbols from a finite alphabet, also referred to as words, is profoundly connected to numerous fields such as biology and computer science.

Fine and Wilf's Periodicity on Partial Words and Consequences

The concept of periodicity has played over the years a central role in the development of combinatorics on words and has been a highly valuable tool for the design and analysis of algorithms. There are many fundamental periodicity results on words. Among them is the famous result of Fine and Wilf, which intuitively determines how far two periodic events have to match in order to guarantee a common period. This result states that for positive integers p and q, if the word u has periods p and q and the length of u is not less than p + q − gcd(p, q), then u also has period gcd(p, q). Fine and Wilf's result, which is one of the most used and known results on words, has extensions to partial words, or sequences that may have a number of "do not know" symbols. These extensions fall into two categories: the ones that relate to strong periodicity and the ones that relate to weak periodicity. In this paper, we study some consequences of these results.
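
The Fine and Wilf statement above can be checked by brute force on small cases. The following is a minimal sketch for full words over a two-letter alphabet. The helper names `has_period` and `fine_wilf_holds` are hypothetical, and the check is exhaustive enumeration, not a proof.

```python
from itertools import product
from math import gcd


def has_period(u, p):
    """u has period p if u[i] == u[i + p] for every valid index i."""
    return all(u[i] == u[i + p] for i in range(len(u) - p))


def fine_wilf_holds(p, q, n, alphabet="ab"):
    """Check that every word of length n with periods p and q also has period gcd(p, q)."""
    g = gcd(p, q)
    return all(
        has_period(u, g)
        for u in product(alphabet, repeat=n)
        if has_period(u, p) and has_period(u, q)
    )


if __name__ == "__main__":
    p, q = 3, 5
    bound = p + q - gcd(p, q)               # 7 for p = 3, q = 5
    print(fine_wilf_holds(p, q, bound))      # at the bound, the implication holds
    print(fine_wilf_holds(p, q, bound - 1))  # one letter shorter, it can fail
    print(has_period("abaaba", 3), has_period("abaaba", 5), has_period("abaaba", 1))
```

The word abaaba has length 6 and periods 3 and 5 but not period 1, which shows that the bound p + q − gcd(p, q) = 7 cannot be lowered for this pair.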

Local periods and binary partial words: an algorithm

Theoretical Computer Science, 2004

The study of the combinatorial properties of strings of symbols from a finite alphabet (also referred to as words) is profoundly connected to numerous fields such as biology, computer science, mathematics, and physics. Research in combinatorics on words goes back roughly a century. There is a renewed interest in combinatorics on words as a result of emerging new application areas such as molecular biology. Partial words were recently introduced in this context. The motivation behind the notion of a partial word is the comparison of genes (or proteins). Alignment of two genes (or two proteins) can be viewed as a construction of partial words that are said to be compatible. While a word can be described by a total function, a partial word can be described by a partial function. More precisely, a partial word of length n over a finite alphabet A is a partial function from {1, . . . , n} into A. Elements of {1, . . . , n} without an image are called holes. A word is just a partial word without holes. The notion of period of a word is central in combinatorics on words. In the case of partial words, there are two notions: one is that of period, the other is that of local period. This paper extends to partial words with one hole the well-known result of Guibas and Odlyzko, which states that for every word u, there exists a word v of the same length as u over the alphabet {0, 1} such that the set of all periods of u coincides with the set of all periods of v. Our result states that for every partial word u with one hole, there exists a partial word v of the same length as u with at most one hole over the alphabet {0, 1} such that the set of all periods of u coincides with the set of all periods of v and the set of all local periods of u coincides with the set of all local periods of v.
To prove our result, we use the technique of Halava, Harju and Ilie.
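
To make the Guibas and Odlyzko statement concrete, the sketch below handles full words only (no holes and no local periods). It computes the set of periods of a word and searches exhaustively for a word over {0, 1} with the same set. The names `periods` and `binary_with_same_periods` are hypothetical, the length of the word is counted as a period by assumption, and the exponential search is only meant for short examples. It is not the technique used in the paper.

```python
from itertools import product


def periods(u):
    """Set of all p in 1..len(u) such that u[i] == u[i + p] for every valid i."""
    n = len(u)
    return {p for p in range(1, n + 1)
            if all(u[i] == u[i + p] for i in range(n - p))}


def binary_with_same_periods(u):
    """Exhaustively search for a word over {0, 1} with the same period set as u."""
    target = periods(u)
    for v in product("01", repeat=len(u)):
        if periods(v) == target:
            return "".join(v)
    return None  # not expected for full words, by the Guibas-Odlyzko result


if __name__ == "__main__":
    u = "abcab"
    v = binary_with_same_periods(u)
    print(sorted(periods(u)), v, sorted(periods(v)))
```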

Periodic-like words, periodicity, and boxes

Acta Informatica, 2001

We introduce the notion of periodic-like word. It is a word whose longest repeated prefix is not right special. Some different characterizations of this concept are given. In particular, we show that a word w is periodic-like if and only if it has a period not larger than |w| − R_w, where R_w is the least non-negative integer such that any prefix of w of length ≥ R_w is not right special. We derive that if a word w has two periods p, q ≤ |w| − R_w, then also the greatest common divisor of p and q is a period of w. This result is, in fact, an improvement of the theorem of Fine and Wilf. We also prove that the minimal period of a word w is equal to the sum of the minimal periods of its components in a suitable canonical decomposition in periodic-like subwords. Moreover, we characterize periodic-like words having the same set of proper boxes, in terms of the important notion of root-conjugacy. Finally, some new uniqueness conditions for words, related to the maximal box theorem, are given.

Computing Weak Periods of Partial Words (Extended Abstract)

Fine and Wilf's well-known theorem states that any word having periods p, q and length at least p + q − gcd(p, q) also has gcd(p, q), the greatest common divisor of p and q, as a period. Moreover, the length p + q − gcd(p, q) is critical since counterexamples can be provided for shorter words. This result has since been extended to partial words, or finite sequences that may contain a number of "do not know" symbols or "holes." More precisely, any partial word u with H holes having weak periods p, q and length at least the so-denoted l_H(p, q) also has strong period gcd(p, q) provided u is not (H,(p, q))-special. This extension was done for one hole by Berstel and Boasson (where the class of (1,(p, q))-special partial words is empty), for two or three holes by Blanchet-Sadri and Hegstrom, and for an arbitrary number of holes by Blanchet-Sadri. In this paper, we further extend these results, allowing an arbitrary number of weak periods. In addition to speciality, the concepts of intractable period sets and interference between periods play a role.
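
The distinction between weak and strong periods can be sketched as follows. The abstract does not spell out the definitions, so this assumes the usual ones from the partial-words literature: p is a weak period if u(i) = u(i + p) whenever both positions are defined, and a strong period if u(i) = u(j) whenever both positions are defined and i ≡ j (mod p). The hole marker `?` and the names `parse`, `is_weak_period`, and `is_strong_period` are hypothetical.

```python
HOLE = None  # hypothetical in-memory marker for a hole


def parse(s, hole="?"):
    """Turn a string into a partial word, reading `hole` as an undefined position."""
    return [HOLE if c == hole else c for c in s]


def is_weak_period(u, p):
    """Positions i and i + p agree whenever both are defined."""
    return all(u[i] is HOLE or u[i + p] is HOLE or u[i] == u[i + p]
               for i in range(len(u) - p))


def is_strong_period(u, p):
    """All defined positions in the same residue class mod p carry the same letter."""
    for r in range(p):
        letters = {u[i] for i in range(r, len(u), p) if u[i] is not HOLE}
        if len(letters) > 1:
            return False
    return True


if __name__ == "__main__":
    u = parse("abb?bbcbb")
    print(is_weak_period(u, 3))    # the hole separates the 'a' from the 'c'
    print(is_strong_period(u, 3))  # positions 0 and 6 hold 'a' and 'c'
```

On a word without holes the two notions coincide, which is why the distinction only appears once holes are allowed.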

Repetitions in strings: algorithms and combinatorics

The article is an overview of basic issues related to repetitions in strings, concentrating on algorithmic and combinatorial aspects. This area is important from both theoretical and practical points of view. Repetitions are highly periodic factors (substrings) in strings and are related to periodicities, regularities, and compression. The repetitive structure of strings leads to higher compression rates, and conversely, some compression techniques are at the core of fast algorithms for detecting repetitions. There are several types of repetitions in strings: squares, cubes, and maximal repetitions, also called runs. For these repetitions, we distinguish between the factors (sometimes qualified as distinct) and their occurrences (also called positioned factors). The combinatorics of repetitions is a very intricate area, full of open problems. For example, we know that the number of (distinct) primitively-rooted squares in a string of length n is no more than 2n − Θ(log n), conjectured to be n, and that their number of occurrences can be Θ(n log n). Similarly, we know that there are at most 1.029n and at least 0.944n maximal repetitions, and the conjecture is again that the exact bound is n. We know almost everything about the repetitions in Sturmian words, but despite the simplicity of these words, the results are nontrivial. One of the main motivations for writing this text is the development during the last couple of years of new techniques and results about repetitions. We report both the progress which has been achieved and which we expect to happen.
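
The notion of distinct primitively-rooted squares can be illustrated with a brute-force enumeration. This is a minimal sketch for short strings, not one of the fast algorithms the overview refers to. The names `is_primitive` and `distinct_primitive_squares` are hypothetical.

```python
def is_primitive(x: str) -> bool:
    """A nonempty word x is primitive iff x occurs in xx only as a prefix and a suffix."""
    return (x + x).find(x, 1) == len(x)


def distinct_primitive_squares(s: str) -> set:
    """Set of distinct factors of s of the form xx with x primitive (cubic-time brute force)."""
    found = set()
    n = len(s)
    for i in range(n):
        for k in range(1, (n - i) // 2 + 1):
            root = s[i:i + k]
            if s[i + k:i + 2 * k] == root and is_primitive(root):
                found.add(root + root)
    return found


if __name__ == "__main__":
    s = "aabaab"
    print(sorted(distinct_primitive_squares(s)))
```

Here "aa" occurs twice in aabaab but is counted once, which is the distinction between factors and occurrences drawn in the abstract.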

Relationally Periodic Sequences and Subword Complexity

Lecture Notes in Computer Science, 2008

By the famous theorem of Morse and Hedlund, a word is ultimately periodic if and only if it has bounded subword complexity, i.e., for sufficiently large n, the number of factors of length n is constant. In this paper we consider relational periods and relationally periodic sequences, where the relation is a similarity relation on words induced by a compatibility relation on letters. We investigate what would be a suitable definition for a relational subword complexity function such that it would imply a Morse and Hedlund-like theorem for relationally periodic words. We consider strong and weak relational periods and two candidates for subword complexity functions.
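
The classical (non-relational) subword complexity function behind the Morse and Hedlund theorem can be sketched on finite prefixes. The code below counts distinct factors of a given length in a prefix, which only approximates the complexity of the infinite word from below. The names `subword_complexity` and `thue_morse` are hypothetical, and the relational variants studied in the paper are not modeled.

```python
def subword_complexity(w: str, n: int) -> int:
    """Number of distinct factors of length n occurring in the finite word w."""
    return len({w[i:i + n] for i in range(len(w) - n + 1)})


def thue_morse(n: int) -> str:
    """First n letters of the Thue-Morse word over {a, b}, which is not ultimately periodic."""
    return "".join("ab"[bin(i).count("1") % 2] for i in range(n))


if __name__ == "__main__":
    periodic = "ab" * 50   # prefix of the periodic word (ab)(ab)(ab)...
    tm = thue_morse(256)
    for n in range(1, 6):
        print(n, subword_complexity(periodic, n), subword_complexity(tm, n))
```

For the periodic prefix the count stays at 2 for every length shown, while for the Thue-Morse prefix it keeps growing, in line with the theorem.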