Undecidable boundedness problems for datalog programs

Undecidable optimization problems for database logic programs

Journal of the ACM, 1993

Datalog is the language of logic programs without function symbols. It is used as a database query language. If it is possible to eliminate recursion from a Datalog program Π, then Π is said to be bounded. It is shown that the problem of deciding whether a given Datalog program is bounded is undecidable, even for linear programs (i.e., programs in which each rule contains at most one occurrence of a recursive predicate). It is then shown that every semantic property of Datalog programs is undecidable if it is stable, is strongly nontrivial, and contains ... An earlier version of this work appeared under the same title in the Proceedings of the 2nd IEEE Symposium on Logic in Computer
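As an informal illustration of boundedness (not taken from the paper), the sketch below counts how many rounds of naive bottom-up evaluation add new facts. The two example programs, the helper names `fixpoint_rounds`, `tc_step`, `buys_step`, and `chain`, and the sample data are all assumptions made for this sketch. Transitive closure needs a number of rounds that grows with the database, while the textbook "buys/likes/trendy" program stabilizes after at most two rounds on any database, which is why its recursion can be eliminated.

```python
def fixpoint_rounds(step, edb):
    """Apply one round of rules repeatedly; return (facts, productive rounds)."""
    facts = set()
    rounds = 0
    while True:
        new = facts | step(edb, facts)
        if new == facts:          # nothing new was derived: fixpoint reached
            return facts, rounds
        facts = new
        rounds += 1


def tc_step(edges, path):
    # path(x, z) :- edge(x, z).
    # path(x, z) :- edge(x, y), path(y, z).
    derived = set(edges)
    derived |= {(x, z) for (x, y) in edges for (y2, z) in path if y == y2}
    return derived


def buys_step(edb, buys):
    # buys(x, y) :- likes(x, y).
    # buys(x, y) :- trendy(x), buys(z, y).
    likes, trendy = edb
    derived = set(likes)
    derived |= {(x, y) for x in trendy for (_, y) in buys}
    return derived


def chain(n):
    """A path graph with n edges: 0 -> 1 -> ... -> n."""
    return {(i, i + 1) for i in range(n)}


# Rounds for transitive closure grow with the length of the chain.
tc_rounds = [fixpoint_rounds(tc_step, chain(n))[1] for n in (3, 10)]

# Rounds for the "buys" program do not depend on the size of the data.
small = ({("a", "p"), ("b", "q")}, {"c"})
large = ({(f"u{i}", f"item{i}") for i in range(50)}, {f"t{i}" for i in range(20)})
buys_rounds = [fixpoint_rounds(buys_step, db)[1] for db in (small, large)]
```

The undecidability result says that no algorithm can tell these two kinds of programs apart in general; running a program on sample databases, as done here, can only suggest an answer.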

Increment boundedness and nonrecursive incremental evaluation of datalog queries

Lecture Notes in Computer Science, 1995

Given a recursive (datalog) query, the nonrecursive incremental evaluation approach uses nonrecursive (datalog) programs to compute the difference of the answers to the query against successive databases between updates. The mechanism used in this approach is called a "First-Order Incremental Evaluation System" (FOIES). We show that for two large classes of datalog queries, called "generalized (weakly) regular queries", FOIES always exist. We also define "increment boundedness" and its variations, which generalize boundedness. Increment bounded queries are shown to have FOIES of certain forms. We also relate increment boundedness to structural recursion, which was proposed for bulk data types. We characterize increment boundedness using the "insertion idempotency", "insertion commutativity", and "determinism" properties of structural recursion. Finally, we show that the increment boundedness notions are undecidable and a decidable sufficient condition is given.
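A minimal sketch of the idea behind nonrecursive incremental evaluation, using transitive closure under single-edge insertion as the example. This is a standard illustration and is not claimed to be the construction used in the paper; the function names `transitive_closure` and `insert_edge` are assumptions. After inserting edge (a, b), the new closure contains the old one plus every pair (x, y) where x is a or reaches a, and y is b or is reachable from b. The update needs no iteration to a fixpoint.

```python
def transitive_closure(edges):
    """Recursive baseline: iterate the rules until no new pairs appear."""
    path = set(edges)
    while True:
        new = path | {(x, z) for (x, y) in path for (y2, z) in edges if y == y2}
        if new == path:
            return path
        path = new


def insert_edge(path, a, b):
    """Nonrecursive update of the stored closure after inserting edge (a, b)."""
    sources = {a} | {x for (x, y) in path if y == a}   # a, and nodes reaching a
    targets = {b} | {y for (x, y) in path if x == b}   # b, and nodes reachable from b
    return path | {(x, y) for x in sources for y in targets}


edges = {(1, 2), (3, 4)}
stored = transitive_closure(edges)
updated = insert_edge(stored, 2, 3)            # one pass over the stored answer
recomputed = transitive_closure(edges | {(2, 3)})
```

Here `updated` and `recomputed` agree, but `insert_edge` only joins the stored answer with itself once, which is the kind of first-order update a FOIES maintains.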

The Expressive Powers of Stable Models for Bound and Unbound DATALOG Queries

Journal of Computer and System Sciences, 1997

Various types of stable models are known in the literature: T-stable (total stable), P-stable (partial stable, also called three-valued stable), M-stable (maximal stable, also known under various different names), and L-stable (least undefined stable). For each type of stable model, the paper analyzes two versions of deterministic semantics: possible semantics, which is based on the union of all stable models of the given type, and definite semantics, which is instead based on their intersection and is like classical certain semantics except that it makes no inference if no model exists. For total stable models, which are the only type of stable models whose existence is not guaranteed for every program, certain semantics is taken into account as well. The expressive powers of each type of stable model under the above versions of semantics are investigated for both bound (i.e., ground) and unbound queries on DATALOG programs with negation. As deterministic semantics is argued to be inappropriate for unbound queries, a nondeterministic semantics is also proposed for them and its expressive power is fully characterized as well. © 1997 Academic Press. * Work partially supported by the ECUS033 project "DEUS EX MACHINA: Non-determinism in deductive databases" and by a MURST grant (40% share) under the project "Sistemi formali e strumenti per basi di dati evolute". An extended abstract of the preliminary results about bound queries appears in the informal proceedings of the Workshop on "Structural Complexity and Recursion-Theoretic Methods in Logic Programming" (Vancouver, October 1993) and an extended abstract of the preliminary results about unbound queries appears in the proceedings of the conference ICDT'95 (Prague, January 1995).

Linearisability on datalog programs

Theoretical Computer Science, 2003

Linear Datalog programs are programs whose clauses have at most one intensional atom in their bodies. We explore syntactic classes of Datalog programs (syntactically non-linear) which turn out to express no more than the queries expressed by linear Datalog programs. In particular, we investigate linearisability of (database queries corresponding to) piecewise linear Datalog programs and chain queries: a) We prove that piecewise linear Datalog programs can always be transformed into linear Datalog programs, by virtue of a procedure which performs the transformation automatically. The procedure relies upon conventional logic program transformation techniques. b) We identify a new class of linearisable chain queries, referred to as pseudo-regular, and prove their linearisability constructively, by generating, for any given pseudo-regular chain query, the Datalog program corresponding to it.
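To make the notion of linearisability concrete, the sketch below evaluates a non-linear and a linear program for the same query, graph reachability, and checks that they return the same answer. This is the textbook example and is not an instance of the paper's transformation procedure; the function names and the sample graph are assumptions made for illustration.

```python
def fixpoint(step, edges):
    """Naive bottom-up evaluation: apply the rules until nothing new is derived."""
    path = set()
    while True:
        new = path | step(edges, path)
        if new == path:
            return path
        path = new


def nonlinear_step(edges, path):
    # path(x, z) :- edge(x, z).
    # path(x, z) :- path(x, y), path(y, z).   (two intensional atoms in the body)
    return set(edges) | {(x, z) for (x, y) in path for (y2, z) in path if y == y2}


def linear_step(edges, path):
    # path(x, z) :- edge(x, z).
    # path(x, z) :- edge(x, y), path(y, z).   (one intensional atom in the body)
    return set(edges) | {(x, z) for (x, y) in edges for (y2, z) in path if y == y2}


# A small graph with a cycle (1 -> 2 -> 3 -> 1) and a tail (3 -> 4).
graph = {(1, 2), (2, 3), (3, 1), (3, 4)}
nonlinear_answer = fixpoint(nonlinear_step, graph)
linear_answer = fixpoint(linear_step, graph)
```

Both programs define the same query, so the non-linear one is linearisable even though its second rule is syntactically non-linear.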

Circumscribing DATALOG: Expressive Power and Complexity

Theoretical Computer Science, 1998

In this paper we study a generalization of DATALOG, the language of function-free definite clauses. It is known that standard DATALOG semantics (i.e., least Herbrand model semantics) can be obtained by regarding programs as theories to be circumscribed with all predicates to be minimized. The extension proposed here, called DATALOG^CIRC, consists in considering the general form of circumscription, where some predicates are minimized, some predicates are fixed, and some vary. We study the complexity and the expressive power of the language thus obtained. We show that this language (and, actually, its non-recursive fragment) is capable of expressing all the queries in DB-co-NP and, as such, is much more powerful than standard DATALOG, whose expressive power is limited to a strict subset of PTIME queries. Both data and combined complexities of answering DATALOG^CIRC queries are studied. Data complexity is proved to be co-NP-complete. Combined complexity is shown to be in general hard for co-NE and complete for co-NE in the case of Herbrand bases containing k distinct constant symbols, where k is bounded.

Decidable containment of recursive queries

2002

One of the most important reasoning tasks on queries is checking containment, i.e., verifying whether one query necessarily yields a subset of the result of another one. Query containment is crucial in several contexts, such as query optimization, query reformulation, knowledge-base verification, information integration, integrity checking, and cooperative answering. Containment is undecidable in general for Datalog, the fundamental language for expressing recursive queries.

Decidability and undecidability results for the termination problem of active database rules

Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems - PODS '98, 1998

Active database systems enhance the functionality of traditional databases through the use of active rules or "triggers". One of the principal questions for such systems is that of termination: is it possible for the rules to recursively activate one another indefinitely, given an initial triggering event?

Efficiently computable Datalog∃ programs

Datalog∃ is the extension of Datalog that allows existentially quantified variables in rule heads. This language is highly expressive and enables easy and powerful knowledge modeling, but the presence of existentially quantified variables makes reasoning over Datalog∃ undecidable in the general case. The results in this paper enable powerful, yet decidable and efficient reasoning (query answering) on top of Datalog∃ programs. On the theoretical side, we define the class of parsimonious Datalog∃ programs, and show that it allows decidable and efficiently computable reasoning. Unfortunately, we can demonstrate that recognizing parsimony is undecidable. However, we single out Shy, an easily recognizable fragment of parsimonious programs, that significantly extends both Datalog and Linear-Datalog∃, while preserving the same (data and combined) complexity of query answering as Datalog, despite the addition of existential quantifiers. On the practical side, we implement a bottom-up evaluation strategy for Shy programs inside the DLV system, enhancing the computation with a number of optimization techniques to obtain DLV∃, a powerful system for answering conjunctive queries over Shy programs, which is profitably applicable to ontology-based query answering. Moreover, we carry out an experimental analysis, comparing DLV∃ against a number of state-of-the-art systems for ontology-based query answering. The results confirm the effectiveness of DLV∃, which outperforms all other systems in the benchmark domain.