On database query languages for K-relations

Provenance semirings

2007

We show that relational algebra calculations for incomplete databases, probabilistic databases, bag semantics and why-provenance are particular cases of the same general algorithms involving semirings. This further suggests a comprehensive provenance representation that uses semirings of polynomials. We extend these considerations to datalog and semirings of formal power series. We give algorithms for datalog provenance calculation as well as datalog evaluation for incomplete and probabilistic databases. Finally, we show that for some semirings containment of conjunctive queries is the same as for standard set semantics.
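The idea that several semantics are instances of one semiring-based algorithm can be illustrated with a minimal sketch. It assumes relations are stored as dictionaries from tuples to annotations, with union combining annotations by the semiring's addition, join by its multiplication, and projection by summing the annotations of tuples that collapse together. The names `Semiring`, `BOOL`, `NAT`, `union`, `project` and `join` are hypothetical and chosen only for this illustration; they are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Semiring:
    """A commutative semiring (K, plus, times, zero, one). Hypothetical helper."""
    zero: Any
    one: Any
    plus: Callable[[Any, Any], Any]
    times: Callable[[Any, Any], Any]


# Two standard instances: Booleans give set semantics, naturals give bag semantics.
BOOL = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)
NAT = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)

# A K-relation maps each tuple to its annotation; absent tuples count as zero.
KRelation = Dict[Tuple, Any]


def union(K: Semiring, r: KRelation, s: KRelation) -> KRelation:
    """Union adds the annotations of tuples present in both inputs."""
    out = dict(r)
    for t, a in s.items():
        out[t] = K.plus(out.get(t, K.zero), a)
    return out


def project(K: Semiring, r: KRelation, idxs: List[int]) -> KRelation:
    """Projection sums the annotations of all tuples mapped to the same result."""
    out: KRelation = {}
    for t, a in r.items():
        key = tuple(t[i] for i in idxs)
        out[key] = K.plus(out.get(key, K.zero), a)
    return out


def join(K: Semiring, r: KRelation, s: KRelation,
         on: List[Tuple[int, int]]) -> KRelation:
    """Equi-join on column pairs (i, j); annotations of matching tuples multiply."""
    out: KRelation = {}
    for t, a in r.items():
        for u, b in s.items():
            if all(t[i] == u[j] for i, j in on):
                key = t + u  # keep all columns of both tuples
                out[key] = K.plus(out.get(key, K.zero), K.times(a, b))
    return out


# The same query evaluated under bag semantics and under set semantics.
R_bag = {(1, 2): 2, (1, 3): 1}
S_bag = {(2, 5): 3, (3, 5): 1}
bag_result = project(NAT, join(NAT, R_bag, S_bag, [(1, 0)]), [0, 3])

R_set = {t: True for t in R_bag}
S_set = {t: True for t in S_bag}
set_result = project(BOOL, join(BOOL, R_set, S_set, [(1, 0)]), [0, 3])
```

Under these assumptions `bag_result` is `{(1, 5): 7}` (2·3 + 1·1) and `set_result` is `{(1, 5): True}`: the query code is identical and only the semiring changes. The sketch does not remove zero-annotated tuples from results, which a fuller implementation would likely do.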

Semiring-Annotated Data: Queries and Provenance

We present an overview of the literature on querying semiring-annotated data, a notion we introduced five years ago in a paper with Val Tannen. First, we show that positive relational algebra calculations for various forms of annotated relations, as well as provenance models for such queries, are particular cases of the same general algorithm involving commutative semirings.

A query language allowing conditions of relational type in queries

Information Systems, 1985

The article deals with a certain query language of logical type, designed for use in an information system containing either complete or incomplete information. It is based on Pawlak's model of an information system. The language in question allows presentation of queries for objects with values of attributes satisfying a given k-ary relation, where k is any number not exceeding the number of attributes in the system. Expressions of the language are built up by means of Boolean operations from k-ary descriptors, representing elementary queries of the type described above. A complete system of axioms (equivalent transformation rules for queries) valid under the interpretation of the language in a complete information system is given. Based on the above axiom system, a method of computing an answer to a query of this language in an incomplete information system is described.

Algebraic Structures for Capturing the Provenance of SPARQL Queries

We show that the evaluation of SPARQL algebra queries on various notions of annotated RDF graphs can be seen as particular cases of the evaluation of these queries on RDF graphs annotated with elements of so-called spm-semirings. Spm-semirings extend semirings, used for positive relational algebra queries on annotated relational data, with a new operator to capture the semantics of the non-monotone SPARQL operator OPTIONAL.

Foundations of the theory of relational database models

Cybernetics and Systems Analysis, 1996

The present article discusses the foundations of the relational approach to database models, which goes back to Codd's work [1-6] (see also surveys [7-9] and monographs [10-16]). The state of the art in this area of research is characterized by a large number of scattered results. Further progress is constrained by three missing components: a unified methodological and conceptual base, formalization of the main paradigms, and an understanding of the interrelationships between the corresponding formalizations. The next stage in the development of relational databases thus should involve a shift of emphasis from an extensional to an intensional approach to the subject. Certain steps in this direction are presented in our paper, whose goal can be summarized as follows: use the principles of programmology developed in the framework of composition programming [17-20] to provide adequate formalizations of the main paradigms for the construction of a substantive theory of relational databases while remaining on a sufficiently high level of abstraction. Database models, and in particular relational models, have at least two major aspects: the information aspect, associated with formalization of data structures, and the manipulation aspect, associated with formalization of operations on these structures. Joint formalization of these aspects reduces to the construction of a corresponding algebra of data structures. Relation is the key concept of the relational approach to databases, and it is therefore the first that must be formalized. Like any natural intuitive concept, the database relation potentially admits a variety of different formalizations. The first formalization of database relation, following Codd, was the classical view of a finite-place logical relation as a subset of the Cartesian product of sets taken in a certain order (in other words, a relation is a set of ordered n-tuples of fixed length, n = 1, 2, ...).
This formalization was in fact responsible for the initial success and popularity of the relational approach to databases. Indeed, informal concepts could be formalized and analyzed using a well-developed formal apparatus (theory of logical relations, fragments of languages of first-order predicate logic, etc.). Yet even this traditionally established formalization of the concept of database relation is not free from weaknesses, which lead to a number of irrelevancies. These irrelevancies include the following: first, the information content of a database relation is usually independent of the order of the components, whereas for logical relations this order is essential (thus, for instance, it is not always meaningful to consider the inverse of a given database relation, by analogy with the concept of symmetry); second, the use of standard names 1, 2, ... to access the components of a database relation (tuples in this case) is quite burdensome and leads to certain complications, such as the appearance in the metalanguage, but not in the object language, of arbitrary ("mnemonic") names denoting the standard names; moreover, with the exception of set-theoretical operations, the only natural many-place operation on database relations under this formalization is the so-called Cartesian product (not to be confused with the eponymous set-theoretical operation), and all other natural operations (for instance, θ-join) are introduced as parametric operations on the standard names of the components. Also note the following typical twist: the operation of multiplication of (binary) logical relations, which is the core of relational algebra (see, for instance, [21-23]), does not play the same role for the general case of database relations, but it is of course simulated by the θ-join with ordinary equality acting as the parameter θ.
No less important is the fact that the common database management systems (DBMS) are not strictly relational in the sense of the above formalization, because they admit arbitrary component names (see, e.g., [24]).

A normal form for relational databases that is based on domains and keys

ACM Transactions on Database Systems, 1981

A new normal form for relational databases, called domain-key normal form (DK/NF), is defined. Also, formal definitions of insertion anomaly and deletion anomaly are presented. It is shown that a schema is in DK/NF if and only if it has no insertion or deletion anomalies. Unlike previously defined normal forms, DK/NF is not defined in terms of traditional dependencies (functional, multivalued, or join). Instead, it is defined in terms of the more primitive concepts of domain and key, along with the general concept of a "constraint."

A Relational Algebra for Negative Databases

2007

A negative database is a representation of all elements not contained in a given database. A negative database can enhance the privacy of sensitive information without resorting to encryption. This can be useful in settings where encryption is too expensive, e.g., some sensor networks, or for applications where searches or other operations on stored data are desired. The original negative database framework supported only authentication queries and operations for modifying data, such as insert and delete. This paper extends that work by defining a set of relational operators for negative representations. For each relational operator, the corresponding negative operator is defined such that the result of the negative operator applied to a negative representation is equivalent to the positive version applied to the positive representation. Algorithms for each relational operator are described and compared to its positive counterpart. This work enhances the practicality of negative dat...

A Conservative Property of a Nested Relational Query Language

1992

We proposed in [7] a nested relational calculus and a nested relational algebra based on structural recursion [6, 5] and on monads [27, 16]. In this report, we describe relative set abstraction as our third nested relational query language. This query language is similar to the well-known list comprehension mechanism in functional programming languages such as Haskell [11], Miranda [24], KRC [23], etc. This language is equivalent to our earlier query languages both in terms of semantics and in terms of equational theories ...

A multi-set extended relational algebra: a formal approach to a practical issue

1994

The relational data model is based on sets of tuples, i.e., it does not allow duplicate tuples in a relation. Many database languages and systems do require multi-set semantics though, either because of functional requirements or because of the high costs of duplicate removal in database operations. Several proposals have been presented that discuss multi-set semantics.
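The difference between set and multi-set semantics shows up most clearly in projection. The following is a minimal sketch, assuming a multi-set relation is stored as a `collections.Counter` from tuples to multiplicities; the helper names `bag_project` and `distinct` and the sample data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import List


def bag_project(rel: Counter, idxs: List[int]) -> Counter:
    """Multi-set projection: multiplicities of collapsing tuples are added up."""
    out: Counter = Counter()
    for t, n in rel.items():
        out[tuple(t[i] for i in idxs)] += n
    return out


def distinct(rel: Counter) -> Counter:
    """Explicit duplicate removal: every tuple that occurs keeps multiplicity 1."""
    return Counter({t: 1 for t, n in rel.items() if n > 0})


# Illustrative data: (employee, department) pairs, each occurring once.
emp = Counter({("ann", "sales"): 1, ("bob", "sales"): 1, ("cy", "ops"): 1})

depts_bag = bag_project(emp, [1])   # duplicates are kept
depts_set = distinct(depts_bag)     # duplicates removed in a separate step
```

Here `depts_bag` records `("sales",)` twice, while `depts_set` records it once. Making duplicate removal a separate, explicit step is one way to model the cost trade-off the abstract mentions.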

A straightforward formalization of the relational model

Information Systems, 1985

There has been a lot of recent interest in the formalization of the relational data model (RDM). Many approaches may be characterized as ones oriented mainly towards declaring the components of the RDM and their interrelationships. Other approaches also provide a tool for manipulating the components of the RDM so that research topics on the model can be specified exactly. Usually the latter approaches are based on formal specification methods such as denotational semantics or abstract data types. However, some in the database community find them quite complex and cumbersome. The approach of this paper is of the latter kind. However, special attention is paid to avoiding the complexity of the formal specification methods: our notations and definitions are based on set theory. We attempt to provide an exact, convenient and general tool for specifications and proofs concerning various topics like relational query languages, query optimization, relational database restructuring, database design, etc.