A step towards genuine declarative language-integrated queries

Optimisation of language-integrated queries by query unnesting

Computer Languages, Systems & Structures, 2017

Native functional-style querying extensions for programming languages (e.g., LINQ or Java 8 streams) are widely considered declarative. However, their very limited degree of optimisation when processing local collections contradicts this view. We show that developers who construct complex LINQ queries or combine queries expose themselves to the risk of severe performance deterioration. For an inexperienced programmer, arriving at an appropriate query form can be too complicated. Moreover, manual query transformation, although justified by the need to improve performance, comes at the expense of reflecting the actual business goal. As a result, the benefits of a declarative form and an increased level of abstraction are lost. In this paper, we claim that transferring selected automated optimisation methods developed for declarative query languages to the level of imperative programming languages is possible and desirable. Our approach is based on the assumption that a programmer is able to distinguish whether a language-integrated query is intentionally used to introduce side effects or whether its sole purpose is to query the data. We propose two optimisation procedures based on query unnesting, designed to avoid unnecessary repeated calculations in collection-processing constructs based on higher-order functions. We have implemented and verified this idea as a simple proof-of-concept LINQ optimiser library.

Highlights
• Selected optimisation methods from query languages work with imperative languages.
• Optimisation relies on the analysis of higher-order functions.
• Factoring out of free expressions avoids unnecessary multiple calculations.
• The volatile indexing procedure performs unnesting of correlated queries.
• A proof-of-concept LINQ optimiser with results for sample queries is presented.
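The idea of factoring out a free expression can be illustrated with a minimal Python sketch. This is not the paper's LINQ optimiser: the names `Product`, `max_price`, and the `calls` counter are hypothetical and exist only for illustration. The sketch assumes, as the abstract does, that the subquery is a pure query with no intended side effects (the counter aside), so it can safely be evaluated once outside the outer iteration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    name: str
    price: float


# Hypothetical counter used only to make repeated evaluation visible.
calls = {"max_price": 0}


def max_price(products):
    """Free subquery: its result does not depend on the outer loop variable."""
    calls["max_price"] += 1
    return max(p.price for p in products)


def most_expensive_naive(products):
    # The subquery is re-evaluated for every element of the outer iteration.
    return [p.name for p in products if p.price == max_price(products)]


def most_expensive_factored(products):
    # The free expression is computed once, before the outer iteration starts.
    top = max_price(products)
    return [p.name for p in products if p.price == top]
```

Both functions return the same names, but for n products the naive form evaluates the subquery n times while the factored form evaluates it once.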

Aggregation in functional query languages

Journal of Functional and Logic …, 2004

We consider the problem of improving the computational efficiency of a functional query language. Our focus is on aggregate operations which have proven to be of practical interest in database querying. Since aggregate operations are typically non-monotonic in nature, recursive programs making use of aggregate operations must be suitably restricted in order that they have a well-defined meaning. In a recent paper we showed that partial-order clauses provide a well-structured means of formulating such queries. The present paper extends earlier work in exploring the notion of declarative pruning. By "declarative pruning" we mean that the programmer can specify declarative information about certain functions in the program without altering the meanings of these functions. Using this information, our proposed execution model provides for more efficient program execution. Essentially we require that certain domains must be totally-ordered (as opposed to being partially-ordered). Given this information, we show how the search space of solutions can be pruned efficiently. The paper presents examples illustrating the language and its computation model, and also presents a formal operational semantics. * This is a revised and expanded version of the paper

Towards Algebraic Query Optimisation for XQuery

Lecture Notes in Computer Science, 2006

XML-based databases have become a major area of interest in database research. Abstractly speaking, they can be considered as a resurrection of complex-value databases using constructors for records, lists, unions plus optionality and references. XQuery has become the standard query language for XML. As XQuery is a declarative query language, the problem of query optimisation arises. In this paper an algebraic approach to query optimisation is introduced. This is based on a translation of XQuery into a query algebra for rational tree types. The algebra uses simple operations on types and structural recursion for lists. The translation exploits linguistic reflection for the type-safe expansion of path expressions. The availability of an algebraic representation of queries permits query rewriting, which in combination with cost heuristics allows queries to be optimised.

Optimization of object-oriented queries addressing large and small collections

2009 International Multiconference on Computer Science and Information Technology, 2009

When a query jointly addresses very large and very small collections, it may happen that an iteration caused by a query operator is driven by a large collection and in each cycle evaluates a subquery that depends on an element of a small collection. For each such element the result returned by the subquery is the same. In effect, such a subquery is unnecessarily evaluated many times. The optimization rewrites such a query to reverse the situation: the loop is performed over the small collection, and inside each of its cycles a subquery addressing the large collection is evaluated. We illustrate the method on comprehensive examples and then present the general rewriting rule. The research follows the Stack-Based Approach to query languages, which has its roots in the semantics of programming languages. The optimization method consists in analyzing the scoping and binding rules for names occurring in queries.
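The loop reversal described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's general rewriting rule or its Stack-Based Approach machinery: `Employee`, `Dept`, `dept_summary`, and the `calls` counter are hypothetical names, and `dept_summary` stands in for a costly, side-effect-free subquery that depends only on an element of the small collection.

```python
from collections import namedtuple

Employee = namedtuple("Employee", "name dept")   # element of the large collection
Dept = namedtuple("Dept", "name budget")         # element of the small collection

# Hypothetical counter used only to make repeated evaluation visible.
calls = {"summary": 0}


def dept_summary(dept):
    """Stand-in for a costly subquery depending only on a small-collection element."""
    calls["summary"] += 1
    return f"{dept.name}:{dept.budget}"


def driven_by_large(employees, depts):
    # Outer loop over the large collection: the subquery is evaluated
    # again for every matching (employee, dept) pair.
    return [(e.name, dept_summary(d))
            for e in employees
            for d in depts
            if d.name == e.dept]


def driven_by_small(employees, depts):
    # Outer loop over the small collection: the subquery is evaluated
    # once per dept and its result reused for all matching employees.
    result = []
    for d in depts:
        summary = dept_summary(d)
        result.extend((e.name, summary) for e in employees if e.dept == d.name)
    return result
```

The two forms return the same pairs, though possibly in a different order, so the rewrite in this sketch is only valid when the result is treated as an unordered collection.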

Reify your collection queries for modularity and speed!

Proceedings of the 12th annual international conference on Aspect-oriented software development - AOSD '13, 2013

Modularity and efficiency are often contradicting requirements, such that programmers have to trade one for the other. We analyze this dilemma in the context of programs operating on collections. Performance-critical code using collections often needs to be hand-optimized, leading to non-modular, brittle, and redundant code. In principle, this dilemma could be avoided by automatic collection-specific optimizations, such as fusion of collection traversals, usage of indexing, or reordering of filters. Unfortunately, it is not obvious how to encode such optimizations in terms of ordinary collection APIs, because the program operating on the collections is not reified and hence cannot be analyzed.
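To make the notion of a reified query concrete, here is a small Python sketch in which a query is represented as data (a tree of `Source`, `Map`, and `Filter` nodes) rather than executed directly, so that an optimiser can inspect and rewrite it. All names here are hypothetical and this is not the paper's library or API; the only rewrite shown is fusion of adjacent map traversals, one of the optimizations the abstract mentions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Source:
    items: tuple


@dataclass(frozen=True)
class Map:
    fn: Callable
    inner: object


@dataclass(frozen=True)
class Filter:
    pred: Callable
    inner: object


def optimise(q):
    """Fuse Map(f, Map(g, x)) into one Map; other nodes are rebuilt unchanged."""
    if isinstance(q, Source):
        return q
    inner = optimise(q.inner)
    if isinstance(q, Map):
        if isinstance(inner, Map):
            f, g = q.fn, inner.fn
            # One traversal applying g then f replaces two traversals.
            return Map(lambda x: f(g(x)), inner.inner)
        return Map(q.fn, inner)
    return Filter(q.pred, inner)


def run(q):
    """Naive interpreter: one list traversal per query node."""
    if isinstance(q, Source):
        return list(q.items)
    if isinstance(q, Map):
        return [q.fn(x) for x in run(q.inner)]
    return [x for x in run(q.inner) if q.pred(x)]


def depth(q):
    """Number of operator nodes above the source."""
    return 0 if isinstance(q, Source) else 1 + depth(q.inner)
```

Because the query is an ordinary value, `optimise` can be applied before `run` without changing the code that built the query, which is the modularity benefit the abstract points to. The fusion step assumes the mapped functions have no side effects whose order matters.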

Patterns for Program Query Optimisation

Operations on data can be classified as either queries or updates. Modern object-oriented programming languages require classes/interfaces to support a predefined set of queries. This presents a challenge for software designers, since a fixed interface can severely restrict the opportunities for optimisation. In this paper, we present two common patterns for optimising queries. The first requires specific knowledge of which query to optimise beforehand, whilst the second provides more leeway in this regard. These patterns occur commonly in software, and we find numerous instances of them within the Java standard libraries.

Query Optimization to Improve Performance of the Code Execution

Object-Oriented Programming (OOP) is one of the most successful techniques for abstraction. Bundling objects together into collections, and then operating on these collections, is a fundamental part of mainstream object-oriented programming languages. Object querying is an abstraction of operations over collections, whereas manual implementations are written at a low level, which forces developers to specify how a task must be done. Some object-oriented languages allow programmers to express queries explicitly in the code, and these queries are optimized using query optimization techniques from the database domain. In this regard, we have developed a technique that performs query optimization at compile time, reducing the burden of optimization at run time and improving the performance of code execution.

Fast Frequent Querying with Lazy Control Flow Compilation

Theory and Practice of Logic Programming, 2007

Control flow compilation is a hybrid between classical WAM compilation and meta-call, limited to the compilation of non-recursive clause bodies. This approach is used successfully for the execution of dynamically generated queries in an inductive logic programming (ILP) setting. Control flow compilation reduces compilation times by up to an order of magnitude, without slowing down execution. A lazy variant of control flow compilation is also presented. By compiling code by need, it removes the overhead of compiling unreached code (a frequent phenomenon in practical ILP settings), and thus reduces the size of the compiled code. Both dynamic compilation approaches have been implemented and were combined with query packs, an efficient ILP execution mechanism. It turns out that locality of data and code is important for performance. The experiments reported in the paper show that lazy control flow compilation is superior in both artificial and real-life settings.