Logic Programming in HANSEI (original) (raw)

We introduce logic programming in Hansei on the classical example of the appendrelation. Append relates three lists l1, l2 and l3 such that l3 is the concatenation of l1 and l2. All throughout we take our lists to be lists of booleans. We will contrast Prolog code with Hansei code, developing both incrementally and interactively. The responses of Prolog resp. OCaml top-level are indented.

In Prolog, the append relation is stated as:

 append([],L,L).
 append([H|T],L,[H|R]) :- append(T,L,R).

declaring that an empty list is a prefix of any list, a list is a suffix of itself, and prepending to the prefix of a list prepends to the list. Certainly append can concatenate two lists:

 ?- append([t,t,t],[f,f],X).
      X = [t, t, t, f, f].

By passing a free logic variable as one of the two first arguments, we can concatenate not fully-determined lists:

 ?- append([t,t,t],X,R).
      R = [t, t, t|X].

We see the single result, which stands for every list with the prefix [t,t,t]. Such a compact representation of an infinite set is an asset. Alas, it is not always available, and is often a mirage. For example, if we pass a free variable as the first argument ofappend, Prolog responds differently:

 ?- append(X,[f,f],R).
      R = [f, f] ;
      R = [_G328, f, f] ;
      R = [_G328, _G334, f, f] ;
      R = [_G328, _G334, _G340, f, f] .

with an infinite set of answers, to be enumerated indefinitely. It may not be fully apparent that our Prolog code has not faithfully represented the original problem: to relate boolean lists. The last two answers describe lists of more than mere booleans. We have to impose the restriction, by defining the type predicates

 bool(t).
 bool(f).
 boollist([]).
 boollist([H|T]) :- bool(H), boollist(T).

and re-writing our queries:

 ?- append([t,t,t],X,R), boollist(X), boollist(R).
      R = [t, t, t] ;
      R = [t, t, t, t] ;
      R = [t, t, t, t, t] ;
      R = [t, t, t, t, t, t] ;
      R = [t, t, t, t, t, t, t, t, t] .
 
 ?- append(X,[f,f],R), boollist(X), boollist(R).
      R = [f, f] ;
      R = [t, f, f] ;
      R = [f, f, f] .

One of boollist(X) or boollist(R) would have been enough: if X is a boolean list, so is R. Alas, Prolog has no type inference and is unable to infer or take advantage of that fact. To be safe, we add both predicates. The compact representation for the lists with the [t, t, t] prefix has disappeared. More seriously, the default depth-first search strategy of Prolog gives us only half of the answers; we won't ever see the lists with an f after the first three t.

We switch to Hansei. In a typed language, types -- the specification of the problem -- come first. We start by defining the type blist of boolean lists with a non-deterministic spine, along with the convenient constructors:

 type bl = Nil | Cons of bool * blist
 and blist = unit -> bl;;
 
 let nil  : blist = fun () -> Nil;;
 let cons : bool -> blist -> blist = fun h t () -> Cons (h,t);;

and the conversion to ordinary, fully deterministic lists, to see the results.

 let rec list_of_blist bl = match bl () with
   | Nil        -> []
   | Cons (h,t) -> h:: list_of_blist t;;
 
     val list_of_blist : blist -> bool list = <fun>

We now define append, seemingly as a function, in the fully standard, also declarative, way.

 let rec append l1 l2 = match l1 () with
   | Nil        -> l2
   | Cons (h,t) -> cons h (fun () -> append t l2 ());;
 
     val append : blist -> blist -> blist = <fun>

An attempt to concatenate two sample lists:

 let t3 = cons true (cons true (cons true nil));;
 let f2 = cons false (cons false nil);;
 append t3 f2;;
     - : blist = <fun>

turns out not informative. Recall that we use the Hansei library to build a probabilistic model, which we then have to run. Running the model determines the set of possible worlds consistent with the probabilistic model: the model of the model. The set of model outputs in these worlds is the set of answers. Hansei offers a number of ways to run models and obtain the answers and their weights. We will be using iterative deepening:

 val reify_part : int option -> (unit -> 'a) -> (Ptypes.prob * 'a) list

The first argument is the depth search bound (infinite, if None).

 reify_part None (fun () -> list_of_blist (append t3 f2));;
     - : (Ptypes.prob * bool list) list = [(1., [true; true; true; false; false])]

Running our first append model has given the expected result, with the expected weight. Following along the Prolog code, we turn to appending to an indeterminate list.

Indeterminate lists are represented by generators. Hence, we need generators for bool and bool list domains. The generators must use letlazy, to delay the generation until the value is needed; once the value has been determined, it stays the same.

 let a_boolean () = letlazy (fun () -> flip 0.5);;
     val a_boolean : unit -> unit -> bool = <fun>
 let rec a_blist () = letlazy (fun () -> dist [(0.5, Nil); (0.5, Cons(flip 0.5, a_blist ()))]);;
     val a_blist : unit -> blist = <fun>

Let us see a sample of generated values, as a spot-check:

 reify_part (Some 5) (fun() -> list_of_blist (a_blist ()));;
     - : (Ptypes.prob * bool list) list =
       [(0.5,     []); 
        (0.125,   [false]); 
        (0.03125, [false; false]);
        (0.03125, [false; true]); 
        (0.125,   [true]);
        (0.03125, [true; false]);
        (0.03125, [true; true])]

The lists of different lengths have different probabilities, or different likelihoods of encountering them. We could use the accumulated probability mass as the measure of the explored search space, terminating the search once a threshold is reached. One is reminded of probabilistic algorithms such as the Rabin-Miller primality test.

We are now ready to reproduce the Prolog code, passing a `free' variable as an argument to append and obtaining a sequence of lists with the given prefix or suffix. Unlike Prolog, we are not stuck in a corner of the search space. We generate all lists with the given prefix [true; true; true].

 reify_part (Some 5) (fun() ->
  let x = a_blist () in
  list_of_blist (append t3 x));;
     - : (Ptypes.prob * bool list) list =
      [(0.5,     [true; true; true]); 
       (0.125,   [true; true; true; false]);
       (0.03125, [true; true; true; false; false]);
       (0.03125, [true; true; true; false; true]);
       (0.125,   [true; true; true; true]);
       (0.03125, [true; true; true; true; false]);
       (0.03125, [true; true; true; true; true])]
 
 reify_part (Some 5) (fun() ->
  let x = a_blist () in
  list_of_blist (append x f2));;
     - : (Ptypes.prob * bool list) list =
      [(0.5,     [false; false]); 
       (0.125,   [false; false; false]);
       (0.03125, [false; false; false; false]);
       (0.03125, [false; true; false; false]); 
       (0.125,   [true; false; false]);
       (0.03125, [true; false; false; false]);
       (0.03125, [true; true; false; false])]

Prolog's append is so elegant because it defines a relation among three lists. One can specify any two lists and query for the other one that makes the relation hold. For example, we check if a given list has a given prefix, and if so, remove it:

 ?- append([t,t],X,[t,t,t,f,f]).
     X = [t, f, f].

We can do the same in Hansei, effectively `running append backwards.' We introduce a sample list to deconstruct and a function two compare two lists. It is an ordinary function, not a built-in.

 let t3f2 = append t3 f2;;
 
 let rec bl_compare l1 l2 = match (l1 (), l2 ()) with
  | (Nil,Nil)  -> true
  | (Cons (h1,t1), Cons (h2,t2)) -> h1 = h2 && bl_compare t1 t2
  | _          -> false;;
     val bl_compare : blist -> blist -> bool = <fun>

The prefix-removal example is as follows. In words: we assert that there exists a boolean list x such that appending it to [true; true] yields the list t3f2. We ask Hansei to find all possible worlds in which the assertion holds and report the value of x in these worlds.

 reify_part None (fun() ->
  let x = a_blist () in
  let r = append (cons true (cons true nil)) x in
  if not (bl_compare r t3f2) then fail ();
  list_of_blist x);;
 
     - : (Ptypes.prob * bool list) list = [(0.0078125, [true; false; false])]

Since the search bound was None, the returned single result came from the exhaustive enumeration of the entire search space. The search space is indeed infinite, but amenable to efficient exploration because most of the possible worlds are inconsistent with the assertion and could be rejected wholesale, without generating them. For example, when comparing an indeterminate list to an empty list, bl_compare returns the result after forcing a single choice. One can tell if a list is empty by checking for the first Cons cell. One does not need to know how many else elements there may be in the list. In general, if the lists are not equal, it can be determined by examining their finite prefix. Furthermore, if one of bl_compare's arguments has a fixed length (as is the case in our example: t3f2 is the list of 5 elements), the comparison finishes after forcing finitely many choices. The ability of bl_compare to deal with partially determined lists relates it to unification: unifying [X|Y] with the empty list fails without the need to instantiate X or Y, in effect performing the comparison with the infinite set of concrete lists.

We can likewise use append to check for, and remove, a given suffix. We can also split a given list in all possible ways, returning its prefixes and suffixes. If the list is finite, we obtain the finite number of answers, in Prolog:

 ?- append(X,Y,[t,t,t,f,f]).
     X = [],  Y = [t, t, t, f, f] ;
     X = [t], Y = [t, t, f, f] ;
     X = [t, t], Y = [t, f, f] ;
     X = [t, t, t], Y = [f, f] ;
     X = [t, t, t, f], Y = [f] ;
     X = [t, t, t, f, f], Y = [] ;
     false.

and, similarly, in Hansei:

 reify_part None (fun() ->
  let x = a_blist () in
  let y = a_blist () in
  let r = append x y in
  if not (bl_compare r t3f2) then fail ();
  (list_of_blist x, list_of_blist y));;
 
     - : (Ptypes.prob * (bool list * bool list)) list =
      [(0.000244140625, ([], [true; true; true; false; false]));
       (0.000244140625, ([true], [true; true; false; false]));
       (0.000244140625, ([true; true], [true; false; false]));
       (0.000244140625, ([true; true; true], [false; false]));
       (0.000244140625, ([true; true; true; false], [false]));
       (0.000244140625, ([true; true; true; false; false], []))]

The Hansei code follows the demonstrated pattern, of building a probabilistic model returning a pair of two `random' listsx and y, conditioned upon the asserted evidence that their concatenation is t3f2. In the above code, the OCaml variable x is used twice. Although it is bound to a generator, its two use occurrences refer to the same generated value -- thanks to letlazy in the definition of a_blist.

Our final example is last, relating a list to its last element. We contrast Prolog code

 last(E,L) :- append(_,[E],L).
 
 ?- last(E,[t,t,f]).
 E = f ;

to Hansei code:

 let last bl = 
  let x = a_blist () in
  let e = a_boolean () in
  if not (bl_compare (append x (cons (e ()) nil)) bl) then fail ();
  e;;
     val last : blist -> unit -> bool = <fun>
 
 reify_part None (fun() -> last t3f2 ());;
     - : (Ptypes.prob * bool) list = [(0.0009765625, false)]

There are still open questions. The following code produces the result with no restrictions on the search space:

 reify_part None (fun () ->
  let x = a_blist () in
  let y = a_blist () in
  if not (bl_compare x t3f2) then fail ();
  if not (bl_compare x y) then fail ();
  list_of_blist y);;
     - : (Ptypes.prob * bool list) list = [(2.384185791015625e-07, [true; true; true; false; false])]

A slightly modified code, with (bl_compare x y) first, although gives the same result, requires the restriction on the search depth. The search space can no longer be efficiently enumerated. Can we automatically reorder the comparison statements, to improve the search? Another open problem is the `occurs check.' Its absence in Hansei again necessitates the restriction on the search space. None of the two open issues are fatal: restricting the search by depth or by the remaining probability mass (effectively, by the probability of missing a solution if one exists) is always possible and even natural. It is nevertheless interesting to see if the occurs check and the unification of two unbound variables have a clear analogue in the variable-less case.