\input itemize
\def\title{Binary tree insertion}
\def\topofcontents{\null\vfill
\titlefalse % include headline on the contents page
\def\rheader{\hfil}
\centerline{\titlefont Inserting into an ordered binary tree}
\vfill}
@*Preliminaries.
Our object is to design a verified binary tree insertion routine.
To make our lives easier, we will employ some simplifications:
\itemize
\item We will use iteration instead of recursion.
\item We will use Dijkstra's language of guarded commands
\item We will assume the existence of a tree data type, such that
if |t| is a tree either |t| is empty (|t=emptyset|) or
|t| has a left subtree, a right subtree, and a datum
(|t=<>|). If |t=<>|, we may write |t.l| for |l|,
|t.d| for |d|, and |t.r| for |r|.
\item We will assume sequences; if |s| is a sequence then either |s|
is empty (|s=empty|) or |s| is an element followed by a sequence (|s=z
Z|).
\item We have a membership operator |member| that will test for
membership in sequences or trees.
\enditemize
@ We want to talk about ordered binary trees, so we'll want the
notion.
@c
ordered.emptyset == true;
ordered.<>==ordered.l & ordered.r &@|
(forall y:y member l: y<=d) &@| (forall y: y member r: y>=d);
@ Our mission is to insert a datum |x| into a binary tree |T| such that
the resulting tree is ordered.
We'll do this by creating a new tree |t| which is ordered and the
membership of which is the union of the membership of |t| with
$\{x\}$.
We don't care exactly what happens if |x| is already in the tree,
although we could add a postcondition |x member T ==> T=t|.
If we state our problem formally, we have
@c
{PRE: ordered.T}
{POST: (forall y::y member t <=> y member T | y=x) & ordered.t}
@ Let us develop some notion of ``insertion at a node.''
Perhaps we can use that as a guide to weakening the postcondition and
finding a loop invariant.
In essence what we will want to do is select some empty subtree of
|T|, and replace it with the tree |<>|.
Let us generalize, and imagine that we have a variable |s| that
represents |T| and points at some node of |T| ``to be replaced.''
Let |attach.t.s| denote the result of substituting |T| for the subtree
whose root is that node.
In particular if |s| points to the root of |T|, then |attach.t.s=t|.
@ How can we compute |attach|? We've already seen that if |s| points
to the root of |T|, then |attach.t.s=t|.
Suppose |s| does not point to the root of |T|.
Then there is an |s`| that points to the parent of the node pointed to by
|s|.
Then, there is some |t`| such that |attach.t.s=attach.t`.s`|, and,
furthermore, either |t`=<>| or |t`=<__>|, for some |u| and
|d|.
So, let |s| be a sequence of records, and let each record contain:
\itemize
\item A choice (|left| or |right|)
\item A datum |d|
\item A tree |u|
\enditemize
and define
@c
attach.t.empty == t;
attach.t.(<> s) == attach.<>.s;
attach.t.(<> s) == attach.<____>.s;
@ Now we can imagine our insertion problem as being broken into two
parts.
First, we find the right place to insert |x|.
This means computing an |s| such that
|attach.<>.s| is what we want.
Second, we have to compute |t| so that
|t=attach.<>.s|.
Let's imagine that we already have an |s|.
Then, we can change our postcondition by substituting |attach.t.s| for
|t|, and make that our loop invariant.
We then start out with |t=<>|, and write a loop
that adds to |t| while removing from |s|.
Since |s=empty| at the end of the loop, and since |attach.t.empty=t|, the
invariant conjoined with the negation of the guard gives us the
postcondition.
I leave it for the reader to show that the loop body leaves the
quantity |attach.t.s| unchanged.
@c
t := <>;
{invariant (forall y::y member attach.t.s <=> y member T | y=x) &
ordered.(attach.t.s)}
{bound #s}
do s != empty ->
let choice, data, tree, S satisfy <> S = s;
if choice = left ->
t,s := <>, S
[] choice = right ->
t,s := <>, S
fi
od
{POST: (forall y::y member t <=> y member T | y=x) & ordered.t}
@ It remains for us to compute a suitable |s| in the first part of the
program.
Let us write |X| for |<>|.
Then we must assign to |s|, establishing
$$\hbox{| (forall y::y member attach.X.s <=> y member T
|| y=x) & ordered.(attach.X.s)|.%
}$$
Since |y=x <=> y member X|, that is equivalent to
$$\hbox{
| (forall y::y member attach.X.s <=> y member T || y member X) &
ordered.(attach.X.s)|.}$$
Let us imagine we have a variable |t| such that |T=attach.t.s|.
Then we need
$$\hbox{|y member attach.X.s <=> y member attach.t.s || y member X|}$$
Now we make use of a property of |attach|, viz.\
|(forall X::y member attach.X.s <=> y member X || y member
attach.emptyset.s)|.
(The proof is by induction on the length of |s|.)
From this we can get $$ \hbox{
|t=emptyset ==> (forall y::y member attach.X.s <=> y member
attach.t.s || y member X)|.}\eqno(1)$$
This suggests the following code fragment:
@c
t,s := T,empty;
{invariant T=attach.t.s & ordered.(attach.X.s)}
do t != emptyset ->
/* loop body */@;
od
{t=emptyset & T=attach.t.s & ordered.(attach.X.s)}
/* which, by (1), implies */
{(forall y::y member attach.X.s <=> y member T | y=x) & ordered.(attach.X.s)}
@ The question is now what loop body will maintain the invariant.
Since in our earlier loop we made |t| larger and |s| smaller, we can
imagine inverting that loop to get the new loop.
We also invert the proof that the value of |attach.t.s| remains
unchanged.
@c
{invariant T=attach.t.s & ordered.(attach.X.s)}
do t != emptyset ->
if <<@tfirst guard@>>>@; ->
t,s := t.l, <> s
[] <<@tsecond guard@>>>@; ->
t,s := t.r, <> s
fi
od
@ The question remains as to what the guards must be to make the whole
thing work.
Taking the first branch, the weakest precondition of
|ordered.(attach.X.s)| is
@c
ordered.(attach.X.(<> s))
!<=>
ordered.(attach.<>.s)
!<= /* by a lemma to follow */
ordered.(attach.X.s) &
ordered.(attach.t.s) &
x <= t.d
!<=
ordered.(attach.X.s) &
ordered.T &
T=attach.t.s &
x <= t.d
@ So if we strengthen our invariant to include |ordered.T|, we can
make the first guard |x<=t.d|, and the second guard |x>=t.d|, giving
@c
{invariant T=attach.t.s & ordered.(attach.X.s) & ordered.T}
{bound depth.t}
do t != emptyset ->
if x <= t.d ->
t,s := t.l, <> s
[] x >= t.d ->
t,s := t.r, <> s
fi
od
{t=emptyset & T=attach.t.s & ordered.(attach.X.s)}
/* which, by (1), implies */
{(forall y::y member attach.X.s <=> y member T | y=x) & ordered.(attach.X.s)}
@ Now we have to finish the proof we started earlier.
We use a clever trick involving inorder traversals.
For reference, the inorder traversal is defined by
$$\eqalign{
|in.emptyset|&|==empty|\cr
|in.<>|&|==in.l d in.r|\cr
}$$
Suppose that
$$ (\forall s::( \exists l,r:: (\forall t::
|in.(attach.t.s)=l in.t r|))) \eqno (2)$$
(which we will show in just a moment).
Given |s|, choose such an |l| and |r|. Then we have the lemma we need:
@c
ordered.(attach.<>.s)
!<=> /* by (2) */
sorted.(l x t.d in.(t.r) r)
!<=
sorted.(l x r) &
sorted.(l in.(t.l) t.d in.(t.r) r) &
x <= t.d
!<=>
ordered.(attach.X.s) &
ordered.(attach.t.s) &
x <= t.d
@ We prove (2) by induction on the length of |s|.
If |s=empty|, then |l=empty| and |r=empty| satisfy |in.(attach.t.s)=l
in.t r|.
If |s!=empty|, then write |s=z Z|.
Our induction hypothesis is |(exists l`,r`:: (forall
t::in.(attach.t.Z)=l` in.t r`))|.
If |z=<>|, then let |l=l`| and |r=data in.tree r`|.
Then, for any |t|,
$$\eqalign{
in.(attach.t.s) &=
|in.(attach.t.(<> Z))|\cr
&=|in.(attach.<>.Z)|\cr
&=|l` in.t data in.tree r`|\cr
&=|l in.t r|,\cr}$$
and that's the induction step.
@*The finished program.
Here we put the whole program together:
@c
{PRE:ordered.T}
t,s,X := T, empty, <>;
{invariant T=attach.t.s & ordered.(attach.X.s) & ordered.T}
{bound depth.t}
do t != emptyset ->
if x <= t.d ->
t,s := t.l, <> s
[] x >= t.d ->
t,s := t.r, <> s
fi
od;
{(forall y::y member attach.X.s <=> y member T | y=x) & ordered.(attach.X.s)}
t := X;
{invariant (forall y::y member attach.t.s <=> y member T | y=x) &
ordered.(attach.t.s)}
{bound #s}
do s != empty ->
let choice, data, tree, S satisfy <> S = s;
if choice = left ->
t,s := <>, S
[] choice = right ->
t,s := <>, S
fi
od
{POST: (forall y::y member t <=> y member T | y=x) & ordered.t}
@*Index.
__