This is an introduction to a (mathematical) theory of recursion. In addition to being of theoretical interest, it also helps with understanding code like
ns = 0 : map (+ 1) ns
fs = 1 : 1 : zipWith (+) fs (tail fs)
Although in principle you could trace lazy evaluation step by step, doing so would be a big mess in this case. The theory here offers a more practical way, shown below. Reason: Finding the answer seldom requires following execution steps; there is almost always a higher-level way.
On the flip side, this high-level theory ignores time and space complexities; those would be the strong suit of following execution steps.
Every type is modelled as a set consisting of the usual values you expect, plus an additional element we write as “⊥” to stand for “undefined” or “unknown”, so that “f 5 is undefined” is expressible as “f 5 = ⊥”. Some types have additional elements; we will get there. People also pronounce ⊥ as “bottom”.
Beginners can interpret ⊥ as non-termination and/or runtime error. But eventually you should accept the fact that this theory only addresses “will there be an answer? if so, what is it?”, so ⊥ means only “no answer”. Since this theory does not address execution steps, “non-termination” does not have a meaning here.
(The full theory upgrades “set” to “set with partial order”. ⊥ is the lowest element in the partial order, hence “bottom”. The partial order justifies the magical methods in this lecture.)
Example: Integer is {⊥}∪ℤ.
Unsurprising without fields:
data Uno = MkUno
data Bool = False | True
data Trio = T1 | T2 | T3
The sets are respectively
{⊥, MkUno}
{⊥, False, True}
{⊥, T1, T2, T3}
More interesting when you have fields:
data E = C1 Bool | C2 Integer
The set is
{⊥} ∪ {C1 b | b ∈ {⊥, False, True}} ∪ {C2 i | i ∈ {⊥}∪ℤ}
Note that C1 ⊥ and C2 ⊥ are possible, in addition to ⊥. Idea: Knowing which of the two cases you are in is a little bit better than not even knowing.
data B2 = MkB2 Bool Bool
The set is
{⊥} ∪ {MkB2 b c | b∈{⊥, False, True}, c∈{⊥, False, True}}
In particular, we assert MkB2 ⊥ ⊥ ≠ ⊥. Idea: Some programs successfully output the constructor and just have problems with the fields; some other programs have so many problems that they can’t even output that little.
Recursive types have the above plus infinitely long/big values, e.g., infinite lists. Using [Bool] for example:
data [a] = [] | a : [a]
-- equiv to: data List a = Nil | Cons a (List a)
Here are some of the extra elements, hopefully you can extrapolate the rest:
⊥
⊥ : ⊥
False : ⊥
⊥ : []
⊥ : ⊥ : ⊥
⊥ : ⊥ : []
⊥ : False : ⊥
⊥ : ⊥ : ⊥ : … infinite list of undefined Bools
False : ⊥ : True : ⊥ : … infinite list with some undefined Bools
False : False : False : … infinite list of False’s
In a function type X -> Y, the ⊥ element is the function that maps all inputs to ⊥. Reminder: X and Y themselves have extra elements as described here, e.g., Integer -> Bool becomes {⊥}∪ℤ → {⊥, False, True}. This already has a function that maps all of {⊥}∪ℤ to Bool’s ⊥. We equate it with ⊥ of the function space, i.e., we assert ⊥ = (λx → ⊥).
Many expressions that have ⊥ as subexpressions can evaluate to non-⊥ answers. We have seen examples with algebraic data types, e.g., MkB2 ⊥ ⊥ is considered non-⊥. Many functions with ⊥ as input can give non-⊥ answers too:
const 5 ⊥ = (λx y → x) 5 ⊥ = 5
(λx → MkB2 x False) ⊥ = MkB2 ⊥ False
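These claims can be observed in GHC by letting `undefined` play the role of ⊥ (an approximation for this sketch: forcing `undefined` throws an exception, which we read as “no answer”):

```haskell
data B2 = MkB2 Bool Bool

-- const never forces its second argument, so a ⊥ argument is harmless:
ex1 :: Integer
ex1 = const 5 undefined

-- MkB2 undefined False is non-⊥: we can match the constructor and read
-- the second field without ever forcing the first.
ex2 :: Bool
ex2 = case MkB2 undefined False of MkB2 _ c -> c
```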
But here are the cases in which ⊥ in subexpressions leads to ⊥ answers:
Function application but the function is ⊥: ⊥ x = ⊥
Arithmetic, e.g., ⊥ + 4 = ⊥, ⊥ == 0 = ⊥, even 0 × ⊥ = ⊥
Matching ⊥ against a data constructor pattern, e.g., (case ⊥ of [] → …) = ⊥
Note: Not when the pattern is just a variable: (case ⊥ of var → foo var) = foo ⊥
You see this most in pattern matching in functions. Recall that
map f [] = []
map f (x:xt) = x : map f xt
is shorthand for
map f xs = case xs of [] -> []
x:xt -> x : map f xt
So map f ⊥ = case ⊥ of … = ⊥.
if-then-else and guards, e.g. if ⊥ then x else y = ⊥
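The variable-pattern exception above can also be seen in GHC, again with `undefined` standing in for ⊥:

```haskell
-- A variable pattern matches without forcing the scrutinee:
varPat :: String
varPat = case (undefined :: Integer) of v -> "reached"
-- A constructor pattern, e.g. (case undefined of [] -> ...), would
-- instead force the scrutinee and give no answer.
```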
Evaluation order: In this theory, we are doing math and algebra, so all evaluation orders give the same final answer.
This theory treats recursion by defining a sequence of successive approximations and then taking a “limit”.
If the recursive program goes like
foo = … foo …
then we define the sequence of approximations foo0, foo1, foo2,… by
foo₀   = ⊥
fooₙ₊₁ = … fooₙ …
Idea: fooₖ is like setting the recursion depth quota to k; this is why fooₙ₊₁ may only call fooₙ: each call costs 1 unit of your quota. When your quota runs out, you give up and use ⊥; this is why foo₀ = ⊥.
Then we figure out what the sequence converges to, and that becomes the meaning of foo. Idea: Extrapolate to unlimited recursion depth.
Examples to show it in action:
Program:
zs = 0 : zs
Sequence:
zs₀ = ⊥
zs₁ = 0 : zs₀ = 0 : ⊥
zs₂ = 0 : zs₁ = 0 : 0 : ⊥
zs₃ = 0 : zs₂ = 0 : 0 : 0 : ⊥
Note that zsₙ₊₁ is a compatible improvement upon zsₙ. I mean: where zsₙ has a non-⊥ part, zsₙ₊₁ must agree (compatible); where zsₙ has a ⊥ part, zsₙ₊₁ may improve it to a non-⊥ value (improvement).
“Limit” means the minimal compatible improvement over every zsₖ. The infinite list 0 : 0 : 0 : … does the job, so it is now what zs means. (“Minimal” means: don’t improve a ⊥ part to a non-⊥ value unless you have to.)
Equivalently, think of how zsₖ gives up after k nodes because it hits the recursion depth quota. If we allowed unlimited quota, it would keep going. So the answer is the infinite list.
These ideas can be formalized and proved to hold for all programs.
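The limit view can be checked in GHC: any finite prefix of zs is defined, matching the non-⊥ prefixes of the approximations (a small sketch):

```haskell
zs :: [Integer]
zs = 0 : zs

-- Every finite observation of the limit is defined:
-- take k zs gives k copies of 0, the non-⊥ prefix of zs_k.
```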
Program:
ns = 0 : map (\x -> x + 1) ns
Sequence:
Recall map f ⊥ = ⊥. This will help with working out the sequence.
ns₀ = ⊥
ns₁ = 0 : map (λx → x + 1) ⊥
    = 0 : ⊥
ns₂ = 0 : map (λx → x + 1) (0 : ⊥)
    = 0 : 1 : ⊥
ns₃ = 0 : map (λx → x + 1) (0 : 1 : ⊥)
    = 0 : 1 : 2 : ⊥
ns₄ = 0 : 1 : 2 : 3 : ⊥
The limit is the infinite list 0 : 1 : 2 : 3 : 4 : …
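Again, finite prefixes of the limit are directly observable in GHC:

```haskell
ns :: [Integer]
ns = 0 : map (\v -> v + 1) ns

-- Each element is one more than its predecessor, starting from 0,
-- so any finite prefix is an initial segment of 0, 1, 2, 3, ...
```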
This is a famously confusing example in some Haskell tutorials, but we are now equipped to deconfuse it! Program:
zipPlus [] _ = []
zipPlus _ [] = []
zipPlus (x:xs) (y:ys) = x+y : zipPlus xs ys
tail (_:xs) = xs
fs = 1 : 1 : zipPlus fs (tail fs)
Sequence:
Note that fsₙ₊₁ is defined to be
fsₙ₊₁ = 1 : 1 : zipPlus fsₙ (tail fsₙ)
It can make multiple calls to fsₙ. The quota is on depth, not on a global call count.
fs₀ = ⊥
fs₁ = 1 : 1 : zipPlus ⊥ (tail ⊥)
    = 1 : 1 : ⊥
fs₂ = 1 : 1 : zipPlus (1 : 1 : ⊥) (1 : ⊥)
    = 1 : 1 : 2 : ⊥
fs₃ = 1 : 1 : zipPlus (1 : 1 : 2 : ⊥) (1 : 2 : ⊥)
    = 1 : 1 : 2 : 3 : ⊥
fs₄ = 1 : 1 : zipPlus (1 : 1 : 2 : 3 : ⊥) (1 : 2 : 3 : ⊥)
    = 1 : 1 : 2 : 3 : 5 : ⊥
The limit is the infinite Fibonacci list.
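The program runs as-is in GHC, and finite prefixes of the limit can be sampled:

```haskell
zipPlus :: [Integer] -> [Integer] -> [Integer]
zipPlus [] _ = []
zipPlus _ [] = []
zipPlus (x:xs) (y:ys) = x + y : zipPlus xs ys

-- Each element (from the third on) is the sum of the two before it.
fs :: [Integer]
fs = 1 : 1 : zipPlus fs (tail fs)
```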
Program:
x = 5 * x
Sequence:
x₀ = ⊥
x₁ = 5 * x₀ = ⊥
x₂ = 5 * x₁ = ⊥
Stuck at ⊥, so it converges to ⊥. This is why x = 5 * x does not “solve” the equation the way you might expect!
Program:
b && c = case b of False -> False
True -> c
x = False && x
Sequence:
x₀ = ⊥
x₁ = False && ⊥ = False
x₂ = False && False = False
Converges to False.
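You can observe this one directly in GHC, since the Prelude’s (&&) has exactly the definition above:

```haskell
-- x converges to False: (&&) pattern-matches only its first argument,
-- so the recursive occurrence of x is never forced.
x :: Bool
x = False && x
```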
Program:
f :: Integer -> Integer
f x = if x==0 then 0 else 1 + f (x - 2)
Sequence:
Reminder: The approximation method translates to:
f₀ x   = ⊥
fₙ₊₁ x = if x==0 then 0 else 1 + fₙ (x - 2)
In this kind of work, it is best to work out each fₙ for all inputs before tackling fₙ₊₁, writing the results as a table of inputs and outputs, or as a math formula that summarizes the table—as opposed to working out, say, f₂ 10 alone, only to realize later that you also need f₂ 8 and despair.
Table:
| x    | ⊥ | … | -2 | -1 | 0 | 1 | 2 | 3 | 4 | … |
|------|---|---|----|----|---|---|---|---|---|---|
| f₀ x | ⊥ | … | ⊥  | ⊥  | ⊥ | ⊥ | ⊥ | ⊥ | ⊥ | … |
| f₁ x | ⊥ | … | ⊥  | ⊥  | 0 | ⊥ | ⊥ | ⊥ | ⊥ | … |
| f₂ x | ⊥ | … | ⊥  | ⊥  | 0 | ⊥ | 1 | ⊥ | ⊥ | … |
| f₃ x | ⊥ | … | ⊥  | ⊥  | 0 | ⊥ | 1 | ⊥ | 2 | … |
Brief illustration of how to obtain some of those entries:
f₀ x = ⊥                 regardless of x
f₁ ⊥ = if ⊥==0 …
     = ⊥
f₁ 0 = 0
f₁ x = 1 + f₀ (x-2)      for other x’s
     = 1 + ⊥
     = ⊥
f₂ ⊥ = ⊥
f₂ 0 = 0
f₂ 2 = 1 + f₁ 0
     = 1
f₂ x = 1 + f₁ (x-2)      for other x’s
     = 1 + ⊥
     = ⊥
Formula:
fₖ x = ⊥      if x = ⊥, x < 0, or x is odd
       x÷2    if 0 ≤ x < 2k and x is even
       ⊥      if x ≥ 2k
Remark: Since this is doing math, any common math notation or description that gets your point across is OK. So my “if x=⊥” uses mathematical equality, not the computer’s ==; for example, 0=⊥ is well-defined and is false (whereas 0==⊥ is undefined, ⊥).
Extrapolating, the limit is
f x = ⊥      if x = ⊥, x < 0, or x is odd
      x÷2    if x ≥ 0 and x is even
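The limit formula can be checked against the actual Haskell function, sampling only the inputs where the formula promises a non-⊥ answer (even nonnegative x; the other inputs would give no answer, so we don’t force those):

```haskell
f :: Integer -> Integer
f x = if x == 0 then 0 else 1 + f (x - 2)

-- On even nonnegative inputs, f x counts the steps of 2 down to 0,
-- i.e., f x = x div 2, agreeing with the limit formula.
```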
The set corresponding to a type is given a partial order. The notation is usually ⊑. I pronounce it “under”. (And confusingly overload it for all types/sets.)
Partial order means: ⊑ is reflexive (x ⊑ x), antisymmetric (x ⊑ y and y ⊑ x imply x = y), and transitive (x ⊑ y and y ⊑ z imply x ⊑ z).
Difference from total order: Two values may be incomparable; we may have both x ⋢ y and y ⋢ x.
Reason: Want to interpret x ⊑ y as: either they have the same information, or y has more information—a compatible improvement upon x. Hence people say “information order”.
Idea: fooₙ₊₁ can give more information than fooₙ because fooₙ₊₁ works harder:
⊥ is interpreted as completely unknown, so ⊥ ⊑ everyone.
A program may output ⊥ : ⊥ under a low quota, and 0 : 1 : ⊥ under a higher quota. We will want
⊥ : ⊥ ⊑ 0 : (1 : ⊥)
But we want both 0 ⋢ 1 and 1 ⋢ 0 because: Suppose a program under a low quota outputs 0. It would be ridiculous if, under a higher quota, it changes its mind and outputs 1. And vice versa.
Information order of Integer: Smallest partial order that satisfies ⊥ ⊑ everyone.
Information order of an algebraic data type: Smallest partial order that satisfies:
⊥ ⊑ everyone
for a data constructor C with n fields: “pointwise order”: if x1 ⊑ y1 and … and xn ⊑ yn, then C x1 … xn ⊑ C y1 … yn
(this is needed for infinite lists/trees) if every finite approximation of x is ⊑ y, then x ⊑ y
Information order of a function type X→Y: “pointwise order”: f ⊑ g iff ∀x. f x ⊑ g x.
Theorem: For every recursive program foo,
foo₀ ⊑ foo₁ ⊑ foo₂ ⊑ …
So the successive approximation method gives compatible improvements.
Definition: An ascending sequence is a sequence foo₀, foo₁, foo₂,… that also satisfies
foo₀ ⊑ foo₁ ⊑ foo₂ ⊑ …
Each set is required to have enough additional elements so that every ascending sequence has a limit. Precisely,
If foo₀ ⊑ foo₁ ⊑ foo₂ ⊑ …, then the sequence has a least upper bound.
(It is easy to prove that least upper bounds are unique.)
Notation (with dummy variable k): limₖ fooₖ
This is why [Bool] needs infinite lists as additional elements, for example the sequence for zs needs one. Similarly for other recursive algebraic data types.
The set for a function type X -> Y does not actually contain all functions from (the set for X) to (the set for Y) allowed by set theory. We restrict it to continuous (limit-preserving) functions for the theory here to work.
h is continuous iff: for every ascending sequence s, h (limₖ sₖ) = limₖ (h sₖ)
Theorem: All programmable functions are continuous.
Theorem: All continuous functions are order-preserving (if x ⊑ y, then f x ⊑ f y).
Definition: x is a fixed point of function h iff x = h x.
I.e., a solution of the equation x = h x.
There may be multiple solutions (or none), but we are interested in:
Definition: x is a least fixed point of h iff: x = h x, and x ⊑ every fixed point of h.
(It is easy to prove that least fixed points are unique.)
Example: x = 0 * x has two solutions, ⊥ and 0, but ⊥ is the least because ⊥ ⊑ 0.
Relevance: Recursive programs are fixed points. Examples:
-- zs = 0 : zs
h0 r = 0 : r
zs = h0 zs
-- ns = 0 : map (\x -> x + 1) ns
h1 r = 0 : map (\x -> x + 1) r
ns = h1 ns
-- f = \x -> if x==0 then 0 else 1 + f (x - 2)
hf r = \x -> if x==0 then 0 else 1 + r (x - 2)
f = hf f
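These fixed-point equations can also be written with Haskell’s own least-fixed-point combinator, fix from Data.Function (defined as fix h = let x = h x in x), which computes exactly the limit of the approximation sequence:

```haskell
import Data.Function (fix)

-- The h-functions from above; fix gives back the recursive values.
h0 :: [Integer] -> [Integer]
h0 r = 0 : r

h1 :: [Integer] -> [Integer]
h1 r = 0 : map (\v -> v + 1) r

zs, ns :: [Integer]
zs = fix h0  -- least fixed point of h0
ns = fix h1  -- least fixed point of h1
```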
Moreover, the computer can compute at most least fixed points; this is why x = 0 * x can only give you ⊥.
The following theorem backs the successive approximation method.
Theorem: If h is continuous, then limₖ sₖ is a least fixed point, where
s₀   = ⊥
sₙ₊₁ = h sₙ
I used simpler (but possibly weaker) definitions than the standard ones. In the standard definitions:
Each set is required to have least upper bounds for not just ascending chains, but also non-empty totally-ordered subsets.
Accordingly, continuous functions preserve least upper bounds of non-empty totally-ordered subsets.
I still have to find out how much the difference matters.
The Haskell Wikibook has a chapter on denotational semantics.
Introduction to Lattices and Order, 2nd ed., by Davey and Priestley.