Most languages use “call by value” for evaluation order: To evaluate “f(foo, bar)”, evaluate foo and bar first (which one comes first depends on the language), then plug them into f's body and evaluate the body.
Example: If there is a function defined as f(x, y) = x:
f (3+4, div(4, 2))    eval a parameter, arithmetic
→ f (7, div(4, 2))    eval the other parameter, arithmetic
→ f (7, 2)            ready to plug in at last
→ 7
A problematic parameter can cause an error/exception even if it would be unused:
f (3+4, div(1, 0)) eval a parameter, arithmetic
→ f (7, div(1, 0)) eval the other parameter, arithmetic
→ ☹
Let me mess with you first.
The standard library has a list function “take” that behaves like this:
take 3 [a,b,c,d,e] = [a,b,c]
take 3 [a,b] = [a,b]
In general, it returns the first n items of the input list, or as many as are available. The implementation goes like this:
take 0 _ = []
take _ [] = []
take n (x:xs) = x : take (n-1) xs
Do the following terminate?
doITerminate = take 2 (from 0)
  where
    from i = i : from (i + 1)

doIEvenMakeSense = take 2 zs
  where
    zs = 0 : zs
    -- At low level this is one node pointing back to itself. O(1)-space.
(in HaskellEval.hs)
Lazy evaluation in Haskell (sketch):
To evaluate “f foo bar”: plug into f's RHS first, then evaluate that.
If RHS refers to the same parameter multiple times: same shared copy, no duplication.
If that runs you into pattern matching: evaluate the parameter(s) just enough to decide whether it's a match or non-match. If match, plug into the RHS and evaluate. If non-match, try the next pattern. (If we run out of patterns, declare “undefined”, aka “error”.)
To evaluate arithmetic operations, e.g., “foo + bar”: call-by-value.
Very drastic example: The standard library has
const x y = x
Evaluation of const (3+4) (div 1 0):
const (3+4) (div 1 0)    plug in
→ 3+4                    arithmetic
→ 7
(No error about dividing by zero.)
Evaluation of doITerminate (until the answer is fully printed, e.g., imagine the REPL needs to print the full result):
take 2 (from 0)                    eval "from 0" for take's pattern matching
→ take 2 (0 : from (0 + 1))        match, plug in
→ 0 : take (2-1) (from (0 + 1))    eval "2-1" for take's pattern matching
→ 0 : take 1 (from (0 + 1))        eval "from (0 + 1)" for take's pattern matching
→ 0 : take 1 (i : from (i + 1))    match, plug in with i = 0 + 1
Note: My “with i = 0 + 1” means one shared copy of “0 + 1” referred to twice. You could also draw a graph: two pointers pointing to the same copy.
→ 0 : i : take (1-1) (from (i + 1))    eval 2nd element i for printing with i = 0 + 1
→ 0 : i : take (1-1) (from (i + 1))    with i = 1
Note: “i = 1” is so simple, it's OK to get rid of it and simply write
0 : 1 : take (1-1) (from (1 + 1))
It's up to you.
eval "1-1" for pattern matching → 0 : i : take 0 (from (i + 1)) match, plug in with i = 1 → 0 : 1 : [] (may as well inline i if it appears only once)
This laziness is not just for the built-in list type. It holds for all functions and algebraic data types. For example, I could have done the same examples with my own list type defined last time:
data MyList a = Nil | Cons a (MyList a) deriving (Eq, Show)
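To illustrate, here is a minimal sketch of the same trick with MyList; myTake, myFrom, and doITerminateToo are names I'm making up here (they are not in HaskellEval.hs), mirroring take and from above:

-- Hypothetical analogues of take and from for MyList.
myTake :: Int -> MyList a -> MyList a
myTake 0 _           = Nil
myTake _ Nil         = Nil
myTake n (Cons x xs) = Cons x (myTake (n-1) xs)

myFrom :: Int -> MyList Int
myFrom i = Cons i (myFrom (i + 1))

-- Terminates for the same reason doITerminate does:
-- pattern matching only forces the outermost Cons.
doITerminateToo :: MyList Int
doITerminateToo = myTake 2 (myFrom 0)   -- Cons 0 (Cons 1 Nil)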
Newton's method with lazy lists. Like in Hughes's «why FP matters». Approximately solve x^3 - b = 0, i.e., the cube root of b.
So f(x) = x^3 - b, f'(x) = 3x^2.
x1 = x - f(x)/f'(x) = x - (x^3 - b)/(3x^2) = x - (x - b/x^2)/3 = (2x + b/x^2)/3
The local function “next” below is responsible for computing x1 from x.
cubeRoot b = within 0.001 (iterate next b)
  -- From the standard library:
  -- iterate f z = z : iterate f (f z)
  --             = [z, f z, f (f z), f (f (f z)), ...]
  where
    next x = (2*x + b/x^2) / 3
    within eps (x : x1 : rest)
      | abs (x - x1) <= eps = x1
      | otherwise           = within eps (x1 : rest)
(in HaskellEval.hs)
Equivalently, using the function composition operator (.)
cubeRoot = within 0.001 . iterate next
With this, you really have a pipeline like Unix pipelines.
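A quick sanity check in GHCi (digits elided because the exact output depends on floating-point rounding; any answer within the 0.001 tolerance of the true root is what we asked for):

ghci> cubeRoot 27
3.000000...
ghci> cubeRoot 2
1.259921...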
A singly-linked list is a very space-consuming data structure (in all languages). And if you ask for “the ith item”, you're doing it wrong.
However, if you use list lazily in Haskell, it is an excellent control structure—a better for-loop than for-loops. Then list-processing functions become pipeline stages. If you do it carefully, it is even O(1)-space. If furthermore you're lucky (if the compiler can optimize your code), it can even fit entirely in registers without node allocation and GC overhead.
Thinking in high-level pipeline stages is both more sane and more efficient—with the right languages.
Some very notable list functions when you use lists lazily as for-loops, or when you think in terms of pipeline stages:
Producers: repeat, cycle, replicate, iterate, unfoldr, the [x..], [x..y] notation (backed by enumFrom, enumFromTo)
Transducers: map, filter, scanl, scanr (foldr too, sometimes), take, drop, splitAt, takeWhile, dropWhile, span, break, partition, zip, zipWith, unzip
Consumers: foldr, foldl, foldl', length, sum, product, maximum, minimum, and, all, or, any
Sometimes you have to custom-make your own, of course.
And don't forget that list comprehension helps a lot too.
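For example, here is one such pipeline in that style (sumOddSquares is a name I'm making up for illustration): a producer [1..], a few transducers, and a consumer sum.

-- Sum of the squares of the first 100 odd numbers,
-- written as producer -> transducers -> consumer.
sumOddSquares :: Integer
sumOddSquares = sum            -- consumer
              . take 100       -- transducer
              . map (^2)       -- transducer
              . filter odd     -- transducer
              $ [1..]          -- producer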
Define mySumV2 to sum a list using a helper function as a while-loop and an accumulator parameter:
mySumV2 xs = g 0 xs
  where
    g accum [] = accum
    g accum (x:xs) = g (accum + x) xs
(in HaskellEval.hs)
Evaluation of mySumV2 [1,2,3] (similarly foldl (+) 0 [1,2,3]):
mySumV2 (1 : 2 : 3 : [])       plug in
→ g 0 (1 : 2 : 3 : [])         match, plug in
→ g (0 + 1) (2 : 3 : [])       match, plug in
→ g ((0 + 1) + 2) (3 : [])     match, plug in
→ g (((0 + 1) + 2) + 3) []     match, plug in
→ ((0 + 1) + 2) + 3            arithmetic at last
→ (1 + 2) + 3                  ditto
→ 3 + 3                        ditto
→ 6
This takes Ω(n) space for the postponed arithmetic.
Exercise: Go through the evaluation of mySumV1 [1,2,3] (defined below) or foldr (+) 0 [1,2,3] to see how it also takes Ω(n) space, though through a different journey.
mySumV1 [] = 0
mySumV1 (x:xs) = x + mySumV1 xs
(in HaskellEval.hs)
Summing a list does not need lazy evaluation for the accumulator. There is “seq” for killing lazy evaluation where you deem it unsuitable.
To evaluate “seq x y”: evaluate x to “weak head normal form”, then continue with y.
Weak head normal form (WHNF) means:
for built-in number types: until you have the number
seq (4 + 5) foo → seq 9 foo → foo → ... steps for foo
for algebraic data types: until you have the data constructor
seq (take 5 (div 1 0 : bar)) foo → seq (div 1 0 : take (5-1) bar) foo → foo → ... steps for foo
for functions (we don't actually run into this case in this course): until you have a lambda
seq (if 5>4 then (\x -> x+1) else (\x -> x-1)) foo → seq (if True then (\x -> x+1) else (\x -> x-1)) foo → seq (\x -> x+1) foo → foo → ... steps for foo
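One more way to see “only up to the outermost constructor” in action. This is a hypothetical GHCi check of my own, not from the notes (the exact exception message varies by GHC version):

ghci> seq (Just (error "boom")) "ok"
"ok"
ghci> seq (error "boom") "ok"
*** Exception: boom

Just (error "boom") is already in WHNF (the constructor Just is there), so seq never touches the error inside; a bare error "boom" has no constructor to stop at, so forcing it to WHNF raises the exception.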
Naturally, “seq x y” is most meaningful when something is shared between x and y, e.g., x is part of y, and you request not to procrastinate evaluating that part.
Here is a summing algorithm using seq to kill accumulator laziness:
mySumV3 xs = g 0 xs
  where
    g accum [] = accum
    g accum (x:xs) = seq accum (g (accum + x) xs)
    -- Note: accum is something that "g (accum + x) xs" will need.
(in HaskellEval.hs)
Sample evaluation showing much reduced build-up:
g 0 (1 : 2 : 3 : [])
→ seq 0 (g (0 + 1) (2 : 3 : []))
→ g (0 + 1) (2 : 3 : [])
→ seq a (g (a + 2) (3 : []))    with a = 0 + 1
→ seq a (g (a + 2) (3 : []))    with a = 1
→ g (1 + 2) (3 : [])
→ seq b (g (b + 3) [])          with b = 1 + 2
→ seq b (g (b + 3) [])          with b = 3
→ g (3 + 3) []
→ 3 + 3
→ 6
Data.List has foldl' for this as well:
foldl' f z [] = z
foldl' f z (x:xs) = seq z (foldl' f (f z x) xs)
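So, assuming you import it, the strict sum can be written directly in terms of foldl' (mySumV3' is a name I'm making up here):

import Data.List (foldl')

mySumV3' :: Num a => [a] -> a
mySumV3' = foldl' (+) 0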
It is true that the above still postpones arithmetic by one iteration and can be further hastened:
mySumV4 xs = g 0 xs
  where
    g accum [] = accum
    g accum (x:xs) = let a1 = accum + x in seq a1 (g a1 xs)
    -- Note: a1 = accum + x is something that "g a1 xs" will need.
(in HaskellEval.hs)
Sample evaluation showing that “accum + x” gets evaluated promptly:
g 0 (1 : 2 : 3 : [])
→ seq b (g b (2 : 3 : []))    with b = 0 + 1
→ seq b (g b (2 : 3 : []))    with b = 1
→ g 1 (2 : 3 : [])
→ seq c (g c (3 : []))        with c = 1 + 2
→ seq c (g c (3 : []))        with c = 3
→ g 3 (3 : [])
→ seq d (g d [])              with d = 3 + 3
→ seq d (g d [])              with d = 6
→ g 6 []
→ 6