foldr and foldl

foldr

One way to sum or multiply a list. This code is not for real use (the standard library has better versions), but to make a bigger point.

mySumV1 [] = 0
mySumV1 (x:xt) = x + mySumV1 xt

myProductV1 [] = 1
myProductV1 (x:xt) = x * myProductV1 xt

in HaskellFold.hs

What if you want to use yet another binary operator? Do we keep writing the same code again? Surely there is something to refactor out. One more example (less obvious):

-- This function is also in the standard library, under the name "map".
-- We write it again to make a bigger point.
-- E.g., myMap f [1, 2, 3] = [f 1, f 2, f 3]
myMap f [] = []
myMap f (x:xt) = f x : myMap f xt

-- Or imagine g = myMap f:
-- myMap f = g
--   where
--     g [] = []
--     g (x:xt) = f x : g xt

in HaskellFold.hs

The common theme is a function that looks like

g [] = z
g (x:xt) = a function of x and g xt

g = mySumV1:
    z = 0
    function: x + g xt --- the function is (+)

g = myProductV1:
    z = 1
    function: x * g xt --- the function is (*)

g = myMap f:
    z = []
    function:   f x : g xt
              = (\x r -> f x : r) x (g xt)  --- the function is \x r -> f x : r

All these are refactored to:

myFoldr op z = g
  where
    g [] = z
    g (x:xt) = op x (g xt)

The standard library has foldr for this.

More commonly we expand g back to myFoldr op z:

myFoldr :: (a -> b -> b) -> b -> [a] -> b
myFoldr op z [] = z
myFoldr op z (x:xt) = op x (myFoldr op z xt)

-- So here are mySumV1, myProductV1, myMap re-expressed as foldr's:
mySumFoldr xs = foldr (+) 0 xs
myProductFoldr xs = foldr (*) 1 xs
myMapFoldr f xs = foldr (\x r -> f x : r) [] xs

in HaskellFold.hs

The above shows one way to detect that you can use foldr. Below is another way using pretend induction to solve for op and z and get a complete proof! From Mastering foldr, by ertes. (Unfortunately the author died May 12, 2018. I knew him from IRC.)

I'll show working on myMap. Prove

myMap f xs = foldr op z xs

by pretend induction and find op and z.

Base case: xs = []

LHS = myMap f [] = []      by myMap code
RHS = foldr op z [] = z    by foldr code

Define z = [], then
LHS = [] = RHS

Induction step: xs = x:xt
Induction hypothesis: myMap f xt = foldr op [] xt
WTP: myMap f (x:xt) = foldr op [] (x:xt)
```
  LHS
= f x : myMap f xt         by myMap code

  RHS
= op x (foldr op [] xt)    by foldr code
= op x (myMap f xt)        by IH
```
Need op to do: op x (myMap f xt) = f x : myMap f xt
IMPORTANT TRICK: Generalize from myMap f xt to arbitrary r. op's code can't tell anyway.
Need op to do: op x r = f x : r
Define op to do just that! Or use (\x r -> f x : r) directly.
Then LHS = RHS by definition of op.

Advice: DO NOT THINK. DO NOT USE INTUITION. NEVER WORKED FOR PAST STUDENTS.

Instead, look for the common theme or do the pretend induction.

foldl

Another version of summing a list. This version is closer to the loop-and-accumulator you normally write.

The trick is to conjure a helper function for the loop, and it takes an extra parameter for the accumulator.

mySumV2 xs = g 0 xs
  where
    -- If you want to use induction to prove g, use this specification:
    -- for all xs, for all a, g a xs = a + sum of xs

    g accum [] = accum
    
    -- Induction step: The list is x:xt
    -- Induction hypothesis: for all a, g a xt = a + sum of xt
    --
    -- How to compute accum + sum of (x:xt)?
    --
    --   accum + sum of (x:xt)
    -- = accum + x + sum of xt
    -- = (accum + x) + sum of xt
    -- = g (accum + x) xt               by IH
    g accum (x:xt) = g (accum + x) xt

in HaskellFold.hs

Reverse a list. The standard library already has it, but again I'm writing my own for a purpose.

First, this is too slow. Why? (How much time does ++ take?)

slowReverse [] = []
slowReverse (x:xs) = slowReverse xs ++ [x]

(Answer: ++ takes linear time, slowReverse takes quadratic time.)

The trick is to conjure a helper function for a loop, with a LIFO stack accumulator (and singly-linked list is great for stack).

myReverse xs = g [] xs
  where
    -- If you want to use induction, the specification is:
    -- for all xs, for all a, g a xs = (reversal of xs) ++ a

    g accum [] = accum

    -- Induction step: The list is x:xt
    -- IH: for all a: g a xt = (reversal of xt) ++ a
    --
    -- How to compute (reversal of (x:xt)) ++ accum ?
    --
    --   (reversal of (x:xs)) ++ accum
    -- = (reversal of xt) ++ [x] ++ accum
    -- = (reversal of xt) ++ ([x] ++ accum)
    -- = g ([x] ++ accum) xt                    by IH
    -- = g (x : accum) xt
    g accum (x:xt) = g (x : accum) xt

in HaskellFold.hs

Common theme: a function g that looks like

g accum [] = accum
g accum (x:xs) = g (a function of accum and x) xs

mySumV2:
    initial accum: 0
    function: (+)

myReverse:
    initial accum: []
    function: (\a x -> x : a) = flip (:)

This is abstracted into:

myFoldl op z xs = g z xs
  where
    g accum [] = accum
    g accum (x:xt) = g (op accum x) xt

The standard library has foldl for this.

More commonly we expand g back to foldl op:

myFoldl op z [] = z
myFoldl op z (x:xs) = myFoldl op (op z x) xs

mySumFoldl xs = myFoldl (+) 0 xs

myReverseFoldl xs = myFoldl (flip (:)) [] xs

in HaskellFold.hs

Can you use pretend induction to solve for op and z? I tried, didn't help.

What happens here is that the biggest hurdle is conjuring the helper function g that takes an extra accumulator parameter. (Or perhaps the biggest hurdle is just deciding that you want this helper function.) Once you have that, z and op are easy to identify.