Haskell Basics

Frame of mind

It is best to think of Haskell functions as math functions, not as C or Python functions; likewise, Haskell variables as math variables (placeholders, unknowns, constants), not as C or Python variables (mutable state).

Because of this, you will hear me say “f(4) equals/evaluates to 10” not “returns 10”, “codomain” not “return type”, “apply f to 4” (plugging in) not “call f”. Both angles are correct, you don't have to speak like me, but the algebra angle is better for beginners than the control-flow angle.

Defining values and functions

favouriteNumber :: Integer   -- Type signature. If omitted, the most general
                             -- type is inferred.
favouriteNumber = 4

isMultipleOf3 :: Integer -> Bool  -- Function type, "domain -> codomain"
isMultipleOf3 i = mod i 3 == 0

in HaskellBasics.hs

Most types can be omitted, in which case the computer infers the most general types.

It is still a good idea to write explicit types for important values and functions as a courtesy to your human readers. (It is OK to copy from the inferred type.)

Functions of multiple parameters

This example takes two integer parameters and gives an integer answer:

diffSq :: Integer -> Integer -> Integer
diffSq x y = (x - y) * (x + y)

in HaskellBasics.hs

How to use: diffSq 2 1

There is a bit more theory on types like X -> Y -> Z in a later lecture.

Local definition constructs

Two ways for local definitions:

diffSqV3a, diffSqV3b :: Integer -> Integer -> Integer

diffSqV3a x y =
    let minus = x - y
        plus = x + y
    in minus * plus

diffSqV3b x y = minus * plus
  where
    minus = x - y
    plus = x + y

in HaskellBasics.hs

The difference:

“let-in” is an expression, can be used wherever an expression is expected, e.g, 2 * (let x=5 in x+1)
“where” is not part of an expression, e.g., “2 * (x+1 where x=5)” is illegal. It is part of a definition, e.g., “y = ... where x=5” is legal. There is another place you can use “where”, shown in slowFib2 in the next section.

Basic example of branching and recursion

Haskell conditional branching includes if-then-else, and additionally pattern matching and guards (nicer syntax for if-then-elseif-etc).

-- if-then-else.
slowFactorial2 :: Integer -> Integer
slowFactorial2 n = if n == 0
                   then 1
                   else n * slowFactorial2 (n - 1)

-- Pattern matching.
slowFactorial :: Integer -> Integer
slowFactorial 0 = 1
slowFactorial n = n * slowFactorial (n - 1)

-- Slow Fibonacci (yawn) to show you can have more pattern matching.
-- "This Fibonacci joke is as bad as the last two you heard combined."
-- https://twitter.com/sigfpe/status/776420034419658752
slowFib :: Integer -> Integer
slowFib 0 = 0
slowFib 1 = 1
slowFib n = slowFib (n-1) + slowFib (n-2)

-- Long form pattern matching---using a "case-of" expression:
slowFib2 :: Integer -> Integer
slowFib2 x = case x of
    0 -> 0
    1 -> 1
    n -> slowFib2 n1 + slowFib2 n2
      -- The other place you can use "where". This one is part of "n -> ...".
      where
        n1 = n - 1
        n2 = n - 2

-- Using guards.
slowFib3 :: Integer -> Integer
slowFib3 n | n == 0    = 0
           | n == 1    = 1
           | otherwise = slowFib3 n1 + slowFib3 n2
  -- This "where" is part of the whole "slowFib3 n ...".
  where
    n1 = n - 1
    n2 = n - 2

in HaskellBasics.hs

Vocabulary: type, term, value

You will see something like this all the time:

f (x*2 + 1) :: Integer

This is notation for “the type of f (x*2 + 1) is Integer”. It is used in both actual code and my explanation prose.

Those two things are called:

f (x*2 + 1) :: Integer
^^^^^^^^^^^    ^^^^^^^
   term         type

One more example:

h . g :: Char -> Bool
^^^^^    ^^^^^^^^^^^^
term         type

“term” is also widely known as “expression”. I like “term” because it is shorter, and because “type” is also called “type expression”.

What about “value”? Answer: 5+4 is a term; the result of evaluating it, 9, is a value. But I am lax about this.

Lists (singly-linked)

Crash course on Haskell list syntax:

Types: [Integer], [Bool], etc.
Empty list: []
Lists of known items: [4, 1, 6], syntax sugar for 4 : (1 : (6 : []))

Those parentheses are optional, like this: 4 : 1 : 6 : []
The above syntax can be used in pattern matching.

Formally (recursive definition as in CSCB36): Consider [Integer] for example: the finite lists of Integers is the smallest set such that:

[] is a finite list of Integers.
If x is an Integer and xs is a finite list of Integers, then x : xs is a finite list of Integers.
List node, “cons cell” (from Lisp)

These are singly-linked lists. To reach the nth item (or even just the nth node), you need to take Θ(n) time to go through all the nodes from the beginning. These are not arrays.

Exercise: How much time does it take to get the length of a list? Why is length xs == 0 stupid?

([Integer] also has infinite lists. We will ignore them until a future lecture.)

Example: Concatenate two lists; or rather, produce a new list equal to the concatenation of the two input lists. (The standard library already has operator ++ for it, but let’s write our own version.)

append :: [Integer] -> [Integer] -> [Integer]
append [] ys = ys
append (x:xt) ys = x : append xt ys

in HaskellBasics.hs

If you don’t understand why it is correct, there is a tracing example below, and a general explanation and proof after. For now, it is an example of the list syntax.

Evaluation

Evaluation: How a computer or an enslaved student runs code.

It can be as simple and succint as plug-and-chug, like in math, because variables can’t be modified:

  slowFactorial 3
→ 3 * slowFactorial (3 - 1)
→ 3 * slowFactorial 2
→ 3 * (2 * slowFactorial (2 - 1))
→ 3 * (2 * slowFactorial 1)
→ 3 * (2 * (1 * slowFactorial (1 - 1)))
→ 3 * (2 * (1 * slowFactorial 0))
→ 3 * (2 * (1 * 1))
→ 3 * (2 * 1)
→ 3 * 2
→ 6

  append (1:2:[]) (3:[])
→ 1 : append (2:[]) (3:[])
→ 1 : 2 : append [] (3:[])
→ 1 : 2 : 3 : []

It is OK if you write “=” instead of “→”. I write “→” just to emphasize that the computer does only one direction.

Knowing how to run code has never helped you create code. Next I will show you how to create code.

Synthesis

Synthesis: How to create code. The hard part is recursion. I show you that it is easier if you don’t try to run it.

Pretend induction = Use induction to prove something that still contains unknowns, ah but during the proof you find out how to solve for the unknowns!

In my case, I’m proving “there is a recursive function f that does …” without knowing f upfront; during the proof, I discover what f should be.

Example: list concatenation

How I coded up append:

WTP: For all finite lists of Integers xs and ys: append xs ys = concatenation of xs and ys.

By structural induction on xs:

Base case:

WTP: For all ys: append [] ys = concatenation of [] and ys

Note that the answer is ys, so if I code up
```
append [] ys = ys
```
then the WTP holds.
Induction step:

Let Integer x, Integer list xt be arbitrary.
Induction hypothesis: For all ys: append xt ys = concatenation of xt and ys
WTP: For all ys: append (x:xt) ys = concatenation of x:xt and ys

(Idea and visualization before the general proof: Let’s say
x = 1
xt = 3:5:8:[]
ys = 4:1:6:[]
IH promises: append (3:5:8:[]) (4:1:6:[]) = 3:5:8:4:1:6:[]
WTP asks: append (1:3:5:8:[]) (4:1:6:[]) = 1:3:5:8:4:1:6:[]
That looks like “x : append xt ys”.)

General proof: Since
```
  concatenation of x:xt and ys
= x : concatenation of xt and ys    considering 1st item and the rest
= x : append xt ys                  by I.H.
```
so if I code up
```
append (x:xt) ys = x : append xt ys
```
then the WTP holds.

Comments:

The proof structure guides the code structure.
I refuse to imagine “what if I unfold append xt ys by hand?”. The I.H. already tells me the answer so I just use it. This helps me focus.
The catch: I need to make up my mind and carefully write down the specification—the WTP, what answer my function should give. I need the I.H. to be clear, not muddy.
You get both code and correctness proof. BOGO deal of the day! Why don't people do this more?!

Example: list reversal

Sometimes the recursion happens in a helper function because you need to introduce an accumulator parameter. A famous example is reversing a list. (The standard library has “reverse”, but let's learn how to do it.)

Note that although the following direct recursion is correct, it takes quadratic time. (Reminder: append xs ys takes time Θ(length of xs).)

rev [] = []
rev (x:xt) = append (rev xt) [x]

Here is a linear-time solution. It needs the help of an accumulator.

rev :: [Integer] -> [Integer]
rev xs = revhelper xs []

-- WTP: ∀xs, acc: revhelper xs acc = concatenation of xs reversed and acc
-- (The explicit "∀acc" is important for this problem.)
-- Example: revhelper (4:1:6:[]) (2:5:[]) = 6:1:4:2:5:[]
-- Use induction on xs.

-- Base case: xs is empty, answer is acc.
revhelper [] acc = acc

-- Induction step: Let x, xt be arbitrary.
-- IH: ∀acc: revhelper xt acc = concatenation of xt reversed and acc
-- Useful to note: IH holds for any acc you want, not just the "original" acc.
-- WTP: ∀acc: revhelper (x:xt) acc = concatenation of (x:xt) reversed and acc
--
-- Eureka:
--   concatenation of (x:xt) reversed and acc
-- = concatenation of xt reversed and (x : acc)
-- = revhelper xt (x : acc)
revhelper (x:xt) acc = revhelper xt (x : acc)

in HaskellBasics.hs