Haskell Types Part 1

Nested lambdas (“currying”) and nested function applications

lambda = anonymous function = how to write a function without giving it a name

Python:	`lambda x : x+1`
Javascript:	`function (x) { return x+1; }` `x => x+1`
Haskell:	`\x -> x+1`

\ was chosen because it looks like the Greek letter lambda: λ.

Next, if you intend 2 parameters, the Haskell culture is to implement it as a nested function:

\x -> (\y -> 2*x - 3*y)
-- those parentheses can be omitted

So, a function that maps the 1st parameter to a function that takes the 2nd parameter. This convention is known as “currying”.

Shorthand: \x y -> 2*x - 3*y

Recall from earlier

diffSq x y = (x - y) * (x + y)

This can be written as

diffSq = \x y -> (x - y) * (x + y)

or even

diffSq x = \y -> (x - y) * (x + y)

What about applying a function to 2 parameters:

diffSq 10 5

is shorthand for

(diffSq 10) 5

It is possible to use “diffSq 10” alone (“partial application”). Here is its meaning, by doing algebra:

  diffSq 10
= (\x -> \y -> (x - y) * (x + y)) 10
= \y -> (10 - y) * (10 + y)

Type-wise:

X -> Y -> A

is shorthand for

X -> (Y -> A)

Recall “function that maps the 1st parameter to function that takes the 2nd parameter”.

Haskell has tuple types. It is possible to do:

foo :: (Integer,Integer) -> Integer
foo (x,y) = (x - y) * (x + y)

But the Haskell community prefers currying.

Higher-order functions

And then it's natural ask about (X->Y)->A, i.e., what if a parameter is a function. Yes you can do that, and we say “higher-order functions”. Toy example: The following atFourPlusAtSeven function is a higher-order function:

atFourPlusAtSeven :: (Integer -> Integer) -> Integer
atFourPlusAtSeven f = f 4 + f 7

in HaskellTypes1.hs

Try using it like these:

atFourPlusAtSeven (\x -> x*2)
atFourPlusAtSeven (diffSq 10)    -- "diffSq 10" example of partial application
atFourPlusAtSeven (\x -> diffSq x 2)

Parametric polymorphism

I wrote a function to concatenate two integer lists:

appendIntegers :: [Integer] -> [Integer] -> [Integer]
appendIntegers [] ys = ys
appendIntegers (x:xt) ys = x : appendIntegers xt ys

in HaskellTypes1.hs

I have unnecessarily pinned the type to integer lists; it can't be used for boolean lists or others. But surely it shouldn't matter?

append :: [a] -> [a] -> [a]
append [] ys = ys
append (x:xt) ys = x : append xt ys

in HaskellTypes1.hs

Now it can be used for all kinds of lists (but the two input lists must have the same element type.)

Similar to {Java, Scala, Rust,...} generics and some C++ templates.

In “[a] -> [a] -> [a]”, the “a” there is a “type variable” or “type parameter”. Names of type variables are up to you, doesn't have to be “a”, but does have to start with lower case, e.g., element, myElementType.

(Type constants (names of built-in types and defined types) start with upper case, e.g., Integer, Bool, String.)

The type signature

append :: [a] -> [a] -> [a]

is better understood as having a universal quantifier:

append :: ∀a. [a] -> [a] -> [a]

This means the user (receiver) chooses which type to use for “a”, e.g.,

append [False] [True]: I'm choosing a=Bool
append [1] [2]: I'm choosing a=Integer

We say that append is “polymorphic”—can become one of many types: [Integer]->[Integer]->[Integer], [Bool]->[Bool]->[Bool], etc. appendIntegers is “monomorphic”—stuck with being one single type.

Important: The choice is up to the user, therefore not up to the provider. The provider cannot write:

appendWrong :: [a] -> [a] -> [a]
appendWrong [] ys = False : ys
appendWrong (x:xt) ys = True : x : appendWrong xt ys

because it is not up to the provider to choose Bool for “a”.

Rant: Rookies (of explaining things) try to explain by repeating “a can be any type” many, many times. This completely misses the point. No one ever doubts what a can be; the real question is who controls it. “Any” fails to explain why appendWrong is wrong (in fact can give the opposite impression); “up to the user, not up to the provider” is the explanation.

Flash quiz:

xyx :: a -> a -> [a]
xyx x y = [x, y, x]

in HaskellTypes1.hs

Is “xyx 4 False” legal?

“Parametric polymorphism”—the “parametric” part means: the provider is not told what the user chooses. Consequences:

Inflexible: Cannot vary behaviour by type, e.g., (pseudocode shown):

facetious x =
  if x :: Integer, then
    if x>5, the answer is [x-1, x+1, x^2, x*x]
    else the answer is [x, x+3]
  else if x :: Bool, then
    the answer is [x, not x]
  else
    the answer is [x, x]

Exercise: Can you do this in Java generics?

Easy to test: Test at just one type and one value, and you can conclude for all types and all values. Because the implementer is forced to do a uniform thing. In other words…
Try to code up something of this type:
```
foo :: a -> [a]
```
Try to be crazy (but still legal code) so I can't guess your code.
But I just need you to answer one question: What do I get if I test foo False? Then I know what will happen to foo 48, foo "hello", etc.
This simplifies testing greatly. You will notice that Python libraries need much more test cases, Haskell libraries need much fewer (for polymorphic functions).

(Two sides of the same coin. Generally, flexibility for the implementer is in direct conflict with predictability for the user. Programming is a dialectic class struggle between the user and the implementer.)

Curried, higher-order, polymorphic example: `map`

map is a library function that uses all of the above! Its type, written in two ways:

map :: (a -> b) -> [a] -> [b]
-- or
map :: (a -> b) -> ([a] -> [b])

What it does by example:

  map ord ['a', 'b', 'c']
= [ord 'a', ord 'b', ord 'c']
= [97, 98, 99]

ord :: Char -> Int is from Data.Char (import Data.Char), gives the ASCII code of a character. So I'm choosing a=Char, b=Int. I get a list of Int's.

But let's up the game!

map (map ord) [['a', 'b', 'c'], ['x', 'y', 'z']] :: [[Int]]

Detailed type breakdown:

map  (map  ord)
           ^^^Char->Int
     ^^^^^^^^^^[Char] -> [Int]
^^^^^^^^^^^^^^^[[Char]] -> [[Int]]

inner map :: (Char -> Int) -> ([Char] -> [Int])
(I'm choosing a=Char, b=Int)
map ord :: [Char] -> [Int]
outer map :: ([Char] -> [Int]) -> [[Char]] -> [[Int]]
(I'm choosing a=[Char], b=[Int])
map (map ord) :: [[Char]] -> [[Int]]

What it does:

  map (map ord) [['a', 'b', 'c'], ['x', 'y', 'z']]
= [map ord ['a', 'b', 'c'], map ord ['x', 'y', 'z']]
= [[ord 'a', ord 'b', ord 'c'], [ord 'x', ord 'y', ord 'z']]
= [[97, 98, 99], [120, 121, 122]]

Type-specific behaviour preview

A sorting algorithm can make sense for many element types. You don't want to restrict it to a monomorphic type. But this polymorphic type has a problem too:

insertionSort :: [a] -> [a]

because that type means not knowing the user-chosen type or even just how to compare. Note that comparison is a type-specific behaviour, though in a benign way, not the facetious way.

For a single function like sorting, a simple low-tech solution is to add an extra parameter for the comparator, and require the user to provide it. (In OOP languages, require an object that contains the comparison method.)

insertionSort :: (a -> a -> Bool) -> [a] -> [a]
insertionSort _ [] = []
insertionSort leq (x:xt) = insert x (insertionSort leq xt)
  where
    -- insert e xs = assuming xs is already sorted, insert e at "the right place"
    insert e [] = [e]
    insert e xs@(x:xt)
        | leq e x = e : xs
        | otherwise = x : insert e xt

exampleInsertionSort = insertionSort (<=) [3,1,4]

in HaskellTypes1.hs

There is a hi-tech solution in a later lecture. Teaser preview: It's called “type classes”, and it answers two questions:

How does Haskell support operator overloading? E.g., <= works for both Integer and String.
How to do, for example, a suite of polymorphic binary search tree operations, and ensure that insert, lookup, and delete use the same comparator consistently?

The hi-tech way leads to types like this:

insertionSort :: Ord a => [a] -> [a]

For now “Ord a =>” means the function can use comparisons like <= on values of type a.

User-defined types: Algebraic data types

Similar to Rust's enums and Scala's enums or case classes of sealed traits.

Disjoint union of cartesian products

First cut: 1 or more cases, each case 0 or more fields.

Imagine a feudal country Tetrastrata with four kinds of persons:

The monarch. 'Nuf said (has no field).
Lords. Each specified by 3 fields:
- title (string), e.g., "Duke", "Earl"
- territory (string), e.g., "Sussex"
- succession number (integer), e.g., the 9th Duke of Sussex has 9 here
Knights. Each specified by 1 field: name (a string)
Peasants. Each specified by 1 field: name (a string)

They can be represented as:

data Tetrastratan
    = Monarch
    | Lord String String Integer
    | Knight String
    | Peasant String
    deriving (Eq, Show)

in HaskellTypes1.hs

The type name is Tetrastratan.

Monarch, Lord, Knight, Peasant are called “data constructors”. If you think “tags”, you're right, I think that too.

All 5 names are your choice. Only requirement: capitalized.

Values of this type fall into 4 cases:

Monarch
Lord d t i, if d::String, t::String, i::Integer
Knight n, if n::String
Peasant n, if n::String

Warning: Not OOP constructors. It's only labelling, not arbitrary initialization code.

Warning: Not OOP subclasses/subtypes either. Monarch is not a subtype, it's a term and value. This part is different from Scala's case classes/objects.

Mathematically, Tetrastratan is a tagged disjoint union:
(singleton set) ⊎ String×String×Integer ⊎ String ⊎ String
analogous to a sum of products — polynomial — “algebraic”.

Data constructors of 1 or more fields can be used as functions. When you do it, their function types are:

Lord :: String -> String -> Integer -> Tetrastratan
Knight :: String -> Tetrastratan
Peasant :: String -> Tetrastratan

Monarch is not considered a function, but it is a constant, and its type is:

Monarch :: Tetrastratan

Examples of using them to make values:

theOneAndOnly,  edmondDantès :: Tetrastratan
theOneAndOnly = Monarch
edmondDantès = Lord "Count" "Monte Cristo" 1

mkDuke :: String -> Integer -> Tetrastratan
mkDuke terr n = Lord "Duke" terr n
-- Can also be cute, partial application:
-- mkDuke = Lord "Duke"

in HaskellTypes1.hs

Data constructors can be used in patterns. Pattern matching is the fundamental and preferred way of dissecting values of an algebraic data type. (Preferred over a lot of predicates “is it Lord?” and getters “get title” [reason].) Examples of functions doing this:

-- Pattern-matching example: How to address a Tetrastratan:
addressTetra :: Tetrastratan -> String
addressTetra Monarch = "H.M. The Monarch"
addressTetra (Lord d t i) = "The " ++ show i ++ "th " ++ d ++ " of " ++ t
addressTetra (Knight n) = "Sir/Dame " ++ n
addressTetra (Peasant n) = n

-- Pattern-matching example: Social order:
superior :: Tetrastratan -> Tetrastratan -> Bool
superior Monarch (Lord _ _ _) = True
superior Monarch (Knight _) = True
superior Monarch (Peasant _) = True
superior (Lord _ _ _) (Knight _) = True
superior (Lord _ _ _) (Peasant _) = True
superior (Knight _) (Peasant _) = True
superior _ _ = False

in HaskellTypes1.hs

Exercise: String for lord titles is too broad. Replace by an enumeration type (an algebraic data type in which data constructors have no fields). Everything is not a string.

Design guide: Design your data type to avoid invalid data in the first place. This reduces bugs. Not always possible, and when possible not always worthwhile, but you strive to. Disallow invalid data unless you can justify why it is not worthwhile. Read this article from Scala practitioners.

Some standard library types are algebraic data types; you could have defined them yourself, e.g., Bool:

data Bool = False | True

Recursive types

Second cut: Recursion (self, mutual) is supported.

Example: [singly linked] list of integers. B36-style recursive definition: A value of type MyIntegerList is one of:

INil
ICons x xs, if x::Integer and xs::MyIntegerList

data MyIntegerList = INil | ICons Integer MyIntegerList
    deriving (Show, Eq)

exampleMyIntegerList = ICons 4 (ICons (-10) INil)

-- `from0to n` builds a MyIntegerList from 0 to n-1
from0to :: Integer -> MyIntegerList
from0to n = make 0
  where
    make i | i >= n = INil
           | otherwise = ICons i (make (i+1))

myISum :: MyIntegerList -> Integer
myISum INil = 0
myISum (ICons x xs) = x + myISum xs

in HaskellTypes1.hs

Example: binary [search] tree of integers. B36-style recursive definition: A value of type IntegerBST is one of:

IEmpty
INode lt x rt, if lt::IntegerBST, x::Integer, rt::IntegerBST

data IntegerBST = IEmpty | INode IntegerBST Integer IntegerBST
    deriving Show

exampleIntegerBST = INode (INode IEmpty 3 IEmpty) 7 (INode IEmpty 10 IEmpty)

in HaskellTypes1.hs

BST insert. Since this is functional programming with immutable trees, “insert” means produce a new tree that is like the input tree but with the new key. May be better to say “the tree plus k”.

ibstInsert :: Integer -> IntegerBST -> IntegerBST
ibstInsert k IEmpty =
    -- Base case: empty tree plus k = singleton tree with k
    INode IEmpty k IEmpty

ibstInsert k inp@(INode left key right)
    -- Induction step: The input tree has the form INode left key right.
    -- Induction hypothesis (strong induction):
    --   If t is a smaller tree than the input tree, e.g., t=left or t=right,
    --   then ibstInsert k t computes t plus k.
    -- Can you use this to help compute inp plus k?

    -- If k<key, the correct answer is a node consisting of:
    -- new left child = left plus k = ibstInsert k left (by IH)
    -- new key = key
    -- new right child = right

    | k < key = INode (ibstInsert k left) key right

    -- If k>key, mirror image of the above.

    | k > key = INode left key (ibstInsert k right)

    -- If k==key, nothing new to compute, the correct answer is the input tree.
    | otherwise = inp   -- INode left key right

in HaskellTypes1.hs

Polymorphic types

Third cut: Parametric polymorphism (“generics” in Java, Rust, etc.) is supported.

Example: My list type of user-chosen element type. (Just for teaching purpose. You already know about the built-in list type.)

data MyList a = Nil | Cons a (MyList a) deriving (Eq, Show)

exampleMyListI :: MyList Integer
exampleMyListI = Cons 4 (Cons (-10) Nil)

exampleMyListS :: MyList String
exampleMyListS = Cons "albert" (Cons "bart" Nil)

myRev :: MyList a -> MyList a
myRev xs = myRevHelper xs Nil

myRevHelper Nil acc = acc
myRevHelper (Cons x xt) acc = myRevHelper xt (Cons x acc)

in HaskellTypes1.hs

The built-in list type is equivalent: [] is like Nil, x : xs is like Cons x xs.

Homogeneous list—can't have different item types in the same list, e.g.,

Cons "albert" (Cons True Nil)

is illegal. Because what would be its type, MyList String? MyList Bool?

Example: binary [search] tree of user-chosen key type.

data BST a = Empty | Node (BST a) a (BST a)
    deriving Show

exampleBSTChar :: BST Char
exampleBSTChar = Node (Node Empty 'c' Empty) 'g' (Node Empty 'j' Empty)

bstInsert :: Ord a => a -> BST a -> BST a
-- "Ord a =>" to be explained later under Type Classes. For now it means I can
-- do "<", "<=" comparisons.
bstInsert k Empty = Node Empty k Empty
bstInsert k inp@(Node left key right)
    | k < key = Node (bstInsert k left) key right
    | k > key = Node left key (bstInsert k right)
    | otherwise = inp

in HaskellTypes1.hs

Exercise: Rewrite BST and bstInsert for dictionary, not just set. I.e., not just key, but also value. Start like this:

data BST k v = Empty | Node ...

Some polymorphic algebraic data types from the standard library as further examples:

data Maybe a = Nothing | Just a
-- Great for: Sometimes there is no answer.

data Either a b = Left a | Right b
-- Great for: Having two possible types.
-- Example: Want a list with some String elements and some Bool elements:
-- [Left "albert", Right True, Left "trebla"] :: [Either String Bool]

Contents

Haskell Types Part 1

Nested lambdas (“currying”) and nested function applications

Higher-order functions

Parametric polymorphism

Curried, higher-order, polymorphic example: map

Type-specific behaviour preview

User-defined types: Algebraic data types

Disjoint union of cartesian products

Recursive types

Polymorphic types

Curried, higher-order, polymorphic example: `map`