Haskell Types Part 2

Type classes—operator overloading

How Haskell overloads operators and functions like ==, <=, + for various types, and how you can extend them to your own types, and how you can declare your own overloadable operations.

A “type class” declares a group of overloaded operations (“methods”).

Take == and /= for example. The type class is Eq from the standard library, == and /= are its methods:

class Eq a where
    (==), (/=) :: a -> a -> Bool

    -- default implementation for (==)
    x == y = not (x /= y)

    -- default implementation for (/=)
    x /= y = not (x == y)

    -- default implementations deliberately circular so you just have to
    -- implement one of them to break the cycle

To implement these methods for a type, e.g., the standard library has this for Bool:

instance Eq Bool where
    False == False    = True
    True == True      = True
    _ == _            = False

    -- default implementation for (/=) takes hold

We say “Bool is an instance of Eq”.

WARNING:

A class is not a type. Eq is not a type. These are illegal:
```
foo :: Eq -> Eq -> Bool
bar :: Eq a -> Eq a -> Bool
```
A type is not a “subclass”. Bool is not a “subclass” of Eq.

If you write a polymorphic function that uses a method, then its type will carry the corresponding class names to mark the fact that your function is only good for instances of the classes. (This also happens to the types of the methods themselves.)

-- Determine whether the 3 arguments are mutually equal.  Needs the user-chosen
-- type to be an instance of Eq (supports "==") but otherwise polymorphic.
eq3 :: Eq a => a -> a -> a -> Bool
eq3 x y z = x==y && y==z

in HaskellTypes2.hs

Another example: <=, <, etc. The class is called Ord:

class Eq a => Ord a where
    compare :: a -> a -> Ordering
    (<), (<=), (>), (>=) :: a -> a -> Bool
    max, min :: a -> a -> a

    -- there are default implementations; an instance just has to provide
    -- compare or <=

The “Eq a =>” near the beginning means: An instance of Ord is also an instance of Eq. This is best understood from two angles:

If you're making a type an instance of Ord, ensure that it is also an instance of Eq (write it yourself or import from someone else's code).
In type signatures, if you say “Ord a”, then you don't need to say “Eq a”, it's included.

We say “Ord is a subclass of Eq”, but beware that this is unrelated to OOP subclasses.

-- Sorting algorithms just need Ord but otherwise polymorphic.
insertionSort :: Ord a => [a] -> [a]
insertionSort [] = []
insertionSort (x:xt) = insert x (insertionSort xt)
  where
    insert e [] = [e]
    insert e xs@(x:xt)
        | e <= x = e : xs
        | otherwise = x : insert e xt

-- De-duplicating sorting uses both (==) and (<).  Type sig just needs Ord.
uniqInsertionSort :: Ord a => [a] -> [a]
uniqInsertionSort [] = []
uniqInsertionSort (x:xt) = insert x (uniqInsertionSort xt)
  where
    insert e [] = [e]
    insert e xs@(x:xt)
        | e < x = e : xs
        | e == x = xs
        | otherwise = x : insert e xt

in HaskellTypes2.hs

Recall we defined our own list types:

data MyIntegerList = INil | ICons Integer MyIntegerList
    deriving Show
data MyList a = Nil | Cons a (MyList a)
    deriving Show

in HaskellTypes2.hs

Let's make them instances of Eq so we can use == on them too:

instance Eq MyIntegerList where
    INil == INil              = True
    ICons x xs == ICons y ys  = x==y && xs==ys
    _ == _                    = False

instance Eq a => Eq (MyList a) where
-- The "Eq a =>" there means I need to use (==) on type a.
-- Try omitting it to see what happens.
    Nil == Nil              = True
    Cons x xs == Cons y ys  = x==y && xs==ys
                              -- "x==y" is when we need to assume Eq a.
    _ == _                  = False

in HaskellTypes2.hs

Check out these other type classes: Bounded, Enum (its methods are behind the [1..n] notation), Show, Read.

Reference: Haskell 2010 Section 6.3: Standard Haskell Classes

Often it is straightforward but boring to write instances for these classes, so the computer offers to auto-gen for you. Restrictions apply. You request it at the definition of your algebraic data type like this:

data MyType = ... deriving (Eq, Ord, Bounded, Enum, Show, Read)

For example the computer would write the same == algorithms above for MyIntegerList and MyList.

Reference: Haskell 2010 Chapter 11: Specification of Derived Instances

What other statically-typed languages do

C++

No change to the prototype:

template<T> SLL<T> insertionSort(SLL<T>)

But “template” is like macro. At call site, compiler has access to the insertionSort template code (to expand it right there), can see it uses <=, can check the user-chosen type supports it.

Works, but tight coupling and all its problems: defeats separate compilation, poor error messages.

There was a proposal called “concept” along the line of Haskell's type classes, but was not adopted.

Java, Scala, Rust, etc.

Type signature states that the type variable has to extend an interface:

<T extends Comparable<T>> SLL<T> insertionSort(SLL<T>)

Learned from Haskell type classes but adapted to OOP. (In fact, Java generics came from Phil Wadler who brought it from Haskell parametric polymorphism.)

Good de-coupling and all its benefits (Haskell type classes too):

Friendly to separate compilation:
When compiling insertionSort: Code is allowed to use comparison methods, does not need to know use sites.
When compiling use site: Compiler checks that user-chosen type extends Comparable, does not need to read insertionSort internal code, only needs to read type sig.
Good error messages at use site: Just say “user-chosen type doesn't extend Comparable”.

SML, Caml, OCaml

A module can take other modules as parameters. Real Template Methods.

Put insertionSort into a module—call it Sorter—but it takes as parameter another module that provides comparison operators.

(* Example in OCaml *)
module Sorter(Cmp : sig
                      type t
                      val leq : (t,t) -> bool
                    end)
= struct
    let rec insertionSort xs = ...
      (* insertion sort code; use Cmp.leq to compare *)
end

To use Sorter for ints, write a module that contains int comparison operators, use it to instantiate Sorter.

(* Example in OCaml *)
module IntCmp = struct
    type t = int
    let leq (x,y) = x<=y
end

module IntSorter = Sorter(IntCmp)
(* Now can use IntSorter.insertionSort for [int] *)

Similarly for other element types, even alternative orders.

Number types and classes

Standard library has Int, Integer, Rational, Float (single-precision floating point), Double (double-precision floating point), Complex a (a is Double or Float usually).

Number operations are grouped into several type classes, e.g.,

Num:
some methods: +, -, *, abs
instances: all number types

Integral:
some methods: div, mod
instances: Int, Integer

Fractional:
some methods: /, recip
instances: Rational, Float, Double, Complex a

You can add your own number type by making it an instance of the relevant classes. (E.g., you could have written your own Rational and Complex.)

Reference: Haskell 2010 Section 6.4: Numbers

Exercise: Why is this a type error?

let xs :: [Double]
    xs = [1, 2, 3]
in sum xs / length xs

Answer:
sum xs :: Double
length xs :: Int
(/) wants the two operands to be of the same type.

How to fix: sum xs / fromIntegral (length xs)
or use realToFrac

fromIntegral :: (Integral a, Num b) => a -> b
realToFrac :: (Real a, Fractional b) => a -> b

Example: Machine epsilon.

https://gist.github.com/AndrewBarfield/2557034 is forced to write two copies of the same code for two number types because C# is a stupid language.

This is criminal in this the 21st Century.

(But see https://twitter.com/mosheroperandi/status/856946180810354688 for even worse!)

A satisfactory language allows you to hand in just one copy:

epsilon :: (Ord a, Fractional a) => a
epsilon = last notTooSmall
  where
    halves = iterate (/ 2) 1
    notTooSmall = takeWhile (\e -> 1 + e > 1) halves

epsilonDouble :: Double
epsilonDouble = epsilon

epsilonFloat :: Float
epsilonFloat = epsilon

in HaskellTypes2.hs

Good type classes and bad type classes

A good type class has these traits:

You have multiple instances.
Methods satisfy useful laws or expectations, therefore can be used to build useful general algorithms.
Example: Ord: <= is reflexive, transitive, anti-symmetric, total. These laws are the basis of sorting algorithms and binary search tree algorithms.

Bad type class: Created for no further benefit than:

Common method name.
Procrastinating writing actual code, procrastinating making up your mind what to do.

True of any abstraction, e.g., OOP abstract classes and intefaces.

Foldable

The purpose of this section is twofold:

This explains why some of the library functions for lists have types like
```
length :: Foldable t => t a -> Int
```
instead of the simpler
```
length :: [a] -> Int
```
This familiarizes you with things like “Foldable t” and how it is not “Foldable (t a)”.

A few library functions that consume a list and compute a “summary”:

length :: [a] -> Int

sum :: Num a => [a] -> a

minimum :: Ord a => [a] -> a
-- assumes non-empty list

foldr :: (a -> b -> b) -> b -> [a] -> b

(Recall: “[a]” can be also written as “[] a”.)

These make sense for other data structures representing sequences too, not just linked lists. For example, these are also reasonable:

sum :: Num a => Vector a -> a
-- Vector is an array, 0-based Int index. Third-party library.

sum :: Num a => Seq a -> a
-- Seq is a middle ground between array and linked list, constant time
-- prepend and append, logarithmic time random access.

Haskell supports this generalization with a type class, but in a way you haven't thought of or seen in other languages. You knew from both Java and Haskell this kind of generalization:

from: [] Integer, [] Char, [] String
to: [] a

But now I'm telling you to do this generalization:

from: [] Integer, Vector Integer, Seq Integer
to: t Integer

With that, here is the type class:

class Foldable t where
    length :: t a -> Int
    sum :: Num a => t a -> a
    minimum :: Ord a => t a -> a
    foldr :: (a -> b -> b) -> b -> t a -> b
    -- and others

Important: It is not “class Foldable (t a)”. Likewise, instances go like “instance Foldable []”, not “instance Foldable ([] a)”.