How Haskell overloads operators and functions like
==
, <=
, +
for various types, and how
you can extend them to your own types, and how you can declare your own
overloadable operations.
A “type class” declares a group of overloaded operations (“methods”).
Take ==
and /=
for example. The type class is
Eq
from the standard library, ==
and /=
are its methods:
class Eq a where (==), (/=) :: a -> a -> Bool -- default implementation for (==) x == y = not (x /= y) -- default implementation for (/=) x /= y = not (x == y) -- default implementations deliberately circular so you just have to -- implement one of them to break the cycle
To implement these methods for a type, e.g., the standard library has this
for Bool
:
instance Eq Bool where False == False = True True == True = True _ == _ = False -- default implementation for (/=) takes hold
We say “Bool
is an instance of Eq
”.
WARNING:
A class is not a type. Eq
is not a type. These are
illegal:
foo :: Eq -> Eq -> Bool bar :: Eq a -> Eq a -> Bool
A type is not a “subclass”. Bool
is not a “subclass”
of Eq
.
If you write a polymorphic function that uses a method, then its type will carry the corresponding class names to mark the fact that your function is only good for instances of the classes. (This also happens to the types of the methods themselves.)
-- Determine whether the 3 arguments are mutually equal. Needs the user-chosen -- type to be an instance of Eq (supports "==") but otherwise polymorphic. eq3 :: Eq a => a -> a -> a -> Bool eq3 x y z = x==y && y==zin HaskellTypes2.hs
Another example: <=
, <
, etc. The class is
called Ord
:
class Eq a => Ord a where compare :: a -> a -> Ordering (<), (<=), (>), (>=) :: a -> a -> Bool max, min :: a -> a -> a -- there are default implementations; an instance just has to provide -- compare or <=
The “Eq a =>
” near the beginning means: An instance
of Ord
is also an instance of Eq
. This is best
understood from two angles:
If you're making a type an instance of Ord
, ensure that it
is also an instance of Eq
(write it yourself or import from
someone else's code).
In type signatures, if you say “Ord a
”, then you don't
need to say “Eq a
”, it's included.
We say “Ord
is a subclass of Eq
”, but beware that
this is unrelated to OOP subclasses.
-- Sorting algorithms just need Ord but otherwise polymorphic. insertionSort :: Ord a => [a] -> [a] insertionSort [] = [] insertionSort (x:xt) = insert x (insertionSort xt) where insert e [] = [e] insert e xs@(x:xt) | e <= x = e : xs | otherwise = x : insert e xt -- De-duplicating sorting uses both (==) and (<). Type sig just needs Ord. uniqInsertionSort :: Ord a => [a] -> [a] uniqInsertionSort [] = [] uniqInsertionSort (x:xt) = insert x (uniqInsertionSort xt) where insert e [] = [e] insert e xs@(x:xt) | e < x = e : xs | e == x = xs | otherwise = x : insert e xtin HaskellTypes2.hs
Recall we defined our own list types:
data MyIntegerList = INil | ICons Integer MyIntegerList deriving Show data MyList a = Nil | Cons a (MyList a) deriving Showin HaskellTypes2.hs
Let's make them instances of Eq
so we can use ==
on
them too:
instance Eq MyIntegerList where INil == INil = True ICons x xs == ICons y ys = x==y && xs==ys _ == _ = False instance Eq a => Eq (MyList a) where -- The "Eq a =>" there means I need to use (==) on type a. -- Try omitting it to see what happens. Nil == Nil = True Cons x xs == Cons y ys = x==y && xs==ys -- "x==y" is when we need to assume Eq a. _ == _ = Falsein HaskellTypes2.hs
Check out these other type classes:
Bounded
,
Enum
(its methods are behind the [1..n]
notation),
Show
,
Read
.
Reference: Haskell 2010 Section 6.3: Standard Haskell Classes
Often it is straightforward but boring to write instances for these classes, so the computer offers to auto-gen for you. Restrictions apply. You request it at the definition of your algebraic data type like this:
data MyType = ... deriving (Eq, Ord, Bounded, Enum, Show, Read)
For example the computer would write the same ==
algorithms
above for MyIntegerList and MyList.
Reference: Haskell 2010 Chapter 11: Specification of Derived Instances
No change to the prototype:
template<T> SLL<T> insertionSort(SLL<T>)
But “template” is like macro. At call site, compiler has access to the
insertionSort
template code (to expand it right there), can see it uses
<=
, can check the user-chosen type supports it.
Works, but tight coupling and all its problems: defeats separate compilation, poor error messages.
There was a proposal called “concept” along the line of Haskell's type classes, but was not adopted.
Type signature states that the type variable has to extend an interface:
<T extends Comparable<T>> SLL<T> insertionSort(SLL<T>)
Learned from Haskell type classes but adapted to OOP. (In fact, Java generics came from Phil Wadler who brought it from Haskell parametric polymorphism.)
Good de-coupling and all its benefits (Haskell type classes too):
Friendly to separate compilation:
When compiling insertionSort
: Code is allowed to use
comparison methods, does not need to know use sites.
When compiling use site: Compiler checks that user-chosen type extends
Comparable
, does not need to read insertionSort internal code,
only needs to read type sig.
Good error messages at use site: Just say “user-chosen type doesn't
extend Comparable
”.
A module can take other modules as parameters. Real Template Methods.
Put insertionSort
into a module—call it
Sorter
—but it takes as parameter another module that provides
comparison operators.
(* Example in OCaml *) module Sorter(Cmp : sig type t val leq : (t,t) -> bool end) = struct let rec insertionSort xs = ... (* insertion sort code; use Cmp.leq to compare *) end
To use Sorter
for ints, write a module that contains int
comparison operators, use it to instantiate Sorter
.
(* Example in OCaml *) module IntCmp = struct type t = int let leq (x,y) = x<=y end module IntSorter = Sorter(IntCmp) (* Now can use IntSorter.insertionSort for [int] *)
Similarly for other element types, even alternative orders.
Standard library has Int
, Integer
, Rational
,
Float
(single-precision floating point), Double
(double-precision floating point), Complex a
(a
is Double
or Float
usually).
Number operations are grouped into several type classes, e.g.,
Num
:
some methods: +
, -
, *
, abs
instances: all number types
Integral
:
some methods: div
, mod
instances: Int
, Integer
Fractional
:
some methods: /
, recip
instances: Rational
, Float
, Double
,
Complex a
You can add your own number type by making it an instance of the relevant
classes. (E.g., you could have written your own Rational
and
Complex
.)
Reference: Haskell 2010 Section 6.4: Numbers
Exercise: Why is this a type error?
let xs :: [Double] xs = [1, 2, 3] in sum xs / length xs
Answer:
sum xs :: Double
length xs :: Int
(/)
wants the two operands to be of the same type.
How to fix: sum xs / fromIntegral (length xs)
or use realToFrac
fromIntegral :: (Integral a, Num b) => a -> b realToFrac :: (Real a, Fractional b) => a -> b
Example: Machine epsilon.
https://gist.github.com/AndrewBarfield/2557034 is forced to write two copies of the same code for two number types because C# is a stupid language.
This is criminal in this the 21st Century.
(But see https://twitter.com/mosheroperandi/status/856946180810354688 for even worse!)
A satisfactory language allows you to hand in just one copy:
epsilon :: (Ord a, Fractional a) => a epsilon = last notTooSmall where halves = iterate (/ 2) 1 notTooSmall = takeWhile (\e -> 1 + e > 1) halves epsilonDouble :: Double epsilonDouble = epsilon epsilonFloat :: Float epsilonFloat = epsilonin HaskellTypes2.hs
A good type class has these traits:
You have multiple instances.
Methods satisfy useful laws or expectations, therefore can be used to build useful general algorithms.
Example: Ord
: <=
is reflexive, transitive,
anti-symmetric, total. These laws are the basis of sorting algorithms and
binary search tree algorithms.
Bad type class: Created for no further benefit than:
True of any abstraction, e.g., OOP abstract classes and intefaces.
The purpose of this section is twofold:
This explains why some of the library functions for lists have types like
length :: Foldable t => t a -> Int
instead of the simpler
length :: [a] -> Int
This familiarizes you with things like “Foldable t
”
and how it is not “Foldable (t a)
”.
A few library functions that consume a list and compute a “summary”:
length :: [a] -> Int sum :: Num a => [a] -> a minimum :: Ord a => [a] -> a -- assumes non-empty list foldr :: (a -> b -> b) -> b -> [a] -> b
(Recall: “[a]
” can be also written as “[] a
”.)
These make sense for other data structures representing sequences too, not just linked lists. For example, these are also reasonable:
sum :: Num a => Vector a -> a -- Vector is an array, 0-based Int index. Third-party library. sum :: Num a => Seq a -> a -- Seq is a middle ground between array and linked list, constant time -- prepend and append, logarithmic time random access.
Haskell supports this generalization with a type class, but in a way you haven't thought of or seen in other languages. You knew from both Java and Haskell this kind of generalization:
from: [] Integer
, [] Char
, [] String
to: [] a
But now I'm telling you to do this generalization:
from: [] Integer
, Vector Integer
, Seq Integer
to: t Integer
With that, here is the type class:
class Foldable t where length :: t a -> Int sum :: Num a => t a -> a minimum :: Ord a => t a -> a foldr :: (a -> b -> b) -> b -> t a -> b -- and others
Important: It is not “class Foldable (t a)
”. Likewise, instances
go like “instance Foldable []
”, not “instance Foldable ([]
a)
”.