Michael Brudno                  Lecture Notes 29-30                  8/7-8/2000

Graphs
------

A graph is a collection of nodes connected by edges. The edges can be
bi-directional or one-way. If the edges are bi-directional (whenever there is
an edge from node A to node B there is also an edge from B to A) we call the
graph undirected. Otherwise it is directed. A graph is strongly connected if
from any node you can reach any other node. For undirected graphs the
"strongly" is dropped in favor of just "connected". Not only can we have
directed and undirected graphs, but we also have weighted and unweighted ones.
A weighted graph has some number (which we call a weight) associated with each
of the edges. You can think of an unweighted graph as a graph where all the
edge weights are 1.

Graphs are useful for many real-world applications. They can be used to
simulate networks, roads, neural networks and many other things. This class
only touches the surface on graphs, so those of you who find this interesting
should take 170.

Graph Implementation
----- --------------

The first and simplest way to implement graphs is using lists of pointers,
similar to the way we represented trees and other "linked" data structures.
This has the advantage of being memory-efficient; however, answering the
question "Are nodes X and Y connected by an edge?" may take some time: we need
to search through every edge leaving X to see if one of them leads to Y.

A different way uses adjacency matrices. We can represent a graph as a matrix
of bits, where Aij is 1 iff nodes i and j are connected by an edge. This has
the advantage of quick lookups, but you pay for this with memory: a sparse
graph (a graph with N vertices and only O(N) edges is called sparse) will
still require a table of size O(N^2). For complete graphs (ones where all
possible edges are present) or nearly complete ones there is no reason not to
use them.

Exercise for the reader: If we have some graph G represented using an
adjacency matrix A, what is the meaning of A^2? A^k?

Breadth First Search
------- ----- ------

The first thing which we might want to do to a graph is to go through all its
nodes and do a certain operation on them. There are two main methods to do
this; both have their advantages and disadvantages. The first discussed here
is the breadth-first traversal. The following is an algorithm for a breadth
first traversal:

    put the starting node on the queue
    while the queue is not empty:
        take the first node off the queue
        mark it as having been traversed
        put all of its unmarked children on the queue

This algorithm will first consider all the nodes at distance 1 from the
starting node, then all of the ones at distance 2, and so on. If we have the
graph

       A -- B --- E
      /    / \     \
     /    /   \     \
    F --- G    C --- D

then the breadth first traversal starting at node A and always picking the
alphabetically first child first will result in the nodes being looked at in
the following order: A, B, F, C, E, G, D.

Depth First Search
----- ----- ------

The depth first traversal is very similar, except that it uses a stack instead
of a queue:

    push the starting node on the stack
    while the stack is not empty:
        pop the top node off the stack
        mark it as having been traversed
        push all of its unmarked children onto the stack

This algorithm will completely explore one branch of the graph before it
starts on any other. Doing a depth first traversal on the graph above with the
same conditions we get the following ordering: A, B, C, D, E, G, F. Node G is
considered before node F because it is a child of B, and as a result ends up
higher on the stack than F.
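Both traversals are easy to express in code. What follows is a minimal Java
sketch, assuming an unweighted, undirected graph stored as adjacency lists;
the class and method names are our own for illustration (Java provides no
built-in graph type). To reproduce the exact orders from the examples above,
the children would have to be visited in the right order.

    import java.util.ArrayDeque;
    import java.util.ArrayList;
    import java.util.List;

    public class Traversals {
        private List<List<Integer>> adj;   // adj.get(v) lists v's neighbors

        public Traversals(int n) {
            adj = new ArrayList<>();
            for (int i = 0; i < n; i++)
                adj.add(new ArrayList<>());
        }

        /* An undirected edge appears in both endpoints' lists. */
        public void addEdge(int u, int v) {
            adj.get(u).add(v);
            adj.get(v).add(u);
        }

        /* Breadth first: a queue hands nodes back in discovery order, so
           closer nodes are visited first. Nodes are marked when enqueued
           so that no node enters the queue twice. */
        public void bfs(int start) {
            boolean[] marked = new boolean[adj.size()];
            ArrayDeque<Integer> queue = new ArrayDeque<>();
            queue.add(start);
            marked[start] = true;
            while (!queue.isEmpty()) {
                int node = queue.remove();
                System.out.println(node);      // "a certain operation"
                for (int child : adj.get(node))
                    if (!marked[child]) {
                        marked[child] = true;
                        queue.add(child);
                    }
            }
        }

        /* Depth first: identical except that a stack replaces the queue.
           A node can be pushed twice before it is marked, so we skip it
           the second time around. */
        public void dfs(int start) {
            boolean[] marked = new boolean[adj.size()];
            ArrayDeque<Integer> stack = new ArrayDeque<>();
            stack.push(start);
            while (!stack.isEmpty()) {
                int node = stack.pop();
                if (marked[node]) continue;
                marked[node] = true;
                System.out.println(node);
                for (int child : adj.get(node))
                    if (!marked[child])
                        stack.push(child);
            }
        }
    }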
The standard search in most applications is DFS. Although both run in time
linear in the number of nodes and edges, DFS has a large advantage in memory
usage. Actually figuring out the space requirements for each of the searches
is left as an (easy) exercise for the reader.

There are several other types of traversals, the most interesting of which is
IDDFS: Iterative Deepening Depth First Search. This works just like DFS, but
it is automatically cut off at some pre-set depth, after which the search
restarts with a larger cut-off depth. This turns out to be very useful in
problems where one searches for a solution, such as games.

Dijkstra, an algorithm for finding shortest paths:
--------

We will define the shortest path between nodes A and B as a collection of
edges which make up a path between the two nodes, the sum of whose weights is
lower than the sum for any other such path. If all of the edge weights are 1
(that is, the graph is unweighted) then the algorithm is trivial: you just do
a breadth first traversal starting at one of the two nodes, and end your
traversal whenever you see the other one. The path which you followed on your
way there is necessarily the shortest. However, the problem is somewhat more
complicated if there are edge weights. Just because a certain path between the
two nodes has the smallest number of edges does not mean that it is also the
shortest path: a path of one really "heavy" edge can be more expensive than
three "light" ones. In order to find the shortest path in a weighted graph
with no negative edges (this is a requirement for Dijkstra's algorithm, but
not for all such algorithms) we follow the following algorithm:

    push all the nodes of the graph into a priority queue, ordered by a
        variable dist, where dist of the starting node is 0 and the dist
        of all others is infinity
    while the node for which we are looking has not been processed:
        pop the top node (Q) off the priority queue
        mark it as having been traversed
        for all the unmarked children (P1...Pn) of the node, set
            Pi.dist = min(Pi.dist, Q.dist + weight(Pi, Q))
        reorder the priority queue so it still maintains the heap property

Another way to think of Dijkstra's algorithm is as a breadth first traversal
where we always consider next the node with the smallest distance so far.
Let's do an example. Given the graph

            10      9
         A --- B --- E
      3 /     / \     \ 1
       /    4/   6\    \
      F ---- G    C --- D
          2          5

find the shortest distance between A and D. I will just quickly list the order
in which the elements are popped off the heap, and the distances which get
updated at each step:

    initial: A.dist = 0, all the rest are infinity
    pop(A)   F.dist = 3, B.dist = 10
    pop(F)   G.dist = 5
    pop(G)   B.dist = 9
    pop(B)   C.dist = 15, E.dist = 18
    pop(C)   D.dist = 20
    pop(E)   D.dist = 19
    pop(D)   we are done, the shortest path has length 19

It is a matter of minor bookkeeping to actually recall what the path was. (A
code sketch of this algorithm appears at the end of this section.)

This algorithm works as long as there are no negative edges. If there are, a
different algorithm must be used. That algorithm is much simpler: we keep the
minimum distance so far to all nodes (even those that we have already visited)
and keep updating these distances, relaxing edge after edge, until no distance
changes (this is the idea behind the Bellman-Ford algorithm).

Minimum Spanning Trees
------- -------- -----

A spanning tree (ST) is a collection of edges E' from a connected graph C such
that if all edges besides those in E' were removed from the graph, C would
stay connected. The minimum spanning tree (MST) is the spanning tree with the
lowest total weight.
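Before moving on to spanning trees, here is a minimal Java sketch of
Dijkstra's algorithm as described above. The graph representation (lists of
{neighbor, weight} pairs) is our own choice, and since java.util.PriorityQueue
has no "reorder"/decrease-key operation, the sketch re-inserts a node whenever
its dist improves and skips stale heap entries; this computes the same
distances as the pseudocode.

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;
    import java.util.PriorityQueue;

    public class Dijkstra {
        /* graph.get(v) holds {neighbor, weight} pairs for node v.
           Returns the shortest distance from start to every node. */
        public static long[] shortestDistances(List<List<int[]>> graph,
                                               int start) {
            int n = graph.size();
            long[] dist = new long[n];
            Arrays.fill(dist, Long.MAX_VALUE);   // "infinity"
            dist[start] = 0;
            boolean[] marked = new boolean[n];
            // Heap entries are {dist, node}, ordered by dist.
            PriorityQueue<long[]> pq =
                new PriorityQueue<>((a, b) -> Long.compare(a[0], b[0]));
            pq.add(new long[]{0, start});
            while (!pq.isEmpty()) {
                int q = (int) pq.remove()[1];    // pop the top node Q
                if (marked[q]) continue;         // stale entry: skip it
                marked[q] = true;                // mark Q as traversed
                for (int[] edge : graph.get(q)) {
                    int p = edge[0], w = edge[1];
                    if (!marked[p] && dist[q] + w < dist[p]) {
                        dist[p] = dist[q] + w;   // the min(...) update
                        pq.add(new long[]{dist[p], p});
                    }
                }
            }
            return dist;
        }
    }

Run on the example graph above (with A through G numbered 0 through 6), this
returns a distance of 19 to D, matching the trace. Prim's algorithm, described
next, is nearly the same code: the update compares w alone instead of
dist[q] + w, and each node records the edge used to reach it.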
All of the algorithms which generate MSTs rely on one basic property: given
some sub-tree of the final MST, the smallest edge leaving that subtree will
always be included in the MST. Here we will present two algorithms to find
MSTs:

Prim's Algorithm
------ ---------

Prim's algorithm is actually a variation on Dijkstra's algorithm. The idea is
pretty simple: whenever we wish to add a node to the MST, we add it using the
shortest available edge connecting that node to a node already in the MST. The
algorithm looks like:

    push all the nodes of the graph into a priority queue, ordered by a
        variable dist, where dist of the starting node is 0 and the dist
        of all others is infinity
    while the node for which we are looking has not been processed:
        pop the top node (Q) off the priority queue
        mark it as having been traversed and add it to the MST using the
            edge used to get to it
        for all the unmarked children (P1...Pn) of the node, set
            Pi.dist = min(Pi.dist, weight(Pi, Q))
        reorder the priority queue so it still maintains the heap property

Let's consider what updates/order of additions is done using the already
familiar graph

            10      9
         A --- B --- E
      3 /     / \     \ 1
       /    4/   6\    \
      F ---- G    C --- D
          2          5

Let's say we start at node C.

    pop(C)   D.dist = 5, B.dist = 6
    pop(D)   E.dist = 1
    pop(E)
    pop(B)   G.dist = 4, A.dist = 10
    pop(G)   F.dist = 2
    pop(F)   A.dist = 3
    pop(A)   and we are home free

The resulting MST looks as follows:

         A     B     E
      3 /     / \     \ 1
       /    4/   6\    \
      F ---- G    C --- D
          2          5

This algorithm depends on the fact that whenever you have two sub-graphs with
some number of edges running between them, the smallest of these edges will
always be part of the MST. This fact is relatively easy to prove (the actual
proof is left to the reader). So we can just declare that a certain node will
be part of the MST (that much we are guaranteed) and then follow the least
edge from that node. After we have looked at that child, we follow the least
edge leaving one of the two nodes, and so on. The resulting tree is guaranteed
to be minimum.

The running time for both Prim's algorithm and Dijkstra's is the same:
O((|V| + |E|) lg |V|), because we look at all the vertices and reorder the
heap at most once for each edge.

Kruskal's Algorithm
--------- ---------

The classical algorithm for finding MSTs uses a union-find structure. In
Kruskal's algorithm we will first build MSTs for subgraphs of the graph we
want to build the MST of, and then unify these subtrees together. In order to
guarantee that each of the subtrees is a minimum spanning tree, and that we
connect them in the most efficient way, we'll first order the edges by weight.
Given the by now repulsive graph

            10      9
         A --- B --- E
      3 /     / \     \ 1
       /    4/   6\    \
      F ---- G    C --- D
          2          5

let's find the MST using this second method. The following is the list of
edges in sorted order:

    (E, D): 1, (F, G): 2, (A, F): 3, (B, G): 4, (C, D): 5, (B, C): 6,
    (B, E): 9, (A, B): 10

We will now look at each edge in turn, add it to the MST and "unify" the two
sets between which the edge goes. If the edge connects two nodes already in
the same set then we just ignore it. The following lines consist of the
subsequent calls to union and the resulting sets.
    edge     MST                                     sets
    initial  { }                                     {A} {B} {C} {D} {E} {F} {G}
    (E, D)   {(E,D)}                                 {A} {B} {C} {D,E} {F} {G}
    (F, G)   {(E,D),(F,G)}                           {A} {B} {C} {D,E} {F,G}
    (A, F)   {(E,D),(F,G),(A,F)}                     {B} {C} {D,E} {F,G,A}
    (B, G)   {(E,D),(F,G),(A,F),(B,G)}               {C} {D,E} {F,G,A,B}
    (C, D)   {(E,D),(F,G),(A,F),(B,G),(C,D)}         {C,D,E} {F,G,A,B}
    (B, C)   {(E,D),(F,G),(A,F),(B,G),(C,D),(B,C)}   {C,D,E,F,G,A,B}

Since now all of the nodes are in the same set, we have created an MST. It
happens to be the exact same MST as with the previous algorithm, but this is
not always so: a graph can have many MSTs. It is true that if no two edge
weights in a graph are the same, then it has only a single MST. This fact you
do not have to know for this class, but it may come in handy later.

For the time analysis of this algorithm, it is clear that the sorting takes
O(|E| lg |E|). At first it looks as though the unification and finding
(determining which set each node belongs to, and unifying two sets) can be
expensive. However, using a technique known as path compression and a whole
bunch of amortized complexity analysis, we find out that M union-find
operations can be done in almost O(M) time. The algorithm is as follows: we
will represent each subset as a tree of nodes, where each node of the tree
corresponds to a node of the graph. Each node is represented as an element of
an int array. If the node is the root of a tree of nodes, its element will be
negative, and the negation of the element will be the height of the tree. A
positive element indicates a non-root, and stores the index of the parent of
this node. Hence initially all of the elements are -1s: each node is in its
own tree, and the height of all trees is one (well, actually zero, but who is
counting?)

    -------------------------------
    |-1|-1|-1|-1|-1|-1|-1|-1|-1|-1|
    -------------------------------
      0  1  2  3  4  5  6  7  8  9

To find whether two nodes are in the same set, we just traverse the pointers
to their roots and see if we get to the same one. To unify two sets we just
set the root of one of them to point to the other. The forest illustrated at
left below is represented by the array at right.

      8       1     2      ------------------------
     / \     /|\           |1|-2|-1|8|5|8|1|3|-3|1|
    5   3   9 0 6          ------------------------
    |   |                   0  1  2 3 4 5 6 7  8 9
    4   7

It should be clear that unifying two trees takes constant time. Finding the
root, though, is not as trivial: we must traverse a whole bunch of pointers,
so it is possible that we might get O(n) time! Here, however, the height of
the tree comes in. Whenever we unify two trees, we always make the shorter one
point to the taller one. This way the height is minimized, and lookups take
O(log n) time: note that a tree of height h must contain at least 2^h nodes,
since a tree's height only grows when we unify two trees of equal height.
Since the node count grows exponentially in the height, the height grows as
the log of the number of nodes.

We are not done, however! We can do even better: while traversing the pointers
to the root we could be doing more useful work, making the nodes which are far
away from the root point to the root directly, rather than through a myriad of
internal nodes. Hence a find call on 7 in the above representation will make 7
point directly to 8. This is known as path compression. The result is that the
amortized time per operation becomes less than log log n (the true bound
involves the even slower-growing inverse Ackermann function), which is a
constant for all practical purposes. To learn how to prove this bound take
170. As a result the sorting time dominates the whole analysis and the total
running time is O(|E| lg |E|).
The Java code for union and find is:

    public class DisjSets {
        private int[] array;

        /* Initially every node is its own root, with height zero. */
        public DisjSets(int numElements) {
            array = new int[numElements];
            for (int i = 0; i < array.length; i++)
                array[i] = -1;
        }

        /* Unifies two sets, given their roots. Heights are stored as
           negative numbers, so a smaller value means a taller tree;
           the shorter tree is always attached to the taller one. */
        public void union(int root1, int root2) {
            if (array[root2] < array[root1])       // root2's tree is taller
                array[root1] = root2;
            else if (array[root1] < array[root2])  // root1's tree is taller
                array[root2] = root1;
            else {                                 // equal heights: root1 wins
                array[root1]--;                    // and grows by one
                array[root2] = root1;
            }
        }

        /* Performs a find with path compression: every node on the path
           to the root is made to point directly to the root. */
        public int find(int x) {
            if (array[x] < 0)
                return x;
            else
                return array[x] = find(array[x]);
        }
    }
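Finally, to tie the two halves of Kruskal's algorithm together, here is a
minimal sketch that drives the DisjSets class above. The Edge helper class and
the convention that nodes are numbered 0 through n-1 are our own choices for
this example.

    import java.util.Arrays;

    public class Kruskal {
        /* A weighted edge between nodes u and v. */
        static class Edge {
            int u, v, weight;
            Edge(int u, int v, int weight) {
                this.u = u; this.v = v; this.weight = weight;
            }
        }

        /* Returns the n-1 edges of an MST of a connected graph on n
           nodes. */
        public static Edge[] mst(int n, Edge[] edges) {
            // The O(|E| lg |E|) sort dominates the running time.
            Arrays.sort(edges, (a, b) -> Integer.compare(a.weight, b.weight));
            DisjSets sets = new DisjSets(n);
            Edge[] result = new Edge[n - 1];
            int count = 0;
            for (Edge e : edges) {
                int ru = sets.find(e.u), rv = sets.find(e.v);
                if (ru != rv) {            // different sets: take the edge
                    sets.union(ru, rv);
                    result[count++] = e;
                }                          // same set: ignore the edge
            }
            return result;
        }
    }

On the example graph this accepts exactly the six edges from the table above,
in the same order in which they were unified there.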