===========================================================================
CSC B63               Lecture Summary for Week 9                Winter 2007
===========================================================================

[[Q: denotes a question that you should think about and that will be
     answered during lecture. ]]

--------------------
Breadth-First Search  [ Section 22.2 ]
--------------------

  * Starting from a specified "source" vertex s in V, BFS visits every
    vertex v in G that can be reached from s and, in the process,
    constructs for each such v a path from s to v with the smallest
    number of edges. Together, these paths form a "BFS-tree" of the
    graph. BFS works on directed or undirected graphs; we describe it
    for directed graphs.

  * To keep track of progress, each vertex is given a "colour", which is
    initially white. The first time that a vertex is encountered, its
    colour is changed to gray, and once a vertex has been examined (the
    difference between "encountered" and "examined" is explained in the
    next point), its colour is changed to black. At the same time, for
    each vertex v, we keep track of the predecessor (the parent) of v in
    the BFS tree, p[v], and of the "distance" from s to v (the number of
    edges on the path from s to v), d[v].

  * Intuitively, white vertices are "unknown" to BFS, black vertices are
    "known" and have been fully "explored" (i.e., BFS has encountered all
    their neighbours), and gray vertices are known but not fully
    explored: they represent the "frontier". The distinction between
    black and gray vertices is important: it is how BFS keeps track of
    which vertices to explore next, so that it really is working in a
    "breadth-first" manner. To manage the gray vertices, BFS stores them
    in a queue so that they are dealt with in a first-in, first-out
    manner.
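
  * The bookkeeping described above (colours, d[v], p[v], and a FIFO
    queue) can be sketched in Python. This is only an illustrative
    sketch that follows the same steps as the BFS pseudocode in these
    notes. The dictionary-of-lists graph representation, the names
    `bfs` and `path_to`, and the small example graph are assumptions
    made here, not part of the course material.

```python
from collections import deque

WHITE, GRAY, BLACK = "white", "gray", "black"


def bfs(adj, s):
    """Breadth-first search from s.

    adj is assumed to map every vertex to a list of its out-neighbours
    (every vertex must appear as a key, even if its list is empty).
    Returns (d, p): edge-count distances from s and BFS-tree parents.
    """
    colour = {v: WHITE for v in adj}
    d = {v: float("inf") for v in adj}
    p = {v: None for v in adj}

    colour[s] = GRAY
    d[s] = 0
    queue = deque([s])              # the gray "frontier", in FIFO order

    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if colour[v] == WHITE:  # first time v is encountered
                colour[v] = GRAY
                d[v] = d[u] + 1
                p[v] = u
                queue.append(v)
        colour[u] = BLACK           # u's adjacency list is fully examined
    return d, p


def path_to(p, s, v):
    """Follow parent pointers from v back to s; None if v is unreachable."""
    path = [v]
    while path[-1] != s:
        parent = p[path[-1]]
        if parent is None:
            return None
        path.append(parent)
    return path[::-1]


# Hypothetical example graph: x has an edge into s but is not reachable from s.
graph = {"s": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["t"], "t": [], "x": ["s"]}
d, p = bfs(graph, "s")
print(d["t"], path_to(p, "s", "t"))
```

    Unreachable vertices keep d[v] = infinity and p[v] = NIL (None in
    the sketch), which is how `path_to` detects that no path exists.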

    BFS(G=(V,E),s)
        for all vertices v in V
            colour[v] := white
            d[v] := infinity
            p[v] := NIL
        end for
        initialize an empty queue Q
        colour[s] := gray
        d[s] := 0
        p[s] := NIL
        ENQUEUE(Q,s)
        while Q is not empty do
            u := DEQUEUE(Q)
            for each edge (u,v) in E do
                if colour[v] == white then
                    colour[v] := gray
                    d[v] := d[u] + 1
                    p[v] := u
                    ENQUEUE(Q,v)
                end if
            end for
            colour[u] := black
        end while
    END BFS

  * Look at the example in the textbook.

  * Each node is enqueued at most once, since a node is enqueued only
    when it is white, and its colour is changed the first time it is
    enqueued. In particular, this means that the adjacency list of each
    node is examined at most once, so that the total running time of BFS
    is O(n+m), where n = |V| and m = |E|: linear in the size of the
    adjacency-list representation of the graph.

  * We can show that at the end of BFS, d[v] is equal to the number of
    edges on a shortest path from s to v (i.e., a path with the smallest
    number of edges).
    -> proof is outlined in "BFS computes shortest-path proof" handout

  * Applications:
      - Computing single-source shortest paths / distances in an
        unweighted graph (e.g., finding a shortest path through a maze).
      - Discovering connected components in a graph.
      - Identifying bipartite graphs, finding a 2-colouring of a graph.
      - Used for traversing decision trees in artificial intelligence.

------------------
Depth-First Search  [ Section 22.3 ]
------------------

  * Just like for BFS, each vertex will be coloured white (when it hasn't
    been "discovered" yet), gray (when it's been encountered but its
    adjacency list hasn't been completely visited yet), or black (when
    its adjacency list has been completely visited). The philosophy of
    DFS is "go as far as possible before backtracking", so we will also
    keep track of two "timestamps" for each vertex: d[v] will indicate
    the discovery time (when the vertex was first encountered) and f[v]
    will indicate the finish time (when it's been completely visited).

  * In order to implement the "depth-first" strategy, it is very natural
    to write DFS recursively.
    (We could also use a stack instead of a queue to keep track of the
    vertices that remain to be examined, and write the algorithm
    iteratively.) Because DFS is commonly used to find information about
    the connected components of a graph, there is one more "twist" to
    the implementation: instead of being given a start vertex s as in
    BFS, the main DFS subroutine is called repeatedly, once on each
    vertex that is still unvisited, until all vertices have been
    visited. (Note that the same trick could be used with BFS to
    entirely visit each connected component of a graph.)

  * Depth-First Search algorithm:

    DFS(G=(V,E))
        for each vertex v in V
            colour[v] := white
            d[v] := infinity
            f[v] := infinity
            p[v] := NIL
        end for
        time := 0  (* global *)
        for each vertex v in V
            if colour[v] == white then
                DFS-VISIT(G,v)
            end if
        end for
    END DFS

    DFS-VISIT(G=(V,E),u)
        colour[u] := gray
        time := time + 1
        d[u] := time
        for each edge (u,v) in E
            if colour[v] == white then
                p[v] := u
                DFS-VISIT(G,v)
            end if
        end for
        colour[u] := black
        time := time + 1
        f[u] := time
    END DFS-VISIT

  * Look at the example in the textbook.

  * As for BFS, since DFS-VISIT is only called on white vertices, and
    the vertices immediately become gray, DFS-VISIT is called at most
    once for each vertex. Also, for each vertex, we visit its adjacency
    list at most once, so the total running time is Theta(n+m): just as
    for BFS, linear in the size of the adjacency-list representation.

  * Note that DFS constructs a "DFS-tree" for the graph (or a
    "DFS-forest" if it is called on each connected component), by
    keeping track of a predecessor p[v] for each node v. For certain
    applications, we need to distinguish between different types of
    edges:
      - Tree Edges are the edges in the DFS tree.
      - Back Edges are edges from a vertex u to an ancestor of u in the
        DFS tree.
      - Forward Edges are non-tree edges from a vertex u to a descendant
        of u in the DFS tree.
      - Cross Edges are all the other edges that are not part of the DFS
        tree (from a vertex u to another vertex v that is neither an
        ancestor nor a descendant of u in the DFS tree).

    [[Q: All these types of edges can appear in a DFS-forest of a
         directed graph... but what about an undirected graph? Can all
         these types of edges appear in a DFS-forest of an undirected
         graph?]]

  * It's possible to prove many interesting properties about the
    timestamps d[v] and f[v] maintained for each vertex. For example,
    they have "parenthesis structure", i.e., for any two distinct
    vertices u and v, exactly one of the following holds:
      - v is a descendant of u in the DFS-tree and [d[v],f[v]] is
        entirely contained within [d[u],f[u]], or
      - u is a descendant of v in the DFS-tree and [d[u],f[u]] is
        entirely contained within [d[v],f[v]], or
      - neither vertex is a descendant of the other and the intervals
        [d[u],f[u]] and [d[v],f[v]] are disjoint (i.e., they do not
        overlap).

  * White-path Theorem (Theorem 22.9): In a depth-first forest of a
    (directed or undirected) graph G = (V, E), vertex v is a descendant
    of vertex u if and only if at the time d[u] that the search
    discovers u, vertex v can be reached from u along a path consisting
    entirely of white vertices.

    Note: You can use any of the theorems given in class for your
    assignments, on the exams, etc.

  * Applications:
      - Discovering cycles in a graph.
      - Discovering connected components, and strongly connected
        components in a graph.
      - Topologically sorting a directed acyclic graph (i.e., ordering
        the vertices so that if there is a directed edge (u,v), then u
        comes before v in the ordering).
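
  * To make the timestamps, the edge types and the topological-sort
    application concrete, here is a minimal Python sketch of recursive
    DFS on a directed graph. It is one possible illustration, not the
    course's reference implementation. The dictionary-of-lists
    representation, the names `dfs` and `topological_order`, and the
    example graph are assumptions made here. It classifies edges with
    the usual colour test (white: tree, gray: back, black: forward if
    d[u] < d[v], else cross), assumes a directed graph without parallel
    edges, and obtains a topological order by listing vertices in
    decreasing finish time.

```python
WHITE, GRAY, BLACK = "white", "gray", "black"


def dfs(adj):
    """Depth-first search over every vertex of a directed graph.

    adj is assumed to map every vertex to a list of its out-neighbours.
    Returns (d, f, p, edge_type, finished): discovery times, finish
    times, parents, a label for each edge, and the vertices in order of
    increasing finish time.
    """
    colour = {v: WHITE for v in adj}
    d, f = {}, {}
    p = {v: None for v in adj}
    edge_type = {}
    finished = []
    time = 0

    def visit(u):
        # Recursive, so very deep graphs may hit Python's recursion limit.
        nonlocal time
        colour[u] = GRAY
        time += 1
        d[u] = time
        for v in adj[u]:
            if colour[v] == WHITE:
                edge_type[(u, v)] = "tree"
                p[v] = u
                visit(v)
            elif colour[v] == GRAY:
                edge_type[(u, v)] = "back"      # v is an ancestor of u
            elif d[u] < d[v]:
                edge_type[(u, v)] = "forward"   # v is a finished descendant
            else:
                edge_type[(u, v)] = "cross"
        colour[u] = BLACK
        time += 1
        f[u] = time
        finished.append(u)

    for v in adj:
        if colour[v] == WHITE:
            visit(v)
    return d, f, p, edge_type, finished


def topological_order(adj):
    """Vertices in decreasing finish time; rejects graphs with a cycle."""
    _, _, _, edge_type, finished = dfs(adj)
    if "back" in edge_type.values():
        raise ValueError("graph has a cycle, so no topological order exists")
    return finished[::-1]


# Hypothetical example: a small directed acyclic graph.
dag = {"a": ["b", "c", "d"], "b": ["d"], "c": ["d"], "d": []}
d, f, p, edge_type, finished = dfs(dag)
print(edge_type)
print(topological_order(dag))
```

    In this example the intervals [d[v],f[v]] of b and c are disjoint
    (neither is a descendant of the other), and both lie inside the
    interval of a, as the parenthesis structure predicts.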