===========================================================================
          CSC 373H        Lecture Summary for Week 2         Winter 2006
===========================================================================

Activity Scheduling (cont'd).

  Recall greedy algorithm for Activity Scheduling: sort by finish time.
  Alternate proof using idea of "promising" solution and an exchange
  argument -- exchange activities from an optimal solution with our
  partial solution to show our partial solution is still "promising".

  - Let S_0, S_1, ..., S_n = partial solutions constructed by algo. at
    the end of each iteration.
  - Prove by induction on i (# iterations) that S_i is "promising",
    i.e., there is some optimal solution Opt_i that extends S_i using
    only activities from {A_(i+1),...,A_n} (S_i subset of Opt_i and
    Opt_i subset of S_i U {A_(i+1),...,A_n}). Note: Opt_i may not be
    unique (there may be more than one way to achieve optimal).
    . Base case: S_0 = {} so any optimal solution Opt_0 extends S_0
      using only activities from {A_1,...,A_n}.
    . Ind. Hyp.: For some i >= 0, assume there is an optimal Opt_i that
      extends S_i using only activities from {A_(i+1),...,A_n}.
    . Ind. Step: To prove: S_(i+1) is promising w.r.t. {A_(i+2),...,A_n}.
      From S_i to S_(i+1), algo. either rejects or includes A_(i+1).
      Case 1: S_(i+1) = S_i.
        This means A_(i+1) is not compatible with S_i, so
        Opt_(i+1) = Opt_i extends S_(i+1) using only activities from
        {A_(i+2),...,A_n}.
      Case 2: S_(i+1) = S_i U {A_(i+1)}.
        Opt_i may or may not include A_(i+1), so consider both
        possibilities.
        Subcase a: A_(i+1) in Opt_i.
          Then Opt_(i+1) = Opt_i extends S_(i+1) using only activities
          from {A_(i+2),...,A_n}.
        Subcase b: A_(i+1) not in Opt_i.
          We need to argue this can only happen if Opt_i contains some
          activity that can be "exchanged" with A_(i+1) to create a new
          optimal solution Opt_(i+1). There must be some A_j in Opt_i
          that overlaps A_(i+1) (otherwise, Opt_i U {A_(i+1)} would be
          better than the optimal Opt_i).
          Also, j > i+1 because A_(i+1) is compatible with S_i.
          "Exchanging" these activities yields a new optimal solution
          that extends our partial schedule:
          Opt_(i+1) = Opt_i U {A_(i+1)} - {A_j} extends S_(i+1) using
          {A_(i+2),...,A_n} -- it contains the same number of
          activities as Opt_i, and no overlap is introduced because
          f(i+1) <= f(j) (by sorting order).
      In all cases, there is an optimal Opt_(i+1) that extends S_(i+1)
      using {A_(i+2),...,A_n}.
  - So each S_i is promising. In particular, S_n is promising w.r.t.
    {}, i.e., there is an optimal Opt_n that "extends" S_n using
    activities from {}. In other words, S_n must be optimal itself.

Minimum Spanning Tree.

  Input: Connected undirected graph G=(V,E) with positive cost
         c(e) > 0 for each edge e in E.
  Output: A spanning tree T subset of E such that cost(T) (the sum of
          the costs of the edges in T) is minimal.

  - Terminology:
    . "Spanning tree": acyclic connected subset of edges.
    . "Acyclic": does not contain any cycle.
    . "Connected": contains a path between any two vertices.

  A. Brute force: consider each possible subset of edges.
     Runtime? Exponential, even if we limit the search to spanning
     trees of G.

  D. Boruvka's algorithm (1926):
     Idea: do steps like Prim's algorithm in parallel.

       initially n trees (the individual vertices)
       repeat
           for every tree T, select a minimum-cost edge incident to T
           add all selected edges to the MST (causing trees to merge)
       until only one tree
       return this tree T

     Runtime? Analysis similar to merge sort. Each pass at least
     halves the number of trees, so there are O(log n) passes. Each
     pass takes O(m) time, so the total is O(m log n).
     Correctness? To come...

  B. Kruskal's algorithm (1956):

       // let m = |E| (# edges) and n = |V| (# vertices)
       sort edges by cost, i.e., c(e_1) <= c(e_2) <= ... <= c(e_m)
       T := {}                          // partial spanning tree
       for each v in V:  MakeSet(v)     // initialize disjoint sets
       for i := 1 to m:
           let (u,v) := e_i
           if FindSet(u) != FindSet(v): // u,v not already connected
               T := T U {e_i}
               Union(u,v)
       return T

     Runtime?
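The Kruskal pseudocode above translates almost directly into runnable Python. This is only a sketch: the union-by-size/path-compression disjoint set and the names kruskal, find, union are our own choices standing in for MakeSet/FindSet/Union.

```python
def kruskal(n, edges):
    """n = number of vertices, labelled 0..n-1;
    edges = list of (cost, u, v) tuples.
    Returns the list of MST edges, cheapest first."""
    parent = list(range(n))    # MakeSet(v) for each vertex v
    size = [1] * n

    def find(x):               # FindSet with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):           # merge the two sets, by size
        rx, ry = find(x), find(y)
        if size[rx] < size[ry]:
            rx, ry = ry, rx
        parent[ry] = rx
        size[rx] += size[ry]

    T = []
    for cost, u, v in sorted(edges):   # sort edges by cost
        if find(u) != find(v):         # u,v not already connected
            T.append((cost, u, v))
            union(u, v)
    return T
```

For example, on a 4-cycle 0-1-2-3-0 with costs 1, 2, 3, 4 plus a diagonal 0-2 of cost 5, `kruskal(4, [(1,0,1), (2,1,2), (3,2,3), (4,0,3), (5,0,2)])` keeps the three cheapest edges (total cost 6) and rejects the two that would close a cycle.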
     Theta(m log m) for sorting; the main loop involves a sequence of
     m Union and FindSet operations on n elements, which is
     Theta(m log n). Total is Theta(m log n) since log m is
     Theta(log n).
     Correctness? To come...

  C. Prim's algorithm (Jarnik 1930, Prim 1957, Dijkstra 1959):
     Idea: start with some vertex s in V (pick arbitrarily) and at
     each step, add the lowest-cost edge that connects a new vertex.
     Proof: might as well do it at the same time as for Kruskal's.

  E. Generalized MST algorithm:
     General greedy approach: build a spanning tree edge by edge,
     including appropriate "small" edges and excluding appropriate
     "large" edges. We can think of these algorithms as an
     edge-colouring process.
     - initially, all edges of the graph are uncoloured
     - one at a time, colour edges either blue (accepted) or red
       (rejected) to maintain a "colour invariant"

     Colour Invariant: there is a MST containing all the blue edges
     and none of the red edges.

     If we maintain this colour invariant and colour all the edges of
     the graph, the blue edges will form a MST!

     Terminology:
     . "cut": a vertex partition (X, V-X)
     . edge e "crosses" a cut if one end is in each side

     Rules for colouring edges:
     . Blue Rule: Select a cut that no blue edges cross. Among the
       uncoloured edges crossing the cut, select one of minimum cost
       and colour it blue.
     . Red Rule: Select a simple cycle containing no red edges. Among
       the uncoloured edges in the cycle, select one of maximum cost
       and colour it red.

     Note the nondeterminism here: we can apply the rules at any time
     and in any order.

     Correctness? What do we have to prove?
     Theorem: All the edges of a connected graph are coloured and the
       colour invariant is maintained in any application of a rule.

     To prove: The colour invariant is maintained.
     By induction on the number of edges coloured. Initially, no edges
     are coloured, so any MST satisfies the CI.
     Suppose the CI holds before the blue rule is applied, colouring
     edge e blue. Let T be a MST that satisfies the CI before e is
     coloured.
     If e in T, then T still satisfies the CI, done.
     If e not in T, consider the cut (X, V-X) used in the blue rule.
     There is a path in T joining the ends of e, and at least one edge
     e' on this path crosses the cut. By the CI, no edge of T is red,
     and by the blue rule, e' is uncoloured and c(e') >= c(e). Thus
     T - {e'} + {e} is a MST and it satisfies the CI after e is
     coloured.

     Now suppose the CI holds before the red rule is applied,
     colouring edge e red. Let T be a MST that satisfies the CI before
     e is coloured.
     If e not in T, then T still satisfies the CI, done.
     If e in T, deleting e from T divides T into 2 trees T_1 and T_2
     partitioning G (thus (T_1,T_2) is a cut). Consider the cycle
     including e used in the red rule. This cycle must have another
     edge e' crossing the cut (T_1,T_2). Since e' not in T, by the CI
     and the red rule, e' is uncoloured and c(e') <= c(e). Thus
     T - {e} + {e'} is a MST and it satisfies the CI after e is
     coloured.

     To prove: All edges in the graph are coloured? Next time...
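For comparison with the Kruskal pseudocode, Prim's idea (section C: start at an arbitrary vertex and repeatedly add the lowest-cost edge that connects a new vertex) can be sketched in Python using a heap of candidate edges. The adjacency-list representation and the name prim are our own assumptions, not from the notes.

```python
import heapq

def prim(adj, s=0):
    """adj: adjacency list {u: [(cost, v), ...]} of a connected,
    undirected graph; s: arbitrary start vertex.
    Returns the list of MST edges as (cost, u, v) tuples."""
    visited = {s}
    heap = [(c, s, v) for c, v in adj[s]]  # edges leaving the tree
    heapq.heapify(heap)
    T = []
    while heap and len(visited) < len(adj):
        c, u, v = heapq.heappop(heap)      # lowest-cost candidate
        if v in visited:                   # doesn't reach a new vertex
            continue
        visited.add(v)
        T.append((c, u, v))
        for c2, w in adj[v]:               # new candidate edges
            if w not in visited:
                heapq.heappush(heap, (c2, v, w))
    return T
```

On the same 4-cycle-plus-diagonal graph used to illustrate Kruskal's, this returns a spanning tree of the same total cost 6, as the correctness argument above predicts.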