===========================================================================
CSC B63               Lecture Summary for Week 11               Winter 2007
===========================================================================

--------------------------------
Minimum Spanning Tree algorithms (continued from last week)
--------------------------------

[ finished MST notes from week 10 posting ]

Running time of MST algorithms:
  - Kruskal (using disjoint sets with "union-by-rank" and path
    compression): O(m log n)
  - Prim (using heap): O(m log n)
  - Prim (using Fibonacci heap): O(m + n log n)

------------------------
Approximation algorithms [ chapter 35.2, not on exam ]
------------------------

[ most of this is a preview of courses like C73 or for interest, but it's a
  nice example of using our data structures and algorithms to solve a hard
  problem ]

Some problems are hard to solve.
  e.g., NP-complete problems: no one knows how to solve them in worst-case
  polynomial time.

- Perhaps a "good" (near-optimal) answer found efficiently is good enough.

- "approximation ratio": for a minimization problem, an algorithm has
  approximation ratio r(n) if for any input of size n,
        C_alg / C_opt <= r(n)
  where C_alg is the "cost" of the solution found by the algorithm and
  C_opt is the "cost" of the optimal solution.
    - in some cases, the ratio does not depend on n (constant approximation
      ratio)

- "r(n)-approximation algorithm": an algorithm (usually polynomial time)
  that achieves an r(n) approximation ratio

Travelling Salesperson Problem (TSP):
  - Input: complete graph G=(V,E) with non-negative (integer) edge costs c(e)
  - Output: a tour of G with minimum cost
      - a tour is a Hamiltonian cycle, that is, a simple cycle that visits
        every vertex in G
      - cost of a tour is the sum of edge costs in the tour
  - TSP is NP-complete in general (stated as a decision problem)
  - TSP is also hard to approximate: in general, no polynomial-time
    algorithm achieves a constant approximation ratio unless P = NP
  - "triangle inequality": one side of a triangle is no more than the sum of
    the other two sides
        c(u,w) <= c(u,v) + c(v,w)
      - common when considering Euclidean or metric spaces
  - TSP with triangle inequality: also NP-complete, but often interesting
      - can approximate the solution within a factor of 2

TSP with triangle inequality:
  - example: vertices are locations on a 6x5 grid
      vertices: a at (1,4), b at (2,1), c at (0,1), d at (3,4),
                e at (4,3), f at (3,2), g at (5,2), h at (2,0).
      edge costs: distance in Euclidean plane
  - lower bound: cost of a minimum spanning tree
      - an optimal tour minus one edge is a spanning tree
  - upper bound: 2*cost of a minimum spanning tree
      - we can traverse each edge of the MST twice to visit all the vertices
      - need to make this more precise to prove
  - example: an MST of the example above is
      {(b,c), (b,h), (a,d), (d,e), (e,f), (e,g), (b,f)}
  - algorithm:
      pick a root vertex r
      compute an MST T from the root r
      L <- order of vertices in preorder walk of T
      return Hamiltonian cycle H produced by visiting vertices in order L
        (ignoring duplicated vertices)
  - running time: polynomial, dominated by the MST computation
      (e.g., O(m log n) with Prim using a heap, where m = n(n-1)/2 for a
      complete graph); the preorder walk takes O(n)
  - Theorem: This is a 2-approximation algorithm for the TSP with triangle
    inequality.
    Proof:
      Let Opt be an optimal tour, let T be an MST.
      Deleting an edge from Opt yields a spanning tree, thus
      c(T) <= c(Opt).
      A full walk W of T lists each vertex every time it is encountered in a
      preorder traversal of T (when first visited, and again on returning
      from each of its subtrees).
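        e.g., for the example MST rooted at a: a d e f b c b h b f e g e d a
      The walk traverses every edge of T exactly twice, so c(W) = 2*c(T).
      Combining with the above inequality, c(W) <= 2*c(Opt).
      In general, W is not a tour, since vertices may appear multiple times.
      However, we can delete a repeated vertex from W without increasing the
      cost: if W contains u, v, w in sequence and v was already visited,
      going directly from u to w costs c(u,w) <= c(u,v) + c(v,w) by the
      triangle inequality.
      Keep deleting repeated vertices (other than the final return to the
      root) until a valid tour H remains.
        e.g., the walk above becomes a d e f b c h g a, which is the
        preorder listing of T followed by a return to a.
      Then c(H) <= c(W) <= 2*c(Opt), showing our approximation ratio.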
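
[ for interest: the algorithm above can be sketched in Python on the example
  points. This is only a minimal sketch, not a definitive implementation. It
  assumes Euclidean edge costs and uses the simple O(n^2) array version of
  Prim rather than a heap. The names POINTS, dist, prim_mst, preorder,
  tree_cost, tour_cost and approx_tsp_tour are made up for this illustration
  and do not come from the textbook. ]

```python
import math

# Example vertices from the notes (assumed Euclidean edge costs).
POINTS = {
    "a": (1, 4), "b": (2, 1), "c": (0, 1), "d": (3, 4),
    "e": (4, 3), "f": (3, 2), "g": (5, 2), "h": (2, 0),
}


def dist(u, v):
    """Euclidean distance between two named vertices."""
    (x1, y1), (x2, y2) = POINTS[u], POINTS[v]
    return math.hypot(x1 - x2, y1 - y2)


def prim_mst(root):
    """Simple O(n^2) Prim; returns the MST as {vertex: [children]}."""
    children = {v: [] for v in POINTS}
    # best[v] = (cheapest known cost to attach v to the tree, tree endpoint)
    best = {v: (dist(root, v), root) for v in POINTS if v != root}
    while best:
        v = min(best, key=lambda x: best[x][0])
        _, parent = best.pop(v)
        children[parent].append(v)
        for w in best:
            d = dist(v, w)
            if d < best[w][0]:
                best[w] = (d, v)
    return children


def preorder(children, root):
    """Vertices in order of first visit (the full walk minus repeats)."""
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(reversed(children[v]))
    return order


def tree_cost(children):
    """Sum of edge costs in the tree."""
    return sum(dist(p, c) for p, cs in children.items() for c in cs)


def tour_cost(order):
    """Cost of the cycle that visits `order` and returns to the start."""
    n = len(order)
    return sum(dist(order[i], order[(i + 1) % n]) for i in range(n))


def approx_tsp_tour(root="a"):
    """Return (tour, tree): preorder tour of an MST rooted at `root`."""
    tree = prim_mst(root)
    return preorder(tree, root), tree


if __name__ == "__main__":
    tour, tree = approx_tsp_tour("a")
    print("tour:", " ".join(tour))
    print("c(T) =", round(tree_cost(tree), 3))
    print("c(H) =", round(tour_cost(tour), 3))
```

[ the proof guarantees c(H) <= 2*c(T) <= 2*c(Opt) for any root; the sketch
  lets you check the first inequality numerically on the example. ]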