===========================================================================
CSC 373H                Lecture Summary for Week  8             Winter 2006
===========================================================================

Selection. [Section 13.5]

  - Given list A, rank k, return k-th smallest element in A.

  - Solution 1: sort A, return element at position k.
    Runtime: Theta(n log n).

  - However, can easily be faster for special cases (k=1: find min,
    k=n: find max, both doable in linear time). Would like linear time
    for arbitrary k.

  - Idea: no need to fully sort array to find k-th smallest.
    Given list A, rank k:
      . pick pivot element p;
      . partition A into B = [elements < p], C = [elements > p];
      . if k = |B| + 1, return p;
      . if k < |B| + 1, return k-th smallest in B;
      . if k > |B| + 1, return (k-|B|-1)-th smallest in C
        (the |B| elements of B and p itself all rank below C).

  - Runtime depends on choice of pivot, like quicksort:
      . worst-case degenerates to Theta(n^2);
      . average-case is Theta(n); picking the pivot at random yields a
        randomized algorithm with expected runtime Theta(n) on every
        input.

  - Clever strategy can be used to get deterministic algorithm with
    worst-case performance Theta(n).

Closest points.

  - Given points p_1 = (x_1,y_1), ..., p_n = (x_n,y_n), find a pair i,j
    such that d(p_i,p_j) is minimal (d(p,q) = distance).

  - Assumption: No two points with same x or y coordinate.
    (Can be eliminated but makes presentation simpler.)

  - Idea: Divide points into two halves, find closest pairs in each
    half, find closest pair across the two halves, and return closest
    overall.

  - Details: The input will consist of
      . P = set of points,
      . P_x = list of points sorted by x-coordinate,
      . P_y = list of points sorted by y-coordinate.

  - Divide: Split P horizontally (along x axis) into
      . Q = leftmost n/2 points,
      . R = rightmost n/2 points,
      . (lists Q_x, Q_y, R_x, R_y can be computed from P_x, P_y in
        linear time).

  - Recurse: Recursively find
      . q_0, q_1: closest points in Q,
      . r_0, r_1: closest points in R.

  - Combine: Let d = min( d(q_0,q_1), d(r_0,r_1) ).
    Need to determine if there are points q in Q, r in R with
    d(q,r) < d (in which case q,r are closest in P) or not (in which
    case q_0,q_1 or r_0,r_1 are closest in P).

    Consider any vertical line L that "splits" Q and R (i.e., with
    x-coordinate between rightmost point in Q and leftmost point in R).
    If there are points q in Q, r in R with d(q,r) < d, then q,r both
    lie within distance d of L (because d(q,r) < d means horizontal
    q-r distance also < d and so is distance to L).

    Fix line L and let S = points in P within distance d of L.
    S_x and S_y can be constructed in linear time from P_x, P_y.

    Fact: If points q in Q, r in R have d(q,r) < d, then q, r appear
    within 15 positions of each other in S_y!
    Proof: page 229 in textbook.

  - Algorithm:
      . construct P_x, P_y (time Theta(n log n))
      . (p_0,p_1) := ClosestPairRec(P_x, P_y)

    ClosestPairRec(P_x, P_y):
        if |P| <= 3:
            find closest points by brute force
        else:
            construct Q_x, Q_y, R_x, R_y (time Theta(n))
            (q_0,q_1) := ClosestPairRec(Q_x, Q_y)
            (r_0,r_1) := ClosestPairRec(R_x, R_y)
            d := min{ d(q_0,q_1), d(r_0,r_1) }
            m := average of rightmost x-coordinate in Q and
                 leftmost x-coordinate in R
            S := points in P with x-coordinate within distance d of m
            construct S_x, S_y (time Theta(n))
            for each s in S_y, compute distance to next 15 points in
                S_y and let (s_0,s_1) be closest pair found
            return closest of (q_0,q_1) or (r_0,r_1) or (s_0,s_1)

-----------------------
Network Flow Algorithms
-----------------------

Definition: a "network" is a directed graph N=(V,E) with
  - a "source" s in V with no incoming edge,
  - a "target" t with no outgoing edge (sometimes called "sink"),
  - a nonnegative integer weight (the "capacity") for each edge.
  - Example picture.

Networks can be used to represent, e.g., computer networks (capacity =
bandwidth), electrical networks, etc.
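
To make the definition concrete, here is a minimal Python sketch of one way
to store a network and check the three conditions above. The dict-of-edges
representation, the function name is_network, and the small example network
are assumptions made for illustration; they are not taken from the lecture
or the textbook.

```python
# Assumed representation: a dict mapping each directed edge (u, v)
# to its capacity c(u, v).

def is_network(capacity, s, t):
    """Return True if the edge set satisfies the definition of a network
    with source s and target t."""
    # The source must have no incoming edge.
    no_edge_into_s = all(v != s for (_, v) in capacity)
    # The target must have no outgoing edge.
    no_edge_out_of_t = all(u != t for (u, _) in capacity)
    # Every capacity must be a nonnegative integer.
    caps_ok = all(isinstance(c, int) and c >= 0 for c in capacity.values())
    return no_edge_into_s and no_edge_out_of_t and caps_ok


# Hypothetical example network with two intermediate vertices a and b.
example = {
    ("s", "a"): 3,
    ("s", "b"): 2,
    ("a", "b"): 1,
    ("a", "t"): 2,
    ("b", "t"): 3,
}
```

Keying the dict by edge keeps the capacity lookup c(e) a single dictionary
access; an adjacency-list layout would work equally well.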
Network flow problem: assign flow f(e) for each edge e such that we have
maximum flow in the network, subject to:
  - capacity constraint: 0 <= f(e) <= c(e) (flow does not exceed
    capacity);
  - conservation constraint: for each vertex v != s,t,
    flow into v = flow out of v
    (flow into v = sum_{e in E-(v)} f(e); flow out of v = ...E+...,
    where E-(v) = in-edges, E+(v) = out-edges).
  - Flow for network N = flow out of s = flow into t (by conservation).

Augmenting paths:
  - First idea: path P = s -> ... -> t where f(e) < c(e) for each e.
    Define the "residual capacity" of an edge as
    delta_f(e) = c(e) - f(e), and the residual capacity of the path as
    delta_f(P) = MIN (delta_f(e) for e in P).
    Augment along the path by adding delta_f(P) to all edge flows.
  - Problem: notion too narrow, can get stuck with sub-optimal solution.
    (Example.)
  - Second idea: allow "backward" edges on the path and redefine the
    residual capacity of e as c(e) - f(e) if e is a forward edge on
    the path, and as f(e) if e is a backward edge.
  - Augmenting path = s-t path where each edge has positive residual
    capacity (i.e., c(e)-f(e) > 0 for forward edges e, f(e) > 0 for
    backward edges e). (A backward edge with positive flow represents
    extra flow that can be reassigned to forward edges.)
  - Augmentation: add delta_f(P) (the minimum residual capacity over
    the edges of P, using the redefined residual capacities) to forward
    edges, subtract it from backward edges.
  - Example.

Ford-Fulkerson algorithm:
    start with any flow f (e.g., f(e) = 0 for all e in E)
    while there is an augmenting path P
        augment f using P
    output f
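
The loop above can be sketched in Python as follows. This is a minimal
illustration under stated assumptions, not the lecture's implementation:
the network is assumed to be a dict mapping edges (u, v) to integer
capacities, augmenting paths are found by a plain depth-first search (the
summary does not say how to find them), and the names
find_augmenting_path and ford_fulkerson are hypothetical.

```python
def find_augmenting_path(capacity, flow, s, t):
    """Search for an s-t path whose edges all have positive residual
    capacity. Returns a list of (edge, is_forward) pairs, or None."""
    came_from = {s: None}  # vertex -> (previous vertex, edge, is_forward)
    stack = [s]
    while stack:
        u = stack.pop()
        if u == t:
            break
        for (x, y), c in capacity.items():
            if x == u and y not in came_from and c - flow[(x, y)] > 0:
                came_from[y] = (u, (x, y), True)   # forward edge
                stack.append(y)
            elif y == u and x not in came_from and flow[(x, y)] > 0:
                came_from[x] = (u, (x, y), False)  # backward edge
                stack.append(x)
    if t not in came_from:
        return None
    path = []
    v = t
    while came_from[v] is not None:
        prev, edge, is_forward = came_from[v]
        path.append((edge, is_forward))
        v = prev
    path.reverse()
    return path


def ford_fulkerson(capacity, s, t, flow=None):
    """Augment until no augmenting path remains.
    Returns (flow out of s, flow on each edge)."""
    # Start with the given flow, or f(e) = 0 for all e.
    flow = dict(flow) if flow is not None else {e: 0 for e in capacity}
    while True:
        path = find_augmenting_path(capacity, flow, s, t)
        if path is None:
            break
        # delta_f(P): minimum residual capacity along the path.
        delta = min(capacity[e] - flow[e] if fwd else flow[e]
                    for e, fwd in path)
        for e, fwd in path:
            flow[e] += delta if fwd else -delta
    value = sum(f for (u, _), f in flow.items() if u == s)
    return value, flow


# Hypothetical example: every edge has capacity 1.
cap = {("s", "a"): 1, ("s", "b"): 1, ("a", "b"): 1,
       ("a", "t"): 1, ("b", "t"): 1}
value, f = ford_fulkerson(cap, "s", "t")
```

Starting instead from a flow of 1 along s -> a -> b -> t leaves no path
made of forward edges only; the search then has to use (a, b) as a
backward edge, which is the situation the second idea is meant to handle.
Because capacities are integers, each augmentation raises the flow by at
least 1, so the loop terminates.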