TUTORIAL 12: Decision trees and lower bounds ============================================ This week the tutorial is about *problem* (not algo...) complexity and the Decision Tree Model. In class, I illustrated this by explaining: (a) how a comparison-based sorting algorithm can be modelled with the decision tree model, and (b) how to use this fact to derive the Omega(n log n)-comparisons lower bound for the sorting problem (for comparison-based sorting algorithms). You will: A) illustrate the use of the decision tree model to derive lower bounds for other problems than sorting (for comparison-based algos) B) show that sometimes the decision tree technique to determine a lower bound is not powerful enough to give you a *good* lower bound by looking at search on an *unsorted* array of n distinct elements. Decision tree example: --------------------- Draw the decision tree for binary search in a sorted array of size 5. (Assume that we search for some arbitrary key "key".) The algorithm for binary search is: low = 1 high = size(A) repeat mid = floor((high-low)/2) + low if A[mid] = key return mid else if A[mid] < key low = mid + 1 else if A[mid] > key high = mid -1 until (high < low) return "not found" The decision tree is as follows: (Let the students figure it out.) SPEND A LOT OF TIME ON THIS EXAMPLE! (It's hard to draw the tree in text, so here's some explanation of the tree: -rectangular boxes should be around comparisons (i.e., key:A[i]) -the high, low values are outside the nodes, and are only included to make computing the next level of the tree easier -each edge in the tree has one of the following labels on it: key=A[i], keyA[i] -return values should have ovals around them ) key:A[3] (high=5;low=1) / | \ key=A[3] / keyA[3] / | \ return 3 key:A[1] key:A[4] (high=2;low=1) (high=5;low=4) / | \ / | \ key=A[1] / keyA[1] key=A[4] | keyA[4] / | | | | \ return 1 return key:A[2] return 4 return key:A[5] "not found" (high=2; "not found" (high=5; low=2) low 4) / | \ / | \ key=A[2] / keyA[2] key=A[5] / keyA[5] Things to notice: 1) some leaves are duplicates (i.e., return "not found") 2) there are n+1 unique leaves (we can return each of the 5 keys, and "not found" if the value is not in the array 3) Each non-leaf node has three children Lower bound for comparison-based search algorithms ------------------------------------------------- For ANY comparison-based search algorithm A, we can prove a lower bound of \Omega(log n) as follows: the number of distinct outcomes is n+1 in the worst-case (i.e., when all values in the array are unique) -> the corresponding decision tree T_A has at least n+1 leafs there are at most three outcomes at any tree level (=, <, >). -> decision tree has height at least log_3 n (since a ternary tree of height h has at most 3^h leaves) Since log_3 n \in \Omega(log n), we have a lower bound of \Omega(log n). Aside #1: To show log_3 n \in \Omega(log n): let k= log_3 n <- take this to the power of 3 3^k = n <- take the log_2 of each side k log_2 3 = log_2 n <- substitute k= log_3 n and divide by log_2 3 log_3 n = log_2 n / log_2 3 -> log_3 n \in \Theta (log_2 n) Aside #2: For comparison-based algorithms, we can have the following outcomes (or labels on the branches): (<= , >) or (<, =>) or (<=, =>) or (<, =, >) or (=, \not =) In each case, there are at most 3 possible outcomes Lower bound for searching on an unsorted array ---------------------------------------------- Sometimes the decision tree technique to determine a lower bound is not powerful enough to give a good lower bound: In particular, consider any algorithm for searching on an unsorted array. The decision tree for an unsorted array has -n+1 distinct leafs -3 outcomes at each level (<, =, >) [not just (=, \neq) because we want our bound to apply to algorithms that compare values of elements too] -> the tree has height log_3 n -> we have a lower bound of \Omega(log n) BUT this lower bound is too low! Search on an unsorted array actually has a lower bound of \Omega(n) (Intuitively, if we only look at n-1 elements, we can't be sure if an element is not in the array or if it is the element we haven't looked at. This is a simple "adversary" argument.)