===========================================================================
CSC 373H                Lecture Summary for Week  7             Winter 2006
===========================================================================

------------------
Divide and Conquer
------------------

Integer multiplication.

  - Problem:  Multiply two integers x, y, given as sequences of bits
    x_0,x_1,...,x_{n-1}; y_0,y_1,...,y_{n-1} (low-order bit first, i.e.,
    x = x_{n-1} ... x_1 x_0 in binary, and similarly for y).

  - Iterative algorithm:  Multiply x by each bit of y, shifted
    appropriately, then add the n results to each other.
    Runtime = Theta(n^2) (n additions of up to 2n bits each).

  - Idea:  Let X_0 = x_{n/2-1} ... x_1 x_0 and X_1 = x_{n-1} ... x_{n/2},
    in binary (n can always be made even by padding x with additional 0's
    on the left); define Y_0 and Y_1 similarly.
    Then, x = 2^{n/2} X_1 + X_0 and y = 2^{n/2} Y_1 + Y_0, and we can write
    Then, x = 2^{n/2} X_1 + X_0 and y = 2^{n/2} Y_1 + Y_0, and we can write

        x y = 2^n X_1 Y_1 + 2^{n/2} X_1 Y_0 + 2^{n/2} X_0 Y_1 + X_0 Y_0

    How does this help?  Original problem (compute x y) reduced to four
    subproblems of half size (compute X_1 Y_1, X_1 Y_0, X_0 Y_1, X_0 Y_0),
    together with some "shift" operations (multiplication by power of 2)
    and binary additions.

    This yields the following recursive algorithm directly:
        Multiply(x, y): // x, y are arrays of size n
            if n = 1:
                return x * y // multiplication of 1-bit numbers
            else:
                set arrays X1, X0, Y1, Y0
                p1 := Multiply(X1, Y1)
                p2 := Multiply(X1, Y0)
                p3 := Multiply(X0, Y1)
                p4 := Multiply(X0, Y0)
                return 2^n p1 + 2^{n/2} p2 + 2^{n/2} p3 + p4
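    The four-call scheme above can be sketched as runnable Python, using
    Python ints in place of explicit bit arrays (an illustration; the name
    multiply and the power-of-two assumption on n are ours, not part of the
    notes):

```python
def multiply(x, y, n):
    """Four-call divide-and-conquer multiply of two n-bit ints (n a power of 2)."""
    if n == 1:
        return x * y  # base case: multiplication of 1-bit numbers
    half = n // 2
    mask = (1 << half) - 1
    x1, x0 = x >> half, x & mask  # x = 2^{n/2} X1 + X0
    y1, y0 = y >> half, y & mask  # y = 2^{n/2} Y1 + Y0
    p1 = multiply(x1, y1, half)
    p2 = multiply(x1, y0, half)
    p3 = multiply(x0, y1, half)
    p4 = multiply(x0, y0, half)
    # x y = 2^n X1 Y1 + 2^{n/2} (X1 Y0 + X0 Y1) + X0 Y0
    return (p1 << n) + ((p2 + p3) << half) + p4
```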

    Runtime?  Recursive algorithm yields recurrence relation for worst-case
    runtime T(n):

        T(1) = Theta(1)
        T(n) = 4 T(n/2) + Theta(n)

    where 4 T(n/2) comes from the time spent executing the four recursive
    calls, and Theta(n) comes from the time spent performing shifts and
    binary additions.

    Closed form?

Master Theorem.

  - Let f be any nondecreasing function that satisfies the following
    recurrence, for constants a (integer, a > 0), b (rational, b > 1),
    d (real, d >= 0), and k (integer, k >= 1):

        f(n) = Theta(1)               if n <= k,
        f(n) = a f(n/b) + Theta(n^d)  if n > k.

    For example, f(n) could represent the runtime of a recursive algorithm
    that makes "a" recursive calls, each one to an input of size roughly
    n/b (ignoring floors and ceilings), in addition to taking time
    Theta(n^d) to perform work outside of the recursive calls.
    Then, f(n) has the following closed-form asymptotic solution:

        f(n) = Theta(n^{log_b a})  if a > b^d,
        f(n) = Theta(n^d log n)    if a = b^d,
        f(n) = Theta(n^d)          if a < b^d.
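    The case analysis can be checked mechanically with a small helper
    (purely illustrative; the name and output format are ours):

```python
import math

def master_theorem(a, b, d):
    """Classify f(n) = a f(n/b) + Theta(n^d) by the three Master Theorem cases."""
    if a > b ** d:
        return "Theta(n^%.2f)" % math.log(a, b)  # exponent is log_b a
    elif a == b ** d:
        return "Theta(n^%g log n)" % d
    else:
        return "Theta(n^%g)" % d
```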

Integer multiplication (continued).

  - Master Theorem applies to T(n), with a = 4, b = 2, d = 1.  Since
    a = 4 > 2 = b^d, we have T(n) = Theta(n^{log_2 4}) = Theta(n^2).

    This is no better than simple iterative algorithm!

  - Trick: using same notation as before, notice that
    (X_1 + X_0) (Y_1 + Y_0) = X_1 Y_1 + X_1 Y_0 + X_0 Y_1 + X_0 Y_0.
    This is almost correct expression, except for shifts, and it involves
    only 1 multiplication instead of 4.  Because terms X_1 Y_0 and X_0 Y_1
    shift by same amount, we can use this to save one recursive call:

       x y = 2^n X_1 Y_1 + X_0 Y_0
               + 2^{n/2} ( (X_1 + X_0) (Y_1 + Y_0) - X_1 Y_1 - X_0 Y_0 )

  - This yields following recursive algorithm:

        Multiply2(x, y):
            if n = 1:
                return x * y // multiplication of 1-bit numbers
            else:
                set arrays X1, X0, Y1, Y0
                p1 := Multiply2(X1, Y1)
                p2 := Multiply2(X1 + X0, Y1 + Y0)
                p3 := Multiply2(X0, Y0)
                return 2^n p1 + 2^{n/2} (p2 - p1 - p3) + p3
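    A runnable Python sketch of this algorithm (on ints rather than bit
    arrays; we widen the base case to n <= 2 because the sums X1 + X0 and
    Y1 + Y0 can be one bit wider than n/2, and we shift by 2*(n/2) so odd
    intermediate sizes also work):

```python
def multiply2(x, y, n):
    """Three-call (Karatsuba-style) multiply of two n-bit ints."""
    if n <= 2:
        return x * y  # small base case: exact machine multiply
    half = n // 2
    mask = (1 << half) - 1
    x1, x0 = x >> half, x & mask  # x = 2^{half} X1 + X0
    y1, y0 = y >> half, y & mask
    p1 = multiply2(x1, y1, half)
    p2 = multiply2(x1 + x0, y1 + y0, half + 1)  # sums may carry one extra bit
    p3 = multiply2(x0, y0, half)
    # x y = 2^{2*half} X1 Y1 + 2^{half} ((X1+X0)(Y1+Y0) - X1 Y1 - X0 Y0) + X0 Y0
    return (p1 << (2 * half)) + ((p2 - p1 - p3) << half) + p3
```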

    with runtime T'(n) that satisfies:

        T'(1) = Theta(1)
        T'(n) = 3 T'(n/2) + Theta(n)

    The constant hidden by the term Theta(n) is larger than for the first
    recursive algorithm (we perform more binary additions), but the Master
    Theorem still applies with a = 3, b = 2, d = 1, which yields
    T'(n) = Theta(n^{log_2 3}) = Theta(n^{1.58...}).

    This is strictly better than previous Theta(n^2)!

  - Picture of savings from 4 T(n/2) to 3 T(n/2).

  - The asymptotically fastest algorithm in wide use is the
    Schönhage-Strassen algorithm -- a more complicated divide-and-conquer
    algorithm based on the "Fast Fourier Transform" (FFT) -- with runtime
    Theta(n log n log log n).


Merge sort.

  - Given array A:
     1. split into two halves B,C;
     2. sort B,C recursively;
     3. merge B,C back into A (in linear time).

  - Runtime:
        T(1) = Theta(1)
        T(n) = 2 T(n/2) + Theta(n)
    Closed-form: T(n) = Theta(n log n) (a = 2 = 2^1 = b^d).
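    A direct Python rendering of the three steps (a sketch that returns a
    new sorted list rather than merging back into A in place):

```python
def merge_sort(a):
    """Return a sorted copy of a: split, recurse on halves, merge in linear time."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    b = merge_sort(a[:mid])  # steps 1-2: split and sort halves
    c = merge_sort(a[mid:])
    merged, i, j = [], 0, 0
    while i < len(b) and j < len(c):  # step 3: merge, Theta(n)
        if b[i] <= c[j]:
            merged.append(b[i])
            i += 1
        else:
            merged.append(c[j])
            j += 1
    return merged + b[i:] + c[j:]  # append whichever half has leftovers
```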

Quick sort.

  - Given array A:
     1. pick "pivot" element p;
     2. partition A into B = [elements < p] and C = [elements > p],
        in place (i.e., so that A = [B,p,C]);
     3. sort B,C recursively, in place.

  - Runtime depends on pivot picked at each step:
      . worst-case degenerates to Theta(n^2);
      . average-case is Theta(n log n).
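    A Python sketch using the last element of each range as the pivot (one
    simple choice among many; elements equal to the pivot end up in C):

```python
def quick_sort(a, lo=0, hi=None):
    """Sort a[lo..hi] in place; pivot is the last element of the range."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    p = a[hi]                  # step 1: pick pivot
    i = lo                     # boundary of the "< p" region
    for j in range(lo, hi):    # step 2: partition so that A = [B, p, C]
        if a[j] < p:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]  # place pivot between B and C
    quick_sort(a, lo, i - 1)   # step 3: sort B in place
    quick_sort(a, i + 1, hi)   # ... and C in place
```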

Counting inversions.

  - Given a_1,...,a_n, a permutation of [1,...,n], count the number of
    inversions (pairs i < j with a_i > a_j).

  - Application: comparative ranking.

  - Solution 1: Consider each pair.
    Runtime: Theta(n^2).

  - Solution 2: (divide and conquer)
     1. split input permutation A into halves B,C;
     2. count inversions in B,C recursively;
     3. count inversions between B,C, i.e., number of pairs of elements
        b in B, c in C such that b > c;
     4. return total number of inversions.

  - Problem: step 3 takes time Theta(n^2).

  - Solution 2': (make it possible to do step 3 in linear time)
     1. split input permutation A into halves B,C;
     2. SORT and count inversions in B,C recursively;
     3. MERGE and count inversions between B,C, i.e.,
        at each step of merge, comparing next element b in B, c in C:
          . if b < c then b < all remaining elements in C,
            so no more inversions involve b
          . if c < b then c < all remaining elements in B,
            so add remaining number of elements in B to inversion count
        (note: number of inversions between B,C not affected by sorting
        within each of B,C because all pairs b in B, c in C such that b > c
        remain the same);
     4. return total number of inversions.

  - Extra sorting makes it possible to count inversions between B,C in
    linear time (at the same time as merge step), and total time same as
    Mergesort: Theta(n log n).
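    Solution 2' rendered as a Python sketch, returning both the sorted
    copy and the inversion count:

```python
def sort_and_count(a):
    """Return (sorted copy of a, number of inversions in a)."""
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    b, inv_b = sort_and_count(a[:mid])  # step 2: sort + count within halves
    c, inv_c = sort_and_count(a[mid:])
    merged, i, j, cross = [], 0, 0, 0
    while i < len(b) and j < len(c):    # step 3: merge + count cross inversions
        if b[i] <= c[j]:
            merged.append(b[i])         # b[i] <= all remaining in C: no new inversions
            i += 1
        else:
            cross += len(b) - i         # c[j] < all remaining elements of B
            merged.append(c[j])
            j += 1
    return merged + b[i:] + c[j:], inv_b + inv_c + cross
```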