===========================================================================
  CSC 373H             Lecture Summary for Week 3              Winter 2006
===========================================================================

MSTs (cont'd).

- Recall the red-blue rules algorithm for MST. To finish the proof, we
  still need to show:
  - All edges in the graph are coloured.
    Suppose the method "stops early" (i.e., there is an uncoloured edge e
    but no rule can be applied). By the colour invariant (CI), the blue
    edges form a forest of blue trees (some trees might just be isolated
    vertices).
    If both ends of e are in the same blue tree, the red rule applies to
    the cycle that would be formed by adding e, contradiction.
    If the ends of e are in different blue trees, say T_1 and T_2, the
    blue rule applies to the cut (T_1, V-T_1), contradiction.
    Thus if any uncoloured edge remains, some rule must be applicable.

Remarks about greedy algorithms.

- General form of problem:
  . Input: set of "candidates" C_1, C_2, ..., C_n, each one with a
    weight (or cost) w(C_i).
  . Output: subset of candidates S subset of {C_1, C_2, ..., C_n} that
    satisfies certain constraints and such that the total weight of S is
    maximal (or minimal).

- General form of algorithm:

      sort candidates by weight
      S := {}
      for i := 1 to n:
          if S U {C_i} satisfies constraints:
              S := S U {C_i}
      return S

- General form of correctness proof:
  . Partial solution S_i is "promising" if it can be extended to an
    optimal solution, i.e., if there exists S^opt_i such that
    S_i subset of S^opt_i and S^opt_i subset of S_i U {C_{i+1}, ..., C_n}.
  . Prove by induction that S_i is promising for 0 <= i <= n.
  . In the proof, use an "exchange lemma": if S_i is promising (with
    optimal solution S^opt_i) and S_{i+1} = S_i U {C_{i+1}}, then there
    exists an optimal solution S^opt_{i+1} that extends S_{i+1}.

- Not all greedy algorithms fit this pattern exactly, but most are close.

Knapsack problems.

- General problem: given a set of items, each with a weight w_i and a
  value v_i, together with a fixed maximum capacity C (all numbers are
  positive integers), find a subset of items of maximal value whose
  total weight does not exceed the capacity.

- Fractional knapsack problem:
  . Output: fractions a_1, a_2, ..., a_n in [0,1] (the amount of each
    item to take) such that SUM a_i w_i <= C and SUM a_i v_i is maximal.
  . Greedy algorithms:
    . Largest weight first doesn't work. Counter-example (items listed
      as (value, weight) pairs): C = 100, items = (100,100), (50,10),
      (50,10), ..., (50,10). Greedy takes only the weight-100 item, for
      value 100, whereas the (50,10) items yield value 50 for every 10
      units of capacity used.
    . Largest value/weight first works: take as much of each item as
      possible until the capacity is filled (a code sketch follows this
      section).
      Proof: exercise (think of the algorithm as choosing the fraction
      of each item in turn, in order of value/weight).

- 0-1 knapsack problem:
  . Cannot break up items, so the output becomes a subset S of
    {1,2,...,n} s.t. SUM_{i in S} w_i <= C and SUM_{i in S} v_i is
    maximal.
  . If the value/weight ratio is constant, greedy by value is not
    guaranteed to produce an optimal answer but gives approximation
    ratio 2 (the knapsack is guaranteed to end up at least 1/2 full).
  . If the value/weight ratio is not constant, no greedy strategy works.
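To make the largest value/weight rule concrete, here is a minimal Python
sketch of the fractional-knapsack greedy described above. The function
name fractional_knapsack, the (value, weight) pair representation, and
the choice of ten (50,10) items in the test (the notes elide the count)
are illustrative assumptions, not part of the lecture.

    def fractional_knapsack(items, capacity):
        """Greedy by value/weight ratio for the fractional knapsack.
        items is a list of (value, weight) pairs; returns the maximal
        total value when items may be taken fractionally."""
        # Consider items in decreasing order of value per unit weight.
        items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
        total = 0.0
        for value, weight in items:
            if capacity <= 0:
                break
            # Take as much of this item as still fits (maybe a fraction).
            fraction = min(1.0, capacity / weight)
            total += fraction * value
            capacity -= fraction * weight
        return total

    # Counter-example data from above, assuming ten (50,10) items:
    # greedy by ratio fills the knapsack with them for value 500.
    print(fractional_knapsack([(100, 100)] + [(50, 10)] * 10, 100))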
-------------------
Dynamic Programming  [Chapter 6]
-------------------

Matrix Chain Multiplication.

- Reminder: matrix multiplication, associativity, complexity.

- Given a matrix chain product A_0 A_1 ... A_{n-1}, there are many ways
  to parenthesize it (e.g., A(BC) or (AB)C). All yield the same answer
  but not the same running time.
  Example:  A is 1x10, B is 10x10, C is 10x100.
      (AB)C = 1*10*10 + 1*10*100 = 100 + 1000 = 1100 ops
      A(BC) = 10*10*100 + 1*10*100 = 10000 + 1000 = 11000 ops

- Matrix Chain Multiplication problem:
      Input:  A_0, A_1, ..., A_{n-1} with dimensions
              [d_0 x d_1], [d_1 x d_2], ..., [d_{n-1} x d_n]
      Output: fully parenthesized product with smallest total cost.

- Brute force algorithm: how many possible ways are there to put in
  parentheses? The answer is a "Catalan number", which is
  Omega(4^n / n^{3/2}).

- Greedy algorithms:
  . Product with smallest cost first.
    Counter-example (dimensions 10 1 10 100):
        greedy: 10*1*10 + 10*10*100 = 10,100
        other:  1*10*100 + 10*1*100 = 2,000
  . Product with smallest dimension last, or with largest dimension
    eliminated first.
    Counter-example (dimensions 1 10 100 1000):
        greedy: 10*100*1000 + 1*10*1000 = 1,010,000
        other:  1*10*100 + 1*100*1000 = 101,000
  . Nothing works!

- Structure of optimal subproblems:
  . Idea: instead of trying to find where to put the first product, try
    to find where to put the last product.
        A_0 (A_1 ... A_{n-1})        -- last product costs d_0 d_1 d_n
        (A_0 A_1) (A_2 ... A_{n-1})  -- last product costs d_0 d_2 d_n
        ...
        (A_0 ... A_{n-2}) A_{n-1}    -- last product costs d_0 d_{n-1} d_n
  . Greedy: take smallest last product? Counter-example: 1 10 100 1000.
  . Only n-1 possibilities. What information would help us find the best
    answer? Knowing the best cost of doing each subproduct.
  . Note that the best overall product must include optimal subproducts.

- Definition of array of subproblem values:
  . N[i,j] = smallest cost of multiplying A_i ... A_j.
  . From the structure of optimal solutions, the best way of doing
    A_i ... A_j (including all parentheses) must have the form
    (A_i ... A_{k-1}) (A_k ... A_j) for some i < k <= j, where each
    subproduct A_i ... A_{k-1} and A_k ... A_j is done in the best way
    possible (otherwise it wouldn't be best overall).

- Array recurrence: from the reasoning above, N[i,i] = 0 and, for i < j,
      N[i,j] = min{ d_i d_k d_{j+1} + N[i,k-1] + N[k,j] : i < k <= j }

- Expressing this as an algorithm? Next time....
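As a preview of how the recurrence might become code, here is one
possible bottom-up Python sketch. The name matrix_chain_cost and the
loop order are assumptions for illustration, not the algorithm the
lecture will present next time.

    def matrix_chain_cost(d):
        """d[0..n] holds the dimensions: matrix A_i is d[i] x d[i+1].
        Returns N[0][n-1], the smallest cost of A_0 ... A_{n-1}."""
        n = len(d) - 1
        # N[i][j] = smallest cost of multiplying A_i ... A_j; N[i][i] = 0.
        N = [[0] * n for _ in range(n)]
        # Fill by increasing chain length j - i, so both subproblems on
        # the right-hand side of the recurrence are already computed.
        for length in range(1, n):
            for i in range(n - length):
                j = i + length
                N[i][j] = min(d[i] * d[k] * d[j + 1] + N[i][k - 1] + N[k][j]
                              for k in range(i + 1, j + 1))
        return N[0][n - 1]

    # Dimensions 1 10 100 1000 from the counter-examples above: the best
    # cost is 101,000, matching the "other" parenthesization.
    print(matrix_chain_cost([1, 10, 100, 1000]))  # 101000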