===========================================================================
  CSC 373H             Lecture Summary for Week 13             Winter 2006
===========================================================================

------------------------
Approximation algorithms
------------------------

Weighted Set Cover: See section 11.3 in text.

- ASCII notation: "\/" for set union, "/\" for set intersection.

- Input: U (universe of elements), subsets S_1,...,S_m of U with
  nonnegative integer weights w_1,w_2,...,w_m (one weight per subset).
  Output: Cover C subset of {1,2,...,m} such that \/_{i in C} S_i = U
  and SUM_{i in C} w_i is minimum.

- Example: U = {a,b,c,d,e},
    S_1 = {a,b},   w_1 = 2,     S_2 = {a,c,d}, w_2 = 5,
    S_3 = {b,e},   w_3 = 1,     S_4 = {c,d},   w_4 = 2.
  C = {1,2} is NOT a cover because S_1 \/ S_2 != U (it misses e).
  C = {2,3} is a cover of weight w_2 + w_3 = 5+1 = 6.
  C = {1,3,4} is a cover of weight 2+1+2 = 5, which is minimum.

- Greedy algorithm (a runnable sketch appears at the end of this
  section):
      // Select sets one by one, trying to minimize weight and
      // maximize the number of new elements covered at the same time.
      C := {}   // cover
      R := U    // remaining elements (i.e., not yet covered)
      while R != {}:
          pick i such that w_i / |S_i /\ R| is minimal
              // this minimizes "weight per new element covered"
          C := C \/ {i}
          R := R - S_i
      return C

- Analysis:
  . After picking i in the main loop, for each s in S_i /\ R, let
    c_s = w_i / |S_i /\ R| (c_s is the "cost paid to cover s", used
    only in the analysis).
  . By definition, each element covered during the algorithm is
    accounted for by exactly one c_s, so
      (11.9)  SUM_{i in C} w_i = SUM_{s in U} c_s.
  . But the greedy algorithm might "overpay" for some sets, i.e.,
    SUM_{s in S_k} c_s > w_k; can we bound how much greater this can be?
  . (11.10) For all S_k, SUM_{s in S_k} c_s <= H(|S_k|) w_k
    (where H(n) = 1 + 1/2 + ... + 1/n = Theta(log n)).
    Proof: Let d = |S_k| and S_k = {s_1, s_2, ..., s_d}, in order of
    coverage by the algorithm.
    * when s_1 is first covered, the set used is at least as good as
      S_k, so c_{s_1} <= w_k/d (cost per element for S_k)
    * when s_2 is first covered, the set used is at least as good as
      S_k, so c_{s_2} <= w_k/(d-1) (cost/elem for S_k - {s_1})
    * ...
    * when s_j is first covered, the set used is at least as good as
      S_k, so c_{s_j} <= w_k/(d-j+1)
      (cost/elem for S_k - {s_1,...,s_{j-1}})
    * ...
    Total: SUM_{s in S_k} c_s <= w_k/d + w_k/(d-1) + ... + w_k/1
                               = H(d) w_k.
  . Let d* = MAX_{i=1..m} |S_i| (max size of the S_i's), C* = an
    optimum set cover, and w* = SUM_{i in C*} w_i = optimum weight.
  . (11.11) SUM_{i in C} w_i <= H(d*) w* = O(log n) w*
    (since d* <= n).
    Proof:
    * By (11.10), for each i in C*,
        w_i >= (1/H(|S_i|)) SUM_{s in S_i} c_s
            >= (1/H(d*)) SUM_{s in S_i} c_s.
    * C* is a cover, so
        SUM_{i in C*} SUM_{s in S_i} c_s >= SUM_{s in U} c_s.
    * Hence,
        w* =  SUM_{i in C*} w_i
           >= SUM_{i in C*} (1/H(d*)) SUM_{s in S_i} c_s
           >= (1/H(d*)) SUM_{s in U} c_s
           =  (1/H(d*)) SUM_{i in C} w_i      (by 11.9)
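- A runnable Python sketch of the greedy algorithm above (illustrative
  only: the function name and the (set, weight)-list input format are
  not from the notes; sets are 0-indexed here, unlike S_1..S_m above):

      def greedy_set_cover(universe, sets):
          # sets: list of (frozenset, weight) pairs.  Assumes the sets
          # together cover universe.  Returns list of chosen indices.
          C = []                # indices of chosen sets
          R = set(universe)     # elements not yet covered
          while R:
              # pick the set minimizing weight per newly covered element
              best = min((i for i, (S, w) in enumerate(sets) if S & R),
                         key=lambda i: sets[i][1] / len(sets[i][0] & R))
              C.append(best)
              R -= sets[best][0]
          return C

      # The example above, 0-indexed; here greedy happens to find the
      # minimum-weight cover:
      sets = [(frozenset("ab"), 2), (frozenset("acd"), 5),
              (frozenset("be"), 1), (frozenset("cd"), 2)]
      print(greedy_set_cover("abcde", sets))  # [2, 3, 0], weight 1+2+2 = 5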
---------------------
Randomized algorithms
---------------------

Randomization and probabilistic analysis appear in two distinct ways:
- random input: do "average-case" analysis to study behaviour on
  typical inputs
- randomized algorithm: use randomization to make random decisions
  while processing the input, e.g., randomized quicksort

Some basic probability rules: Let A, B be events.
  P(not A) = 1 - P(A).
  P(A or B) <= P(A) + P(B).
  If A and B are independent, then P(A and B) = P(A) P(B).
  If A implies B, then P(A) <= P(B).

Contention Resolution (textbook section 13.1)
- We have n processes, P_1, ..., P_n, trying to access one resource
  (e.g., they may all be trying to modify a common database).
  We divide time into discrete intervals (rounds).
  If two or more processes try to access the resource simultaneously,
  all of them get locked out during that round.
- So trying to access as often as possible doesn't work.
- If the processes have a way of communicating with each other, they
  can easily ensure that everyone waits at most n rounds before
  getting access.
- But what if they cannot communicate? The strategy: choose some
  probability p > 0. In each round, each process attempts to access
  the resource with probability p.
  - randomization breaks the symmetry in the problem
- What value of p should we choose? What happens if p is too high
  (close to 1)? What happens if p is too low (close to 0)? How should
  p depend on n?
- Success probability: the probability that the first (or any other
  fixed) process succeeds in round 1.
  - We need the first process to attempt access while all others stay
    quiet:
      P_success = p (1-p)^{n-1}
  - Using calculus, this probability is maximized when p = 1/n.
  - Let p = 1/n.  Then
      P_success = p (1-p)^{n-1} = (1/n) (1 - 1/n)^{n-1}.
    As n grows, (1 - 1/n)^{n-1} approaches 1/e, so P_success approaches
    (1/n)(1/e), which is in Theta(1/n).  Denote this probability by
    Ps(n).  Moreover, Ps(n) > 1/(n e).  Up to a constant, this is the
    best we can hope for.
- Waiting for one process to finish:
  What is the probability that some fixed process is not able to write
  after t rounds?  The probability of failing in one round is
    1 - Ps(n) < 1 - 1/(n e),
  so the probability of failing in all of t rounds is
    (1 - Ps(n))^t < (1 - 1/(n e))^t
                  = ((1 - 1/(n e))^{n e})^{t/(n e)}
                  < (1/e)^{t/(n e)}.
  If t in Theta(n), this bounds the failure probability by a constant
  less than 1.  If t in Theta(n log n), say t = c e n ln n, the bound
  becomes (1/e)^{c ln n} = 1/n^c.
- Waiting for all processes to finish:
  Use the union bound (equation 13.2 in the textbook) over all n
  processes.  We get that Theta(n log n) rounds suffice with very high
  probability.  (A small simulation of the protocol appears below.)
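- A small Python simulation of the protocol (illustrative only: the
  function name, parameters, and the choice that a process stops
  contending once it has had access are mine, not from the notes):

      import random

      def contention_resolution(n, max_rounds=10**6):
          # Simulate n processes, each attempting access with
          # probability p = 1/n in every round; a process succeeds
          # exactly when it is the only one attempting.  Returns the
          # round in which the last process succeeds (None on timeout).
          p = 1.0 / n
          pending = set(range(n))   # processes not yet successful
          for t in range(1, max_rounds + 1):
              attempts = [i for i in pending if random.random() < p]
              if len(attempts) == 1:
                  pending.remove(attempts[0])   # lone attempt: success
              if not pending:
                  return t
          return None

      # Typical runs finish within O(n log n) rounds, matching the
      # analysis above:
      print(contention_resolution(50))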
Primality Testing
- A number n is prime if its only divisors are 1 and n.
  E.g., 3, 7, 23 are prime, and 91 = 7 x 13 is not.
- Large prime numbers are needed in the execution of many cryptographic
  protocols.  E.g., the RSA protocol starts with two large primes p and
  q and relies on the hope that n = pq is hard to factor.
- We want to produce large (100s of digits) prime numbers at random,
  fast.
- Possible way: pick a big random number n and test whether n is prime.
  By theorems on the density of prime numbers, we expect to need
  O(log n) trials before finding a prime.
- We need a quick procedure to test whether n is prime.
- Algorithm 1: For all k between 2 and sqrt(n), check whether k
  divides n.  If YES for some k, n is COMPOSITE.  If NO for all k,
  n is PRIME.
  - Is this algorithm useful?  No: sqrt(n) is about 2^{(log n)/2},
    i.e., exponential in the number of digits of n.
- A property of prime numbers (Fermat's Little Theorem):
  Let p be a prime and a be any number not divisible by p.  Then
    a^{p-1} = 1 (mod p).
  E.g., 2^6 = 64 = 63 + 1 = 1 (mod 7).
  Not true for non-primes in general:
    2^14 = 16384 = 16380 + 4 = 4 (mod 15).
  Sometimes true anyway (such n are called pseudoprimes to base a):
    14^14 = (-1)^14 = 1 (mod 15).
  Some composite numbers n satisfy a^{n-1} = 1 (mod n) for ALL a with
  gcd(a,n) = 1 (Carmichael numbers), e.g., 561 = 3*11*17.
- We can use the property to test whether n is prime (the Fermat test):
    Choose a random a, 1 < a < n.
    Compute b = a^{n-1} (mod n).
    If b = 1, output PRIME.
    If not, output COMPOSITE.
- What is the time required to compute b?  Using repeated squaring,
  O(log n) multiplications mod n, i.e., polynomial in the number of
  digits of n.
- The algorithm could be wrong!
  If n is prime, we always output PRIME.
  If n is composite, we might output either PRIME or COMPOSITE.
- Rabin-Miller Strong Pseudoprime Test: uses a similar but stronger
  property.
  Theorem: the algorithm misclassifies a composite number with
  probability <= 1/4.
- Improving performance: what is the probability that the answer is
  wrong after 2 independent rounds?
  The probability of failure is < (1/4)^2 = 1/16.
- What is the probability of giving the wrong answer after k rounds?
  If the output in any round is COMPOSITE, output COMPOSITE.
  If all rounds output PRIME, output PRIME.
  If n is prime, we always succeed.  If not, the probability of
  failure is < (1/4)^k.  Take k = 0.5 log_2 n rounds for a failure
  probability of at most 1/n, since (1/4)^{0.5 log_2 n} = 1/n.
  (A sketch of such a repeated test appears below.)
- Deterministic primality testing:
  Recently, Agrawal, Kayal and Saxena [2002] found a deterministic
  algorithm for primality testing running in time polynomial in log n.
  - More complicated than the simple tests seen above.
  - Thus PRIMES is in P.
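- A minimal Python sketch of the repeated test, using the simple
  Fermat test from these notes rather than Rabin-Miller (the function
  name is made up; recall that Carmichael numbers fool the Fermat test
  for every base coprime to n, so a robust implementation would use
  Rabin-Miller instead):

      import random

      def fermat_test(n, k):
          # k independent rounds of the Fermat test.  Assumes n > 3.
          for _ in range(k):
              a = random.randrange(2, n)   # random a with 1 < a < n
              # pow(a, n-1, n) computes a^(n-1) mod n by repeated
              # squaring: O(log n) multiplications mod n
              if pow(a, n - 1, n) != 1:
                  return "COMPOSITE"       # witness found
          return "PRIME"                   # all k rounds passed

      print(fermat_test(91, 10))   # 91 = 7*13: almost surely COMPOSITE
      print(fermat_test(97, 10))   # 97 is prime: always PRIME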