===========================================================================
          CSC 363H    Lecture Summary for Week 13    Summer 2006
===========================================================================

Log space (continued):

- Example 2: PATH = { <G,s,t> : G is a graph that contains a path from s
  to t }
  No known deterministic log-space algorithm, but there is an easy
  nondeterministic log-space algorithm: store the index of the current
  vertex, start at s, and nondeterministically select the next vertex,
  accepting when t is reached; keep a counter of how many vertices we've
  visited, rejecting if we visit more than n = |V| vertices. This only
  requires room to store one vertex index, O(log n), and one counter from
  1 to n, O(log n).
  Correctness: there is some computation path that accepts iff there is
  some path from s to t in G.

- L subset of NL, but whether NL subset of L is unknown: Savitch's
  Theorem shows NL subset of SPACE((log n)^2), but that's all.

- What about NL and P? Let's study NL-completeness.

  Defn: Language A is "logspace reducible" to language B (written
  A <=L B) if there is a function f : Sigma* -> Sigma* computable using
  a 3-tape TM with a read-only input tape, a write-only output tape and
  a read/write work tape, such that for all w in Sigma*, w in A iff
  f(w) in B, and the TM uses only O(log |w|) cells on the work tape.

- PATH is NL-complete (w.r.t. <=L):
  . PATH in NL
  . for all A in NL, A <=L PATH, using a log-space reduction.
    Idea: The question "does w belong to A" is equivalent to "is there
    a path from the initial configuration to an accepting configuration
    in the configuration graph of the nondeterministic log-space TM
    for A".

- Note: If A in L and B <=L A, then B in L. However, we must be careful:
  the output of a log-space reduction could take up more than log space.
  To get a log-space algorithm for B, we must run the log-space
  algorithm for A and recompute the log-space reduction each time an
  output symbol is needed, keeping only one output symbol at a time on
  the work tape.
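The nondeterministic log-space algorithm for PATH above can be sketched as
follows. This is a simulation only: exploring every nondeterministic guess
takes far more than log space, but the state kept per branch (one current
vertex plus one counter) is exactly what the NL machine stores. The
adjacency-dict representation `adj` is an assumption for illustration.

```python
def nl_path(adj, s, t):
    """Simulate the nondeterministic log-space algorithm for PATH.

    adj maps each vertex to the list of its out-neighbours. Per branch,
    the algorithm keeps only the current vertex and a step counter
    (each O(log n) bits on the NL machine); the simulation explores
    every nondeterministic choice, so it uses more than log space.
    """
    n = len(adj)

    def accepts(current, steps):
        if current == t:      # accept as soon as t is reached
            return True
        if steps >= n:        # more than n vertices visited: reject
            return False
        # nondeterministically guess the next vertex among neighbours
        return any(accepts(nxt, steps + 1) for nxt in adj[current])

    return accepts(s, 0)


# Example graph: 2 -> 0 -> 1 -> 3, so t = 3 is reachable from 0 and 2
g = {0: [1], 1: [3], 2: [0], 3: []}
```

Correctness mirrors the argument in the notes: some branch accepts iff some
path from s to t exists, and the counter bounds every branch's length by n.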
- Since PATH in P, and A <=L B implies A <=p B (a TM running in space
  O(log n) has at most n * 2^O(log n) = O(n^k) possible configurations,
  for some constant k), NL subset of P.

- L = NL?  Unknown!  P = NL?  Unknown!
  However: NL = coNL!  NL =/= PSPACE!

Back to P vs. NP:

- If P =/= NP, then there are problems in NP that are neither in P nor
  NPc, and there are infinitely many intermediate classes between P and
  NPc, with complexity that gets larger and larger. For example, the
  following language:
    GRAPH-ISOMORPHISM = { <G,H> | G and H are two graphs that are
    isomorphic, i.e., there is a one-to-one and onto function f that
    maps vertices of G to vertices of H such that all corresponding
    edges are the same -- (u,v) is an edge of G iff (f(u),f(v)) is an
    edge of H }
  This is clearly in NP (a certificate is the function f, described as
  a list of pairs), but it is not known (or believed) to be in P, and
  it is not known (or believed) to be NP-complete.

-----------------------------
Provably intractable problems
-----------------------------

We've concentrated on P and NP because both are defined in terms of
polynomial time and our interest was in efficient computation. More
importantly, a vast majority of "real-life" problems that arise
naturally from various application domains belong to NP. We have seen
how to prove that problems are NP-complete and why this is evidence
that they have no efficient solution. But are there problems that can
be proved to have no efficient solution unconditionally?

Definitions:
  EXP      = U_{k>=1} TIME(2^{n^k})
           = { languages decided by TMs in time O(2^{n^k}) }
  NEXP     = U_{k>=1} NTIME(2^{n^k})
           = { languages decided by TMs in nondeterministic time
               O(2^{n^k}) }
  EXPSPACE = U_{k>=1} SPACE(2^{n^k})
           = { languages decided by TMs in space O(2^{n^k}) }
By Savitch's Theorem, NEXPSPACE = EXPSPACE.
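The claim above that GRAPH-ISOMORPHISM is in NP rests on the certificate f
being checkable in polynomial time. A minimal sketch of such a verifier,
assuming for illustration that vertices are numbered 0..n-1, edges are
given as pairs, and f is given as a dict (the "list of pairs" certificate):

```python
def verify_isomorphism(n, edges_G, edges_H, f):
    """Polytime check of a GRAPH-ISOMORPHISM certificate.

    Vertices of both graphs are 0..n-1; edges_G and edges_H are lists
    of undirected edges (pairs); f is the claimed bijection as a dict.
    Runs in time polynomial in the input size.
    """
    # f must be a one-to-one and onto map on the vertex set
    if sorted(f) != list(range(n)) or sorted(f.values()) != list(range(n)):
        return False
    EH = {frozenset(e) for e in edges_H}
    # Since f is a bijection, "(u,v) in G iff (f(u),f(v)) in H" is
    # equivalent to: f maps the edge set of G exactly onto that of H.
    return {frozenset({f[u], f[v]}) for (u, v) in edges_G} == EH


# Example: a triangle is isomorphic to any relabelling of itself
triangle = [(0, 1), (1, 2), (0, 2)]
relabel = {0: 1, 1: 2, 2: 0}
```

The check is clearly polytime (a few passes over the edge lists), which is
all that membership in NP requires.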
Known:   L <= NL <= P <= NP <= PSPACE <= EXP <= NEXP <= EXPSPACE
         (using "<=" to represent "subset of")
Unknown: L ?= NL, NL ?= P, P ?= NP, P ?= PSPACE, EXP ?= NEXP,
         EXP ?= EXPSPACE
In other words, we don't know how to prove that nondeterminism makes a
difference, and we don't know how to prove that space is more powerful
than time.

Known:   NL != PSPACE, P != EXP, NP != NEXP, PSPACE != EXPSPACE
In other words, we can prove that exponential gaps make a difference.

Problems that are complete for EXP are known to be not in P, e.g.,
"generalized chess", "generalized checkers". Similarly, problems that
are complete for EXPSPACE require more than polynomial space and are
highly intractable, e.g., "inequivalence of regular expressions with
squaring", "equivalence of regular expressions with exponentiation".

How do we know these results? Because of so-called hierarchy theorems:
for all real constants 1 <= c1 < c2, TIME(n^c1) is a proper subset of
TIME(n^c2), and similarly for SPACE (where 0 <= c1 < c2 suffices).
(See section 9.1 in the textbook.)

But it doesn't stop there! We can define k-EXP, k-NEXP, k-EXPSPACE
(deterministic time, nondeterministic time, space)
2^(2^(...(2^(n^t))...)), where there are k exponentiations (so
EXP = 1-EXP). Then,
  ELEMENTARY = 1-EXP U 2-EXP U ... = 1-EXPSPACE U 2-EXPSPACE U ...
(since k-EXPSPACE is a subset of (k+1)-EXP). Problems complete for
ELEMENTARY are decidable, but with such astronomical time or space
bounds that they are completely intractable. Yet, "inequivalence of
regular expressions with union, concatenation, and negation" requires
running time 2^(2^(...2^(2^n)...)) where there are at least log n many
exponentiations, so it's outside even ELEMENTARY!

----------------------------
Dealing with NP-completeness
----------------------------

NP is important because it contains a huge number of real-life problems
that arise in various application domains. The vast majority of these
problems either belong to P or are NP-complete.
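Aside: to get a feel for how fast the k-EXP bounds above grow, here is a
tiny sketch (taking the exponent t of the innermost n^t to be 1 for
simplicity):

```python
def tower(k, n):
    """2^(2^(...(2^n)...)) with k exponentiations: the k-EXP time
    bound with t = 1, so tower(1, n) = 2^n matches EXP = 1-EXP."""
    for _ in range(k):
        n = 2 ** n
    return n
```

Already tower(3, 2) = 65536, and tower(4, 2) = 2^65536 has nearly twenty
thousand decimal digits; bounds outside ELEMENTARY grow faster still,
since the height of the tower itself grows with n.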
We know NP-complete problems do not have efficient solutions (unless
P = NP), but this doesn't make real-life applications go away. Example:
the VLSI circuit layout problem is NP-hard, but that doesn't mean we
can forget about it; it must still be solved somehow! NP-completeness
means there is no algorithm that is both exact and efficient (unless
P = NP), so we compromise on one or the other:

- Heuristics: compromise on efficiency -- some problems have algorithms
  that run in exponential time in the worst case, but where the worst
  case does not seem to happen often in practice.

- Approximation: compromise on exactness -- find an efficient algorithm
  that may not return an exact answer but something "close", e.g.,
  instead of finding a k-clique, it may find a (k/2)-clique or k
  vertices that are "almost" a clique.

- NP-completeness is based on worst-case analysis: in practical
  applications, the worst case may not come up often (if at all).
  Average-case performance may be more indicative (but much harder to
  analyze properly).

- Alternatively, it is sometimes possible to work with restricted
  classes of inputs. For example, 2SAT is in P, UNARY-SUBSET-SUM is in
  P (and so is the problem where all input integers have values bounded
  by some polynomial function of the input size), etc.
  In some cases, it is possible to prove performance results, e.g., the
  "greedy by degree" graph colouring algorithm can be shown to produce
  an optimal colouring for all "co-graphs" -- graphs that have a
  certain property. For graphs that don't have this property, there is
  a gradual loss of optimality (i.e., graphs that are "close" to having
  the property will be coloured using "close" to the smallest number of
  colours), instead of a sharp rise.

Heuristics are useful not just for problems in NPc, but also for
problems in P whose running time is a high-degree polynomial (meaning 4
or above). Many practical applications deal with huge inputs (sizes
10^6 and above), where the difference between n^2 and n^4 algorithms is
significant.
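The point above about UNARY-SUBSET-SUM can be made concrete with the
standard dynamic program for SUBSET-SUM (a sketch): it runs in
O(len(values) * target) time, i.e., polynomial in the *value* of the
target rather than the length of its binary encoding, hence genuinely
polytime when the numbers are written in unary or are polynomially
bounded.

```python
def subset_sum(values, target):
    """Decide SUBSET-SUM by dynamic programming.

    Time O(len(values) * target): pseudo-polynomial in general, but
    truly polynomial when the input is unary or all values are bounded
    by a polynomial in the input size -- which is why UNARY-SUBSET-SUM
    is in P even though SUBSET-SUM is NP-complete.
    """
    reachable = {0}  # sums achievable using the values seen so far
    for v in values:
        reachable |= {s + v for s in reachable if s + v <= target}
    return target in reachable
```

For inputs written in binary, target can be exponential in the input
length, so this gives no polytime algorithm for the NP-complete version.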
For example, the "linear programming" problem asks us to minimize a
linear function of some variables subject to a set of linear
constraints on those variables. There is a polytime algorithm to solve
linear programming, but its running time is a high-degree polynomial
(something like n^6). Because linear programming is often used to model
very large systems, this is not usable in practice; instead, the
"simplex method" is often used. This algorithm has a worst-case running
time that is exponential, but for most inputs encountered in practice,
it does much better (including much better than the complicated
polytime algorithm).

------
REVIEW
------

Main topics:

- Computability
  . models of computation; robustness
  . diagonalization; countability/uncountability
  . decidability/recognizability; dovetailing
  . undecidability/unrecognizability; A_TM
  . many-one reductions (<=m); examples

- Complexity
  . models of computation; P, NP, coNP
  . polytime reductions (<=p); Cook's theorem; NP-completeness
  . polytime self-reducibility
  . space complexity; PSPACE, L, NL; intractable problems

- Reducibility (A <= B) is the central tool used. Understand it well!

Final exam: you may bring one handwritten (not photocopied) US
letter-sized "cheat" sheet (both sides) to the exam.