Report

P and NP Computational Complexity • Recall from our sorting examples at the start of class that we could prove that any sort would have to do at least some minimal amount of work (lower bound) – We proved this using decision trees (ch 7.8) – (the following decision tree slides are repeated) // Sorts an array of 3 items void sortthree(int s[]) { a=s[1]; b=s[2]; c=s[3]; if (a < b) { if (b < c) { S = a,b,c; } else { if (a < c) { S = a,c,b; } else { S = c,a,b; a,b,c }} } else if (b < c) { if (a < c) { S = b,a,c; } else { S = b,c,a; }} else { S = c,b,a; } } a<b b<c b<c a<c a,c,b c,a,b a<c b,a,c c,b,a b,c,a Decision Trees • A decision tree can be created for every comparison-based sorting algorithm – The following is a decision tree for a 3 element Exchange sort • Note that “c < b” means that the Exchange sort compares the array item whose current value is c with the one whose current value is b – not that it compares s[3] to s[2]. b<a c<a c<b b<a c,b,a c<a b,c,a b,a,c a<b c,a,b c<b a,c,b a,b,c Decision Trees • So what does this tell us… – Note that there are 6 leaves in each of the examples given (each N=3) • In general there will be N! leaves in a decision tree corresponding to the N! permutations of the array – The number of comparisons (“work”) is equal to the depth of the tree (from root to leaf) • Worst case behavior is the path from the root to the deepest leaf Decision Trees • Thus, to get a lower bound on the worst case behavior we need to find the shortest tree possible that can still hold N! leaves – No comparison-based sort could do better • A tree of depth d can hold 2d leaves – So, what is the minimal d where 2d >= N! • Solving for d we get d >= log2(N!) – The minimal depth must be at least log2(N!) Decision Trees • According to Lemma 7.4 (p. 291): log2(N!) >= n log2(n) – 1.45n • Putting that together with the previous result d must be at least as great as (n log2(n) – 1.45n) • Applying Big-O d must be at least O(n log2(n)) • No comparison-based sorting algorithm can have a running time better than O(n log2(n)) Decision Trees for other problems? • Unfortunately, this is a lot of work and the technique only applies directly to comparisonbased sorts • What if the problem is Traveling Salesperson? – The book has only shown exponential time (or worse) algorithms for this problem – What if your boss wants a faster implementation? • Do you try and find one? • Do you try and prove one doesn’t exist? Polynomial-Time Algorithms • A polynomial-time algorithm is one whose worst-case running time is bounded above by a polynomial function – Poly-time examples: 2n, 3n, n5, n log(n), n100000 – Non-poly-time examples: 2n, 20.000001n, n! • Poly-time is important because for large problem sizes, all non-poly-time algorithms will take forever to execute Intractability • In Computer Science, a problem is called intractable if it is impossible to solve it with a polynomial-time algorithm – Let me stress that intractability is a property of the problem, not just of any one algorithm to solve the problem • There can be no poly-time algorithm that solves the problem if the problem is to be considered intractable • And just because one non-poly-time algorithm exists for the problem does not make it intractable Three Categories of Problems • We can group problems into 3 categories: – Problems for which poly-time algorithms have been found – Problems that have been proven to be intractable • Proven that no poly-time algorithms exist – Problems that have not been proven to be intractable, but for which poly-time algorithms have never been found • No one has found a poly-time algorithm, but no one has proven that one doesn’t exist either • The interesting thing is that tons of problems fall into the 3rd category and almost none into the second Poly-time Category • Any problem for which we have found a poly-time algorithm – Sorting, searching, matrix multiplication, chained matrix multiplication, shortest paths, minimal spanning tree, etc. Intractable Category • Two types of problems – Those that require a non-polynomial amount of output • Determining all Hamiltonian Circuits – (n – 1)! Circuits in worst case – Those that produce a reasonable amount of output, but the processing time is just too long • These are called undecidable problems • Very few of these, but one classic is the Halting Problem – Takes as input any algorithm and any input and will tell you if the algorithm halts when run on the input Unknown Category • Many problems belong in the category – 0-1 Knapsack, Traveling Salesperson, mcoloring, Hamiltonian Circuits, etc. – In general, any problem that we had to solve using backtracking or bounded backtracking falls into this category The Theory of NP • In the following slides we will show a close and interesting relationship among many of the problem in the Unknown Category • It will be more convenient to develop this theory restricting ourselves to decision problems – Problems that have a yes/no answer – We can always convert non-decision problems into decision problems • In 0/1 Knapsack instead of just asking for the optimal profit, we can instead ask if the optimal profit exceeds some number • In graph coloring instead of just asking the minimal number of colors we can instead ask if the minimal number is less than m The Set P • The set P is the set of all decision problems that can be solved by polynomial-time algorithms • What problems are in P? – Obviously all the ones we have found poly-time solutions for (sorting, etc) – What about problems like Traveling Salesperson? The Set NP • The set NP is the set of all decision problems that can be solved by a polynomial-time nondeterministic algorithm – A poly-time non-deterministic algorithm is an algorithm that is broken into 2 stages: • Guessing (non-deterministic) stage • Verification (deterministic) stage – Where the verification stage can be accomplished in poly-time Non-deterministic Algorithms • For Traveling Salesperson: – The guessing stage simply guesses possible tours – The verification stage takes a tour from the guessing stage and decides yes/no • is it a tour that has a total weight of no greater than x? • Obviously, this verification stage can be written in poly-time – Note, however, that the guessing stage may or may not be done in poly-time • Note that the purpose of this type of algorithm is for theory and classification – there are usually much better ways to actually implement the algorithm NP • There are thousands of problems that have been proven to be in NP • Further note that all problems in P are also in NP – The guessing stage can do anything it wants – The verification stage can just run the algorithm • The only problems proven to not be in NP are the intractable ones – And there are only a few of these P and NP • Here is the way the picture of the sets is usually drawn • That is, we know that P is a subset of NP • However, we don’t know if it is a proper subset P and NP • That is, no one has ever proven that there exists a problem in NP that is not in P • So, NP – P could be an empty set – If it is then we say that P = NP • The question of whether P = NP is one of the more intriguing and important questions in all of Computer Science P = NP? • To prove that P NP we would have to find a single problem in NP that is not in P • To prove P = NP we would have to find a polytime algorithm for each problem in NP – This route sounds much harder, but over the next few slides we will show a way to simplify this proof • If you prove either you get an A in the class, not to mention famous – And if you prove P = NP then you also become rich! NP-Complete Problems • Over the next few slides we develop the background for a new set called NP-Complete • This set will help us in attempting to prove that P = NP Conjunctive Normal Form (CNF) • A logical (boolean) variable is one that can have 2 values: true or false • A literal is a logical variable (x) or the negation of a logical variable (!x) • A clause is a sequence of literals separated by Ors: (x1 or x2 or !x3) • A logical expression in conjunctive normal form (CNF) is a sequence of clauses separated by ANDs: (x1 or x2 or !x3) and (x2 or !x1) and (x2 or x3) • We are about to use CNF for our theory, but there are many uses for CNF including AI CNF-Satisfiability Problem • Given a logical expression in CNF, determine whether there is some truth assignment (some set of assignments of true and false to the variables) that makes the whole expression true • (x1 or x2) and (x2 or !x3) and (!x2) – Yes, x1 = true, x2 = false, x3 = false • (x1 or X2) and !x1 and !x2 – No CNF-Satisfiability Problem • It is easy to write a poly-time algorithm that takes as input a logical expression in CNF and a set of true assignments a verifies if the expression is true for that assignment – Therefore, the problem is in the set NP • Further, no one has ever found a poly-time algorithm for this entire problem and no one has ever proven that it cannot be solved in poly-time – So we do not know if it is in P or not CNF-Satisfiability Problem • However, in 1971 Stephen Cook published a paper proving that if CNF-Satisfiability is in P, then P = NP • WOW! If we can just prove that this simple problem is in P then suddenly all NP problems are in P – And all sorts of things like encryption will break! • Before we talk about how this works, we need to introduce the concept of a transformation algorithm Transformation Algorithms • Suppose we want to solve decision problem A and we have an algorithm that solves decision problem B • Suppose that we can write an algorithm that creates an instance y of problem B for every instance x of problem A s.t. – The algorithm for B answers yes for y iff the answer to problem A is yes for x – Such an algorithm is called a transformation algorithm Transformation Algorithms • An simple example of a transformation algorithm (having nothing to do with NP problems) • A: given n logical variables, does at least one have a value true? • B: Given n integers, is the largest of positive? • Our transformation is k1,k2..kn = tran(x1,x2..xn) – Where ki is 1 if xi is true and ki is 0 if xi is false Reducibility • If there exists a poly-time transformation algorithm from decision problem A to decision problem B, then we say A is polytime reducible to B (or A reduces to B) AB • Further, if B is in set P and A B then A is also in set P NP-Complete Problems • A problem B is called NP-Complete if – It is in NP and – For every other problem A in NP, A B • By the previous theorem if we could show that any NP-complete problem is in P then we could conclude that P = NP • So, Cook basically proved that CNF-Satisfiability is NP-complete – Cook did not reduce all other NP problems to CNFSatisfiability, instead he did it by exploiting common properties of all NP problems NP-Complete Problems • Now we can use transitivity to get: – A problem C is NP-complete if: • It is in NP and • For some other NP-complete problem B, B C • Researchers have spent the last 30 years creating transformation for these problems and we now have a list of hundreds of NP-complete problems – If any of these NP-complete problems can be proven to be in P then P = NP • And, additionally, we also have a way to solve all the NP problems (poly-time transformations to the poly-time problem) The State of P and NP • All this sounds promising, but… – Over the last 30 years no one has been able prove that any problem from NP is not in P – Over the last 30 years no one has been able to prove that any problem from NP-complete is in P – Proving either of these things would give us an answer to the open problem of P = NP? • Most people seem to believe that P NP