Monday 27 June 2016

What are P, NP, NP-complete, and NP-hard?

Problems in class P can be solved with algorithms that run in polynomial time.

Say you have an algorithm that finds the smallest integer in an array.  One way to do this is by iterating over all the integers of the array and keeping track of the smallest number you've seen up to that point.  Every time you look at an element, you compare it to the current minimum, and if it's smaller, you update the minimum.

How long does this take?  Let's say there are n elements in the array.  For every element the algorithm has to perform a constant number of operations.  Therefore we can say that the algorithm runs in O(n) time, or that the runtime is a linear function of how many elements are in the array.*  So this algorithm runs in linear time.

You can also have algorithms that run in quadratic time (O(n^2)), exponential time (O(2^n)), or even logarithmic time (O(log n)).  Binary search (on a balanced tree) runs in logarithmic time because the height of the binary search tree is a logarithmic function of the number of elements in the tree.

If the running time is some polynomial function of the size of the input**, for instance if the algorithm runs in linear time or quadratic time or cubic time, then we say the algorithm runs in polynomial time and the problem it solves is in class P.

NP

Now there are a lot of programs that don't (necessarily) run in polynomial time on a regular computer, but do run in polynomial time on a nondeterministic Turing machine.  These programs solve problems in NP, which stands for nondeterministic polynomial time.  A nondeterministic Turing machine can do everything a regular computer can and more.***  This means all problems in P are also in NP.

An equivalent way to define NP is by pointing to the problems that can be verified in polynomial time.  This means there is not necessarily a polynomial-time way to find a solution, but once you have a solution it only takes polynomial time to verify that it is correct.

Some people think P = NP, which means any problem that can be verified in polynomial time can also be solved in polynomial time and vice versa.  If they could prove this, it would revolutionize computer science because people would be able to construct faster algorithms for a lot of important problems. 


Ex: Subset Sum, 0-1 knapsack etc.
 
NP-hard

What does NP-hard mean?  A lot of times you can solve a problem by reducing it to a different problem.  I can reduce Problem B to Problem A if, given a solution to Problem A, I can easily construct a solution to Problem B.  (In this case, "easily" means "in polynomial time.")

If a problem is NP-hard, this means I can reduce any problem in NP to that problem.  This means if I can solve that problem, I can easily solve any problem in NP.  If we could solve an NP-hard problem in polynomial time, this would prove P = NP.






Ex: Travelling Salesman, Subset Sum etc.
 
NP-complete

A problem is NP-complete if the problem is both

  • NP-hard, and
  • in NP.

Note: The interesting observation here is that, we are still unable to find a single problem that is NP, but not NP-Hard.