Back to Blog

NP-Complete Problems: The Hardest Problems in NP

Understanding NP-completeness, polynomial-time reductions, the Cook-Levin theorem, SAT, 3-SAT, and why proving any single NP-complete problem is in P would solve them all.

2025-10-18
Share
Theory of Computationnp-completereductionssat

Terminology

Term Definition
NP-Complete A problem that is both in NP and NP-hard: it is verifiable in polynomial time and at least as hard as every other problem in NP
NP-Hard A problem at least as hard as every problem in NP; it may or may not be in NP itself
Polynomial-time reduction A transformation from problem A to problem B in polynomial time such that a "yes" for A maps to a "yes" for B and vice versa, written $A \le_p B$
SAT (Boolean satisfiability) Given a Boolean formula, is there an assignment of TRUE/FALSE to variables that makes the formula evaluate to TRUE?
3-SAT SAT restricted to CNF formulas where every clause has exactly 3 literals; still NP-complete
CNF (Conjunctive Normal Form) A Boolean formula expressed as an AND of clauses, where each clause is an OR of literals
Cook-Levin theorem The foundational result proving that SAT is NP-complete, establishing the first NP-complete problem
Karp's 21 problems Richard Karp's 1972 list of 21 problems shown to be NP-complete via reductions from SAT
Literal A Boolean variable or its negation, e.g. $x_1$ or $\neg x_1$

What & Why

NP-complete problems are the hardest problems in NP. They sit at the top of the difficulty hierarchy: if you could solve any single NP-complete problem in polynomial time, you could solve every problem in NP in polynomial time, which would prove P = NP.

This is powerful because it means all NP-complete problems are computationally equivalent. Solving one solves them all. Conversely, if any one of them is proven to require exponential time, they all do.

For engineers, recognizing that a problem is NP-complete is a signal to stop looking for an exact polynomial-time algorithm and start looking for approximations, heuristics, or special-case restrictions. It is not a dead end; it is a redirect.

How It Works

What Makes a Problem NP-Complete?

A problem L is NP-complete if:

  1. L \in NP (solutions can be verified in polynomial time)
  2. Every problem in NP can be reduced to L in polynomial time (L is NP-hard)

The Cook-Levin Theorem

In 1971, Stephen Cook (and independently Leonid Levin) proved that SAT is NP-complete. The proof works by showing that any computation of a nondeterministic Turing machine can be encoded as a Boolean formula. If the machine accepts, the formula is satisfiable. This was the first NP-complete problem ever identified.

The Reduction Chain

Once SAT was proven NP-complete, other problems were shown NP-complete by reducing SAT (or another known NP-complete problem) to them:

NP-Completeness Reduction Chain SAT 3-SAT CLIQUE VERTEX COVER SUBSET SUM INDEPENDENT SET HAMILTONIAN PATH KNAPSACK Each arrow is a polynomial-time reduction proving NP-completeness

SAT and 3-SAT in Detail

SAT: Given a Boolean formula in CNF, is there a satisfying assignment?

Example: (x_1 \lor \neg x_2) \land (\neg x_1 \lor x_3) \land (x_2 \lor x_3)

Setting x_1 = T, x_2 = T, x_3 = T satisfies all three clauses.

3-SAT: Same as SAT but every clause has exactly 3 literals. Any SAT instance can be converted to 3-SAT in polynomial time (by splitting longer clauses with auxiliary variables). 3-SAT is NP-complete, but 2-SAT is in P, solvable in O(n) using implication graphs.

How Reductions Work

To prove problem B is NP-complete:

  1. Show B is in NP (give a polynomial-time verifier)
  2. Pick a known NP-complete problem A
  3. Show a polynomial-time transformation from any instance of A to an instance of B
  4. Prove: A has a "yes" answer if and only if the transformed B instance has a "yes" answer
Instance of A (NP-complete) poly-time transform Instance of B (to prove NPC) same answer Yes/No

Complexity Analysis

NP-Complete Problem Best Known Exact Verification Reduced From
SAT $O(2^n)$ $O(n \cdot m)$ Cook-Levin (direct proof)
3-SAT $O(1.33^n)$ $O(n)$ SAT
Clique $O(2^{n/3})$ $O(k^2)$ 3-SAT
Vertex Cover $O(1.28^k \cdot n)$ $O(V + E)$ 3-SAT / Clique
Subset Sum $O(2^{n/2})$ $O(n)$ 3-SAT / Vertex Cover

The key insight: all these problems have exponential solving times but polynomial verification times. A polynomial-time algorithm for any one of them would give polynomial-time algorithms for all of them.

Implementation

ALGORITHM Reduce3SATtoClique(formula)
INPUT: formula: 3-SAT instance with m clauses over n variables
OUTPUT: graph G and integer k such that formula is satisfiable iff G has a k-clique
BEGIN
  k <- m
  V <- empty set
  E <- empty set

  FOR i <- 1 TO m DO
    FOR EACH literal l IN clause_i DO
      ADD node (i, l) TO V
    END FOR
  END FOR

  FOR EACH pair of nodes (i, l1) and (j, l2) WHERE i != j DO
    IF l1 != NOT(l2) THEN
      ADD edge ((i, l1), (j, l2)) TO E
    END IF
  END FOR

  RETURN (Graph(V, E), k)
END

ALGORITHM VerifyNPCompleteness(problem)
INPUT: problem: a decision problem
OUTPUT: steps to prove NP-completeness
BEGIN
  STEP 1: Show problem is in NP
    Define a certificate (witness) for "yes" instances
    Show a verifier checks the certificate in polynomial time

  STEP 2: Choose a known NP-complete problem A
    Pick the problem whose structure most closely matches

  STEP 3: Construct polynomial-time reduction A ->p problem
    Transform any instance of A into an instance of problem
    Prove: A is "yes" if and only if problem is "yes"

  STEP 4: Verify reduction runs in polynomial time
    The transformation itself must be O(n^k) for some constant k
END

ALGORITHM BruteForceSAT(formula, variables)
INPUT: formula: CNF with m clauses, variables: list of n Boolean variables
OUTPUT: satisfying assignment or UNSATISFIABLE
BEGIN
  FOR EACH assignment IN {TRUE, FALSE}^n DO
    all_satisfied <- TRUE
    FOR EACH clause IN formula DO
      IF NOT EvaluateClause(clause, assignment) THEN
        all_satisfied <- FALSE
        BREAK
      END IF
    END FOR
    IF all_satisfied THEN
      RETURN assignment
    END IF
  END FOR
  RETURN UNSATISFIABLE
END

Real-World Applications

  • SAT solvers: despite worst-case exponential time, modern SAT solvers (MiniSat, Z3) handle industrial instances with millions of variables using clever heuristics like DPLL and CDCL
  • Hardware verification: chip manufacturers encode circuit correctness as SAT instances and use solvers to find bugs before fabrication
  • Scheduling: job-shop scheduling, exam timetabling, and airline crew scheduling are NP-complete; airlines use constraint programming and integer linear programming relaxations
  • Network design: finding minimum vertex covers for network monitoring, or maximum cliques for community detection in social networks
  • Bioinformatics: sequence alignment and phylogenetic tree construction involve NP-complete subproblems, solved with dynamic programming heuristics

Key Takeaways

  • NP-complete problems are the hardest problems in NP: solving any one in polynomial time solves them all
  • The Cook-Levin theorem established SAT as the first NP-complete problem by encoding Turing machine computation as Boolean formulas
  • Polynomial-time reductions ($A \le_p B$) are the tool for proving new problems NP-complete: transform A into B preserving yes/no answers
  • 3-SAT is NP-complete but 2-SAT is in P, showing that small structural changes can dramatically shift complexity
  • Recognizing NP-completeness in practice is a signal to use approximation algorithms, heuristics, or SAT/ILP solvers rather than seeking exact polynomial-time solutions