
NP-Hard and Approximation Algorithms

When exact solutions are impossible at scale, approximation algorithms provide provable guarantees. Covering NP-hardness, the traveling salesman problem, knapsack, and greedy/DP approximation strategies.

2025-10-19
Theory of Computation · np-hard · approximation-algorithms

Terminology

| Term | Definition |
| --- | --- |
| NP-Hard | A problem at least as hard as every problem in NP; it need not be in NP itself (its solutions may not even be verifiable in polynomial time) |
| Approximation algorithm | A polynomial-time algorithm that finds a solution within a provable factor of the optimal solution |
| Approximation ratio ($\alpha$) | The worst-case ratio between the approximate solution and the optimal solution; $\alpha = 1$ means exact |
| PTAS | Polynomial-Time Approximation Scheme: an algorithm that achieves a $(1+\epsilon)$ ratio for any $\epsilon > 0$, with running time polynomial in $n$ (but possibly exponential in $1/\epsilon$) |
| FPTAS | Fully Polynomial-Time Approximation Scheme: a PTAS where the running time is also polynomial in $1/\epsilon$ |
| TSP (Traveling Salesman Problem) | Find the shortest tour visiting every city exactly once and returning to the start; NP-hard in general |
| Metric TSP | TSP where distances satisfy the triangle inequality; admits a 1.5-approximation (Christofides' algorithm) |
| Inapproximability | A proof that no polynomial-time algorithm can achieve better than a certain approximation ratio (unless P = NP) |
| Greedy heuristic | An algorithm that makes the locally optimal choice at each step; sometimes provides provable approximation guarantees |

What & Why

NP-hard problems are at least as hard as the hardest problems in NP. Unlike NP-complete problems, NP-hard problems need not be in NP at all. The Halting Problem, for example, is NP-hard but undecidable: no algorithm can solve it at all, let alone in polynomial time.

In practice, many optimization problems you encounter are NP-hard: finding the shortest delivery route, packing items into bins, scheduling jobs on machines. Since exact solutions are believed to require exponential time (no polynomial-time algorithm exists unless P = NP), you need a different strategy. Approximation algorithms provide that strategy: they run in polynomial time and guarantee that the solution is within a known factor of optimal.

The beauty of approximation algorithms is that the guarantee is provable, not just empirical. A 2-approximation for vertex cover means the algorithm will never return a cover more than twice the size of the optimal one, no matter the input.

How It Works

NP-Hard vs NP-Complete

Figure: Venn diagram of the complexity classes — P sits inside NP; NP-complete is the intersection of NP and NP-hard; the Halting Problem and the general TSP optimization problem are NP-hard but lie outside NP.

NP-complete problems are the intersection of NP and NP-hard. Problems that are NP-hard but not in NP are even harder: you cannot even verify a solution efficiently.

Approximation Strategies

Strategy 1: Greedy with guarantees. The greedy vertex cover algorithm picks an edge, adds both endpoints, removes all covered edges, and repeats. This gives a 2-approximation: the cover is at most twice the optimal size.

Strategy 2: LP relaxation and rounding. Relax integer constraints to continuous ones, solve the linear program in polynomial time, then round the fractional solution to integers. The rounding introduces a bounded error.
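As a minimal sketch of the rounding step: assume the fractional LP optimum has already been computed by an LP solver (hard-coded below for a hypothetical 4-cycle example), and round every variable at the 1/2 threshold.

```python
def round_vertex_cover(x, threshold=0.5):
    """Round a fractional vertex-cover LP solution to an integral one.

    Each edge constraint x_u + x_v >= 1 forces at least one endpoint
    to 0.5 or more, so thresholding at 1/2 keeps every edge covered
    while at most doubling the LP cost: a 2-approximation.
    """
    return {v for v, value in x.items() if value >= threshold}

# Fractional optimum for the 4-cycle a-b-c-d (all variables 0.5, LP
# cost 2); rounding yields an integral cover containing all four
# vertices, cost 4 -- within the factor-2 bound.
fractional = {"a": 0.5, "b": 0.5, "c": 0.5, "d": 0.5}
cover = round_vertex_cover(fractional)
```

The bounded error shows up exactly here: rounding a variable from 0.5 up to 1 at most doubles its contribution to the objective.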

Strategy 3: Dynamic programming on scaled inputs. The 0/1 knapsack problem has a pseudo-polynomial DP solution in $O(nW)$. By scaling and rounding the item values, you get an FPTAS with a $(1+\epsilon)$ approximation ratio.
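A Python sketch of that scaling idea (a textbook construction, not tied to any particular implementation): scale each value down by $\mu = \epsilon \cdot v_{max} / n$, run an exact DP over the scaled values, and return the true value of the best feasible choice, which is at least $(1-\epsilon)$ times optimal.

```python
import math

def knapsack_fptas(items, capacity, eps=0.25):
    """FPTAS for 0/1 knapsack: value scaling plus exact DP.

    items: list of (weight, value) pairs with positive values.
    Returns a total value of at least (1 - eps) times the optimum.
    """
    n = len(items)
    vmax = max(v for _, v in items)
    mu = eps * vmax / n                        # scaling factor
    scaled = [math.floor(v / mu) for _, v in items]
    top = sum(scaled)
    INF = float("inf")
    min_w = [0] + [INF] * top    # min weight reaching scaled value s
    true_v = [0] * (top + 1)     # true value of that min-weight choice
    for (w, v), s in zip(items, scaled):
        for t in range(top, s - 1, -1):        # 0/1: iterate downward
            if min_w[t - s] + w < min_w[t]:
                min_w[t] = min_w[t - s] + w
                true_v[t] = true_v[t - s] + v
    return max(tv for mw, tv in zip(min_w, true_v) if mw <= capacity)
```

Shrinking $\epsilon$ enlarges the scaled-value range and so the DP table, which is where the polynomial dependence on $1/\epsilon$ comes from.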

The Approximation Landscape

Figure: the approximation quality spectrum, from better to worse — FPTAS (knapsack, $(1+\epsilon)$ ratio), PTAS (Euclidean TSP, $(1+\epsilon)$ ratio), constant factor (vertex cover: 2x, metric TSP: 1.5x), logarithmic factor (set cover: $O(\log n)$), inapproximable (general TSP).

Complexity Analysis

| Problem | Exact Time | Approx. Ratio | Approx. Time |
| --- | --- | --- | --- |
| Vertex Cover | $O(2^k \cdot n)$ | 2 | $O(V + E)$ |
| Metric TSP | $O(n^2 \cdot 2^n)$ | 1.5 (Christofides) | $O(n^3)$ |
| 0/1 Knapsack | $O(nW)$ pseudo-poly | $(1+\epsilon)$ FPTAS | $O(n^2 / \epsilon)$ |
| Set Cover | $O(2^n)$ | $O(\ln n)$ | $O(n \cdot m)$ |
| General TSP | $O(n^2 \cdot 2^n)$ | No constant ratio possible | N/A |

Implementation

ALGORITHM GreedyVertexCover(graph)
INPUT: graph: undirected graph (V, E)
OUTPUT: vertex cover C (set of vertices covering all edges)
BEGIN
  C <- empty set
  E_remaining <- copy of E

  WHILE E_remaining is not empty DO
    Pick any edge (u, v) from E_remaining
    C <- C UNION {u, v}
    Remove all edges incident to u or v from E_remaining
  END WHILE

  RETURN C
END
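A runnable Python version of the pseudocode above (a sketch; `edges` is assumed to be a list of vertex pairs):

```python
def greedy_vertex_cover(edges):
    """2-approximate vertex cover: both endpoints of a maximal matching.

    The picked edges share no endpoints, so any optimal cover needs at
    least one distinct vertex per picked edge; hence |cover| <= 2 * OPT.
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:   # edge still uncovered
            cover.update((u, v))
    return cover

# Path a-b-c-d: the algorithm may return all four vertices,
# twice the size of the optimal cover {b, c}.
cover = greedy_vertex_cover([("a", "b"), ("b", "c"), ("c", "d")])
```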

ALGORITHM GreedyKnapsack(items, capacity)
INPUT: items: list of (weight, value) pairs, capacity: W
OUTPUT: approximate maximum value
BEGIN
  SORT items by value/weight ratio in decreasing order
  totalValue <- 0
  remainingCapacity <- capacity

  FOR EACH (w, v) IN items DO
    IF w <= remainingCapacity THEN
      totalValue <- totalValue + v
      remainingCapacity <- remainingCapacity - w
    END IF
  END FOR

  RETURN totalValue
END
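The same algorithm in Python, with one standard safeguard added: plain density-greedy can be arbitrarily bad (it may skip a single huge, valuable item), but taking the better of the greedy total and the most valuable single fitting item is a known 1/2-approximation.

```python
def greedy_knapsack(items, capacity):
    """Greedy knapsack by value density, patched to a 1/2-approximation.

    items: list of (weight, value) pairs.
    """
    total, remaining = 0, capacity
    for w, v in sorted(items, key=lambda wv: wv[1] / wv[0], reverse=True):
        if w <= remaining:
            total += v
            remaining -= w
    # Safeguard: the best single item that fits on its own.
    best_single = max((v for w, v in items if w <= capacity), default=0)
    return max(total, best_single)
```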

ALGORITHM NearestNeighborTSP(cities, distanceMatrix)
INPUT: cities: list of n cities, distanceMatrix: n x n distance table
OUTPUT: tour visiting all cities
BEGIN
  visited <- {cities[0]}
  tour <- [cities[0]]
  current <- cities[0]

  WHILE |visited| < n DO
    nearest <- city in (cities - visited) minimizing distanceMatrix[current][city]
    tour <- tour + [nearest]
    visited <- visited UNION {nearest}
    current <- nearest
  END WHILE

  tour <- tour + [cities[0]]
  RETURN tour
END
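And a Python sketch of the nearest-neighbor heuristic, indexing cities as 0 to n-1. Note that it carries no constant-factor guarantee: its ratio can grow as $\Theta(\log n)$ even on metric instances.

```python
def nearest_neighbor_tsp(dist):
    """Nearest-neighbor TSP heuristic over a symmetric distance matrix.

    dist: n x n list of lists; returns a tour starting/ending at city 0.
    """
    n = len(dist)
    visited = {0}
    tour = [0]
    current = 0
    while len(visited) < n:
        # Greedily jump to the closest unvisited city.
        nxt = min((c for c in range(n) if c not in visited),
                  key=lambda c: dist[current][c])
        tour.append(nxt)
        visited.add(nxt)
        current = nxt
    tour.append(0)          # return to the start
    return tour
```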

Real-World Applications

  • Delivery routing: companies like UPS and FedEx solve TSP variants daily using approximation algorithms and local search heuristics; even a 1% improvement saves millions in fuel costs
  • Cloud resource allocation: bin packing (NP-hard) determines how virtual machines are placed on physical servers; first-fit decreasing gives a provable approximation
  • Network design: Steiner tree problems (connecting a subset of nodes at minimum cost) are NP-hard; 2-approximation algorithms are used in VLSI routing and telecommunications
  • Warehouse optimization: knapsack variants determine which products to stock given limited shelf space, solved with FPTAS-based approaches
  • Genome assembly: reconstructing DNA sequences from short reads involves NP-hard shortest superstring problems, tackled with greedy approximations

Key Takeaways

  • NP-hard problems are at least as hard as everything in NP, but they may not even be in NP (solutions might not be verifiable in polynomial time)
  • Approximation algorithms trade exactness for speed: they run in polynomial time and guarantee a solution within a provable factor of optimal
  • The approximation landscape ranges from FPTAS (arbitrarily close to optimal) to inapproximable (no constant-factor guarantee exists unless P = NP)
  • Greedy algorithms often provide surprisingly good approximation ratios: 2x for vertex cover, $O(\ln n)$ for set cover
  • Recognizing a problem as NP-hard is not a dead end; it is a signal to choose the right approximation strategy for your accuracy and speed requirements