Notes/Turing Machines and Computability

Turing Machines and Computability

The universal Turing machine, the Church-Turing thesis, the Halting Problem, undecidability, and why some problems can never be solved by any computer.

2025-10-22AI-Synthesized from Personal Notes

Theory of ComputationTuring MachinesHalting ProblemComputability

Terminology

Term	Definition
Turing machine (TM)	A theoretical machine with an infinite tape, a read/write head, and a finite set of states; it can read, write, and move left or right on the tape
Universal Turing machine (UTM)	A Turing machine that can simulate any other Turing machine given its description and input; the theoretical basis for general-purpose computers
Church-Turing thesis	The hypothesis that any function computable by an effective procedure can be computed by a Turing machine; not a theorem but universally accepted
Halting Problem	The problem of determining whether a given Turing machine halts on a given input; proven undecidable by Alan Turing in 1936
Decidable (recursive)	A language for which a Turing machine always halts and correctly answers yes or no for every input
Recognizable (recursively enumerable)	A language for which a Turing machine halts and accepts on "yes" inputs but may loop forever on "no" inputs
Undecidable	A problem for which no Turing machine (and therefore no algorithm) can always produce the correct yes/no answer
Reduction (computability)	Showing problem A is undecidable by proving that if A were decidable, the Halting Problem would also be decidable (contradiction)
Turing-complete	A system that can simulate a Turing machine; most programming languages, cellular automata (Rule 110), and even some games are Turing-complete

What & Why

The Turing machine is the most powerful model of computation we know. Every algorithm that can be executed on any computer, in any programming language, can be simulated by a Turing machine. This is the Church-Turing thesis, and it has held up for nearly 90 years.

But the Turing machine also reveals the limits of computation. Alan Turing proved in 1936 that the Halting Problem is undecidable: no algorithm can determine, for every possible program and input, whether that program will eventually stop or run forever. This is not a limitation of current technology. It is a mathematical impossibility.

Understanding Turing machines matters because they define what is computable. Every time you wonder "can a computer solve this?", the answer depends on whether a Turing machine can solve it. If not, no amount of hardware, parallelism, or clever programming will help.

How It Works

Turing Machine Structure

A Turing machine is a 7-tuple $(Q, \Sigma, \Gamma, \delta, q_0, q_{\text{accept}}, q_{\text{reject}})$:

$Q$: finite set of states
$\Sigma$: input alphabet (not including the blank symbol)
$\Gamma$: tape alphabet ($\Sigma \subset \Gamma$, includes blank symbol $\sqcup$)
$\delta: Q \times \Gamma \to Q \times \Gamma \times {L, R}$: transition function
$q_0$: start state
$q_{\text{accept}}$, $q_{\text{reject}}$: halting states

At each step, the machine reads the symbol under the head, consults the transition function, writes a new symbol, moves the head left or right, and transitions to a new state. It halts when it enters $q_{\text{accept}}$ or $q_{\text{reject}}$.

The Universal Turing Machine

A universal Turing machine (UTM) takes as input the description of any Turing machine $M$ and an input $w$, then simulates $M$ on $w$. This is the theoretical foundation of the stored-program computer: one machine that can run any program.

Your laptop is a physical approximation of a UTM. The key difference: a real computer has finite memory, while a UTM has an infinite tape. But for any fixed computation, a sufficiently large computer behaves identically to a UTM.

The Halting Problem

Theorem: There is no Turing machine $H$ that, for every Turing machine $M$ and input $w$, correctly decides whether $M$ halts on $w$.

Proof by contradiction (diagonalization):

Assume $H(M, w)$ exists and returns "halts" or "loops" for every $(M, w)$.
Construct a new machine $D$ that takes input $M$:
- Run $H(M, M)$
- If $H$ says "halts," then $D$ loops forever
- If $H$ says "loops," then $D$ halts
Now ask: what does $D(D)$ do?
- If $D(D)$ halts, then $H(D, D)$ said "halts," so $D$ should loop. Contradiction.
- If $D(D)$ loops, then $H(D, D)$ said "loops," so $D$ should halt. Contradiction.
Therefore $H$ cannot exist.

Decidable vs Recognizable vs Undecidable

Complexity Analysis

Problem	Decidability	Notes
Does DFA $M$ accept string $w$?	Decidable, $O(n)$	Simulate the DFA
Is CFG $G$ empty?	Decidable	Mark reachable non-terminals
Are two DFAs equivalent?	Decidable	Minimize and compare
Does TM $M$ halt on input $w$?	Undecidable	The Halting Problem
Does TM $M$ halt on all inputs?	Undecidable	Even harder than Halting
Are two CFGs equivalent?	Undecidable	Reduces from Post Correspondence

Implementation

STRUCTURE TuringMachine
  states: set of state identifiers
  inputAlphabet: set of symbols
  tapeAlphabet: set of symbols (includes blank)
  transition: map from (state, symbol) to (state, symbol, direction)
  startState: initial state
  acceptState: accepting halt state
  rejectState: rejecting halt state
END STRUCTURE

ALGORITHM SimulateTM(tm, input)
INPUT: tm: TuringMachine, input: string
OUTPUT: ACCEPT, REJECT, or LOOP (may not terminate)
BEGIN
  tape <- array initialized with input symbols, rest blank
  head <- 0
  state <- tm.startState

  WHILE TRUE DO
    IF state = tm.acceptState THEN
      RETURN ACCEPT
    END IF
    IF state = tm.rejectState THEN
      RETURN REJECT
    END IF

    symbol <- tape[head]
    (newState, writeSymbol, direction) <- tm.transition(state, symbol)

    tape[head] <- writeSymbol
    state <- newState

    IF direction = R THEN
      head <- head + 1
    ELSE
      head <- head - 1
    END IF

    IF head < 0 THEN
      Extend tape to the left
      head <- 0
    END IF
  END WHILE
END

ALGORITHM SimulateUTM(machineDescription, input)
INPUT: machineDescription: encoded TM, input: string
OUTPUT: whatever the described TM outputs
BEGIN
  tm <- Decode(machineDescription)
  RETURN SimulateTM(tm, input)
END

ALGORITHM BoundedHaltCheck(tm, input, maxSteps)
INPUT: tm: TuringMachine, input: string, maxSteps: step limit
OUTPUT: HALTS, LOOPS, or UNKNOWN
BEGIN
  tape <- array initialized with input
  head <- 0
  state <- tm.startState
  steps <- 0

  WHILE steps < maxSteps DO
    IF state = tm.acceptState OR state = tm.rejectState THEN
      RETURN HALTS
    END IF
    Execute one step of tm
    steps <- steps + 1
  END WHILE

  RETURN UNKNOWN
END

Real-World Applications

Compiler correctness: Rice's theorem (a consequence of undecidability) proves that no compiler can perfectly detect all semantic properties of programs, like whether a function always returns a value
Static analysis limits: tools like linters and type checkers are conservative approximations because perfect analysis is undecidable; they may report false positives to stay safe
Turing-completeness in unexpected places: SQL (with recursive CTEs), CSS (with specific constructs), Excel formulas, and even Conway's Game of Life are Turing-complete, meaning they can simulate any computation
Security: the undecidability of the Halting Problem means you cannot build a perfect virus scanner that detects all malicious programs without false positives or false negatives
Formal verification: while general program verification is undecidable, restricted verification (model checking for finite-state systems) is decidable and widely used in hardware and protocol verification

Key Takeaways

Turing machines define the boundary of what is computable: if a Turing machine cannot solve a problem, no computer can
The universal Turing machine is the theoretical foundation of the stored-program computer, one machine that runs any program
The Halting Problem is undecidable: no algorithm can determine whether an arbitrary program halts on an arbitrary input
The Church-Turing thesis asserts that Turing machines capture all of "effective computation," and no stronger model has been found
Undecidability has practical consequences: perfect static analysis, perfect virus detection, and perfect compiler optimization are all impossible in general

2025-10-23

Amortized Analysis: The True Cost of Operations

Understanding amortized analysis through the aggregate method, accounting method, and potential method, with dynamic array and splay tree examples showing why worst-case per-operation analysis can be misleading.

2025-10-24

Randomized Algorithms and Probabilistic Analysis

Las Vegas vs Monte Carlo algorithms, expected running time, randomized quicksort, skip list analysis, and why adding randomness can make algorithms simpler and faster.

2025-10-25

Information Theory Basics: Entropy, Compression, and Shannon's Theorem

Shannon entropy, the source coding theorem, compression limits, Huffman coding, and why you can never compress truly random data.