Finite Fields (Galois Fields): The Algebra Inside AES and Reed-Solomon
How GF(2^n) arithmetic works, why AES uses GF(2^8), and how Reed-Solomon codes protect your data from QR codes to deep-space probes.
Terminology
| Term | Definition |
|---|---|
| Galois field $GF(q)$ | A finite field with exactly $q$ elements. Exists if and only if $q = p^n$ for some prime $p$ and positive integer $n$. |
| $GF(2)$ | The field $\{0, 1\}$ with addition = XOR and multiplication = AND |
| $GF(2^n)$ | An extension field with $2^n$ elements, constructed using polynomials over $GF(2)$ modulo an irreducible polynomial |
| Irreducible polynomial | A polynomial that cannot be factored into lower-degree polynomials over the given field (analogous to a prime number) |
| Extension field | A larger field built from a smaller one by adjoining a root of an irreducible polynomial |
| Reed-Solomon code | An error-correcting code based on polynomial evaluation and interpolation over a finite field |
| RAID | Redundant Array of Independent Disks, using parity and Galois field arithmetic for data recovery |
What & Why
Finite fields (also called Galois fields, after Evariste Galois) are fields with a finite number of elements. The remarkable fact is that a finite field of size q exists if and only if q is a prime power p^n, and it is unique up to isomorphism. We write it as GF(q) or \mathbb{F}_q.
For CS, the most important finite fields are the binary extension fields GF(2^n). These fields have 2^n elements, and their arithmetic maps naturally onto bitwise operations, making them extremely efficient to implement in hardware and software.
Where do Galois fields appear in practice?
- AES encryption: the SubBytes transformation computes multiplicative inverses in
GF(2^8), the field with 256 elements. Every byte of data you encrypt passes through Galois field arithmetic. - Reed-Solomon codes: QR codes, CDs, DVDs, Blu-ray discs, and deep-space communication all use Reed-Solomon error correction, which performs polynomial arithmetic over
GF(2^8)or similar fields. - RAID storage: RAID-6 uses
GF(2^8)arithmetic to compute two parity blocks, enabling recovery from any two simultaneous disk failures. - CRC checksums: while CRCs use polynomial arithmetic over
GF(2), understanding the field structure explains why certain generator polynomials detect more errors than others.
How It Works
Constructing GF(2^n)
To build $GF(2^n)$, we start with $GF(2) = \{0, 1\}$ and choose an irreducible polynomial $p(x)$ of degree $n$ over $GF(2)$. The elements of $GF(2^n)$ are all polynomials of degree less than $n$ with coefficients in $\{0, 1\}$:
Each element is an $n$-bit binary string. Addition is polynomial addition with coefficients mod 2 (XOR). Multiplication is polynomial multiplication mod $p(x)$, with coefficients mod 2.
GF(2^8) for AES
AES uses the irreducible polynomial $p(x) = x^8 + x^4 + x^3 + x + 1$ (hex: 0x11B). Each byte is an element of $GF(2^8)$.
Why Irreducible Polynomials Matter
An irreducible polynomial over $GF(2)$ plays the same role as a prime number in modular arithmetic. Just as $\mathbb{Z}/p\mathbb{Z}$ is a field when $p$ is prime, $GF(2)[x] / p(x)$ is a field when $p(x)$ is irreducible. If $p(x)$ were reducible, we would get zero divisors and lose the ability to divide.
Reed-Solomon Error Correction
Reed-Solomon codes encode a message of $k$ symbols as a polynomial of degree $k-1$ over $GF(2^8)$, then evaluate it at $n > k$ points. The $n$ evaluation results form the codeword. If up to $t = (n-k)/2$ symbols are corrupted, the original polynomial (and message) can be recovered using the Berlekamp-Massey algorithm or Euclidean decoding.
This is why a scratched CD can still play: the Reed-Solomon code corrects the corrupted bytes using Galois field arithmetic.
Complexity Analysis
| Operation in $GF(2^n)$ | Time | Notes |
|---|---|---|
| Addition (XOR) | $O(1)$ | Single XOR instruction for $n \leq 64$ |
| Multiplication (carry-less) | $O(n^2)$ naive, $O(1)$ with hardware | Intel PCLMULQDQ instruction for $GF(2^{64})$ |
| Reduction mod $p(x)$ | $O(n)$ | Shift-and-XOR with the irreducible polynomial |
| Inverse (extended Euclidean) | $O(n^2)$ | Polynomial GCD over $GF(2)$ |
| Inverse (lookup table for $GF(2^8)$) | $O(1)$ | 256-entry precomputed table |
| Reed-Solomon decode ($n$ symbols, $t$ errors) | $O(n \cdot t)$ | Berlekamp-Massey + Chien search |
For $GF(2^8)$, all operations can be done with lookup tables (256 entries each for multiplication, inverse, and log/antilog), making them extremely fast in practice.
Implementation
ALGORITHM GF2n_Multiply(a, b, irreducible_poly, n)
INPUT: a, b: elements of GF(2^n) represented as n-bit integers
irreducible_poly: the irreducible polynomial as an integer
n: the field extension degree
OUTPUT: a * b in GF(2^n)
BEGIN
result := 0
WHILE b > 0 DO
IF b AND 1 = 1 THEN
result := result XOR a // add a to result (GF(2) addition = XOR)
END IF
a := a SHIFT_LEFT 1 // multiply a by x
IF a AND (1 SHIFT_LEFT n) != 0 THEN
a := a XOR irreducible_poly // reduce mod p(x)
END IF
b := b SHIFT_RIGHT 1
END WHILE
RETURN result
END
ALGORITHM GF2n_Inverse(a, irreducible_poly, n)
INPUT: a: nonzero element of GF(2^n)
irreducible_poly: the irreducible polynomial
n: the field extension degree
OUTPUT: a^{-1} in GF(2^n)
BEGIN
// Use Fermat's little theorem: a^{-1} = a^{2^n - 2} in GF(2^n)
exponent := (1 SHIFT_LEFT n) - 2
RETURN GF2n_Power(a, exponent, irreducible_poly, n)
END
ALGORITHM BuildLogAntilogTables(generator, irreducible_poly, n)
INPUT: generator: a primitive element of GF(2^n)
irreducible_poly: the irreducible polynomial
n: field extension degree
OUTPUT: log_table, antilog_table for O(1) multiplication
BEGIN
field_size := (1 SHIFT_LEFT n)
antilog_table := array of field_size entries
log_table := array of field_size entries
val := 1
FOR i FROM 0 TO field_size - 2 DO
antilog_table[i] := val
log_table[val] := i
val := GF2n_Multiply(val, generator, irreducible_poly, n)
END FOR
// Now multiply: a * b = antilog[log[a] + log[b] mod (2^n - 1)]
// This converts multiplication to table lookups and one addition
RETURN (log_table, antilog_table)
END
Real-World Applications
- AES (Advanced Encryption Standard): every byte processed by AES passes through $GF(2^8)$ arithmetic. The SubBytes step computes multiplicative inverses, and MixColumns performs matrix multiplication over $GF(2^8)$.
- QR codes: the error correction in QR codes uses Reed-Solomon codes over $GF(2^8)$, allowing scanners to read damaged or partially obscured codes.
- CD/DVD/Blu-ray: optical media use layered Reed-Solomon codes (Cross-Interleaved Reed-Solomon Coding) to handle scratches and manufacturing defects.
- RAID-6 storage: dual-parity RAID uses $GF(2^8)$ arithmetic to compute two independent parity blocks, enabling recovery from any two simultaneous disk failures.
- Deep-space communication: NASA's Voyager probes use Reed-Solomon codes over Galois fields to transmit data reliably across billions of kilometers.
- Ethernet and Wi-Fi: CRC-32 (used in Ethernet frames) is polynomial arithmetic over $GF(2)$, and modern Wi-Fi uses LDPC codes with Galois field operations.
Key Takeaways
- A finite field of size $q$ exists if and only if $q = p^n$ for a prime $p$, and it is unique up to isomorphism
- $GF(2^n)$ elements are $n$-bit strings, addition is XOR, and multiplication is polynomial multiplication mod an irreducible polynomial
- AES uses $GF(2^8)$ with irreducible polynomial $x^8 + x^4 + x^3 + x + 1$ for its core byte transformations
- Reed-Solomon codes perform polynomial interpolation over finite fields to correct errors, powering QR codes, optical media, and space communication
- Log/antilog tables reduce $GF(2^8)$ multiplication to a single table lookup, making field arithmetic extremely fast in practice
- RAID-6 dual parity uses Galois field arithmetic to survive any two simultaneous disk failures