Matrix Multiplication from Scratch
A dead-simple, step-by-step walkthrough of how matrix multiplication actually works, with every single calculation shown.
What Is a Matrix?
A matrix is just a grid of numbers arranged in rows and columns. We describe its size as $\text{rows} \times \text{cols}$.
A $2 \times 3$ matrix has 2 rows and 3 columns:
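For example, with arbitrary values:

$$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$$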
That's it. It's a box of numbers. Nothing scary.
The One Rule You Need to Know
You can only multiply matrix $A$ by matrix $B$ if the number of columns in $A$ equals the number of rows in $B$.
The inner dimensions must match. The result takes the outer dimensions.
Example: a $(2 \times \mathbf{3})$ times a $(\mathbf{3} \times 2)$ works and gives you a $(2 \times 2)$ result.
A $(2 \times 3)$ times a $(4 \times 2)$? Nope. 3 does not equal 4. Can't do it.
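The rule fits in a few lines of Python (a sketch; `can_multiply` is a name chosen here for illustration, not a standard function):

```python
def can_multiply(shape_a, shape_b):
    """Return the result shape if A times B is defined, else None."""
    rows_a, cols_a = shape_a
    rows_b, cols_b = shape_b
    # Inner dimensions must match: columns of A == rows of B.
    if cols_a != rows_b:
        return None
    # The result takes the outer dimensions.
    return (rows_a, cols_b)

print(can_multiply((2, 3), (3, 2)))  # (2, 2) -- works
print(can_multiply((2, 3), (4, 2)))  # None   -- 3 != 4, can't do it
```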
The Algorithm: Dot Product, Cell by Cell
To fill in cell $C_{ij}$ (row $i$, column $j$ of the result), you take row $i$ of A and column $j$ of B, multiply them element-by-element, and add it all up.
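Written as a formula:

$$C_{ij} = A_{i1}B_{1j} + A_{i2}B_{2j} + \cdots + A_{in}B_{nj} = \sum_{k=1}^{n} A_{ik}\,B_{kj}$$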
That formula looks fancy but it just means: walk across the row and down the column, multiply pairs, sum them up.
Full Worked Example: $2 \times 3$ times $3 \times 2$
Let's multiply these two matrices, showing every single step.
$A$ is $2 \times 3$, $B$ is $3 \times 2$. Inner dimensions match (both 3). Result $C$ will be $2 \times 2$.
We need to compute 4 cells. Let's go one at a time.
Step 1: Compute $C_{11}$
Take row 1 of A and column 1 of B:
Multiply each pair and add:
Step 2: Compute $C_{12}$
Take row 1 of A and column 2 of B:
Multiply each pair and add:
Step 3: Compute $C_{21}$
Take row 2 of A and column 1 of B:
Multiply each pair and add:
Step 4: Compute $C_{22}$
Take row 2 of A and column 2 of B:
Multiply each pair and add:
The Final Result
Putting all 4 cells together:
That's the whole thing. Four dot products, four cells, done.
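The original matrix entries aren't reproduced above, so here is the same four-step procedure run in Python on a pair of sample matrices (the values below are illustrative, not the ones from the original example):

```python
# Sample 2x3 and 3x2 matrices -- illustrative values only.
A = [[1, 2, 3],
     [4, 5, 6]]
B = [[7,  8],
     [9, 10],
     [11, 12]]

C = [[0, 0],
     [0, 0]]

# One dot product per cell: row i of A against column j of B.
for i in range(2):
    for j in range(2):
        C[i][j] = sum(A[i][k] * B[k][j] for k in range(3))
        print(f"C[{i+1}][{j+1}] = "
              + " + ".join(f"{A[i][k]}*{B[k][j]}" for k in range(3))
              + f" = {C[i][j]}")

print(C)  # [[58, 64], [139, 154]]
```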
Visualizing It
Here's what's happening for $C_{11}$. You're sliding row 1 of A across column 1 of B:
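One way to see that pairing is with Python's `zip`, which lines up the row with the column position by position (sample values again):

```python
row = [1, 2, 3]    # row 1 of A (sample values)
col = [7, 9, 11]   # column 1 of B (sample values)

# zip pairs the elements as the row "slides" down the column.
pairs = list(zip(row, col))
print(pairs)                         # [(1, 7), (2, 9), (3, 11)]
print(sum(a * b for a, b in pairs))  # 58 -- the value of C_11
```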
The General Formula
For any two compatible matrices $A$ (size $m \times n$) and $B$ (size $n \times p$), the result $C = A \times B$ is an $m \times p$ matrix where each cell is:
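$$C_{ij} = \sum_{k=1}^{n} A_{ik}\,B_{kj} \qquad \text{for } 1 \le i \le m,\ 1 \le j \le p$$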
In plain English: to get the cell at row $i$, column $j$ of the result, take the dot product of row $i$ from the first matrix and column $j$ from the second matrix.
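That description translates directly into a triple loop. A straightforward reference implementation (not an optimized one):

```python
def matmul(A, B):
    """Multiply an m x n matrix by an n x p matrix, given as lists of lists."""
    m, n, p = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A), "inner dimensions must match"
    C = [[0] * p for _ in range(m)]
    for i in range(m):          # each row of A ...
        for j in range(p):      # ... against each column of B ...
            for k in range(n):  # ... accumulating the dot product
                C[i][j] += A[i][k] * B[k][j]
    return C

print(matmul([[1, 2, 3], [4, 5, 6]],
             [[7, 8], [9, 10], [11, 12]]))  # [[58, 64], [139, 154]]
```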
Complexity
For two $n \times n$ matrices, you compute $n^2$ cells, and each cell requires $n$ multiplications and $n-1$ additions. Total work:
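$$n^2 \cdot n = n^3 \text{ multiplications} \quad \Rightarrow \quad O(n^3)$$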
Our $2 \times 3$ by $3 \times 2$ example had $2 \times 2 = 4$ cells, each needing 3 multiplications. That's $4 \times 3 = 12$ multiplications total.
For two $1000 \times 1000$ matrices? That's $1000^3 = 1{,}000{,}000{,}000$ multiplications. One billion. This is exactly why GPUs exist: they can do thousands of those dot products in parallel instead of one at a time.
Key Properties
A few things worth knowing about matrix multiplication:
- Not commutative: $A \times B \neq B \times A$ in general. Order matters.
- Associative: $(A \times B) \times C = A \times (B \times C)$. You can regroup, just don't reorder.
- Distributive: $A \times (B + C) = A \times B + A \times C$.
- Identity matrix: $A \times I = A$. The identity matrix $I$ has 1s on the diagonal and 0s everywhere else. It's the "multiply by 1" of matrices.
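These properties are easy to check numerically. A quick sketch with small $2 \times 2$ matrices (values chosen for illustration):

```python
def matmul(A, B):
    # Plain triple-loop multiply for small lists-of-lists matrices.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]  # swaps columns when applied on the right
I = [[1, 0], [0, 1]]  # 2x2 identity

print(matmul(A, B))  # [[2, 1], [4, 3]]
print(matmul(B, A))  # [[3, 4], [1, 2]] -- different: not commutative
print(matmul(A, I))  # [[1, 2], [3, 4]] -- A unchanged: identity
```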
Why This Matters
Matrix multiplication is the single most important operation in:
- AI/ML: Every neural network layer is a matrix multiply. Training a model like GPT-4 means performing these by the billions.
- Computer graphics: Every 3D rotation, scaling, and projection is a $4 \times 4$ matrix multiply.
- Physics simulations: Solving systems of linear equations boils down to matrix operations.
- Signal processing: Fourier transforms can be expressed as matrix multiplications.
If you understand this one operation deeply, you understand the computational core of modern technology.