Matrix Multiplication from Scratch
A dead-simple, step-by-step walkthrough of how matrix multiplication actually works, with every single calculation shown.
What Is a Matrix?
A matrix is just a grid of numbers arranged in rows and columns. We describe its size as $\text{rows} \times \text{cols}$.
A $2 \times 3$ matrix has 2 rows and 3 columns:
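For example, with arbitrary values:

$$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$$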
That's it. It's a box of numbers. Nothing scary.
The One Rule You Need to Know
You can only multiply matrix $A$ by matrix $B$ if the number of columns in $A$ equals the number of rows in $B$.
The inner dimensions must match. The result takes the outer dimensions.
Example: a $(2 \times \mathbf{3})$ times a $(\mathbf{3} \times 2)$ works and gives you a $(2 \times 2)$ result.
A $(2 \times 3)$ times a $(4 \times 2)$? Nope. 3 does not equal 4. Can't do it.
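The rule fits in a few lines of Python (a sketch; `can_multiply` is a name chosen here for illustration, not a standard function):

```python
def can_multiply(shape_a, shape_b):
    """Return the result shape if A times B is defined, else None."""
    rows_a, cols_a = shape_a
    rows_b, cols_b = shape_b
    # Inner dimensions must match: columns of A == rows of B.
    if cols_a != rows_b:
        return None
    # The result takes the outer dimensions.
    return (rows_a, cols_b)

print(can_multiply((2, 3), (3, 2)))  # (2, 2) -- works
print(can_multiply((2, 3), (4, 2)))  # None   -- 3 != 4, can't do it
```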
The Algorithm: Dot Product, Cell by Cell
To fill in cell $C_{ij}$ (row $i$, column $j$ of the result), you take row $i$ of A and column $j$ of B, multiply them element-by-element, and add it all up.
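Written as a formula:

$$C_{ij} = A_{i1}B_{1j} + A_{i2}B_{2j} + \cdots + A_{in}B_{nj} = \sum_{k=1}^{n} A_{ik}\,B_{kj}$$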
That formula looks fancy but it just means: walk across the row and down the column, multiply pairs, sum them up.
Full Worked Example: $2 \times 3$ times $3 \times 2$
Let's multiply these two matrices, showing every single step.
$A$ is $2 \times 3$, $B$ is $3 \times 2$. Inner dimensions match (both 3). Result $C$ will be $2 \times 2$.
We need to compute 4 cells. Let's go one at a time.
Step 1: Compute $C_{11}$
Take row 1 of A and column 1 of B:
Multiply each pair and add:
Step 2: Compute $C_{12}$
Take row 1 of A and column 2 of B:
Multiply each pair and add:
Step 3: Compute $C_{21}$
Take row 2 of A and column 1 of B:
Multiply each pair and add:
Step 4: Compute $C_{22}$
Take row 2 of A and column 2 of B:
Multiply each pair and add:
The Final Result
Putting all 4 cells together:
That's the whole thing. Four dot products, four cells, done.
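The original matrix entries aren't reproduced above, so here is the same four-step procedure run in Python on a pair of sample matrices (the values below are illustrative, not the ones from the original example):

```python
# Sample 2x3 and 3x2 matrices -- illustrative values only.
A = [[1, 2, 3],
     [4, 5, 6]]
B = [[7,  8],
     [9, 10],
     [11, 12]]

C = [[0, 0],
     [0, 0]]

# One dot product per cell: row i of A against column j of B.
for i in range(2):
    for j in range(2):
        C[i][j] = sum(A[i][k] * B[k][j] for k in range(3))
        print(f"C[{i+1}][{j+1}] = "
              + " + ".join(f"{A[i][k]}*{B[k][j]}" for k in range(3))
              + f" = {C[i][j]}")

print(C)  # [[58, 64], [139, 154]]
```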
Visualizing It
Here's what's happening for $C_{11}$. You're sliding row 1 of A across column 1 of B:
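One way to see that pairing is with Python's `zip`, which lines up the row with the column position by position (sample values again):

```python
row = [1, 2, 3]    # row 1 of A (sample values)
col = [7, 9, 11]   # column 1 of B (sample values)

# zip pairs the elements as the row "slides" down the column.
pairs = list(zip(row, col))
print(pairs)                         # [(1, 7), (2, 9), (3, 11)]
print(sum(a * b for a, b in pairs))  # 58 -- the value of C_11
```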
The General Formula
For any two compatible matrices $A$ (size $m \times n$) and $B$ (size $n \times p$), the result $C = A \times B$ is an $m \times p$ matrix where each cell is:
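$$C_{ij} = \sum_{k=1}^{n} A_{ik}\,B_{kj} \qquad \text{for } 1 \le i \le m,\ 1 \le j \le p$$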
In plain English: to get the cell at row $i$, column $j$ of the result, take the dot product of row $i$ from the first matrix and column $j$ from the second matrix.
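That description translates directly into a triple loop. A straightforward reference implementation (not an optimized one):

```python
def matmul(A, B):
    """Multiply an m x n matrix by an n x p matrix, given as lists of lists."""
    m, n, p = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A), "inner dimensions must match"
    C = [[0] * p for _ in range(m)]
    for i in range(m):          # each row of A ...
        for j in range(p):      # ... against each column of B ...
            for k in range(n):  # ... accumulating the dot product
                C[i][j] += A[i][k] * B[k][j]
    return C

print(matmul([[1, 2, 3], [4, 5, 6]],
             [[7, 8], [9, 10], [11, 12]]))  # [[58, 64], [139, 154]]
```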
Complexity
For two $n \times n$ matrices, you compute $n^2$ cells, and each cell requires $n$ multiplications and $n-1$ additions. Total work:
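$$n^2 \cdot n = n^3 \text{ multiplications} \quad \Rightarrow \quad O(n^3)$$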
Our $2 \times 3$ by $3 \times 2$ example had $2 \times 2 = 4$ cells, each needing 3 multiplications. That's $4 \times 3 = 12$ multiplications total.
For two $1000 \times 1000$ matrices? That's $1000^3 = 1{,}000{,}000{,}000$ multiplications. One billion. This is exactly why GPUs exist: they can do thousands of those dot products in parallel instead of one at a time.
Key Properties
A few things worth knowing about matrix multiplication:
- Not commutative: $A \times B \neq B \times A$ in general. Order matters.
- Associative: $(A \times B) \times C = A \times (B \times C)$. You can regroup, just don't reorder.
- Distributive: $A \times (B + C) = A \times B + A \times C$.
- Identity matrix: $A \times I = A$. The identity matrix $I$ has 1s on the diagonal and 0s everywhere else. It's the "multiply by 1" of matrices.
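These properties are easy to check numerically. A quick sketch with small $2 \times 2$ matrices (values chosen for illustration):

```python
def matmul(A, B):
    # Plain triple-loop multiply for small lists-of-lists matrices.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]  # swaps columns when applied on the right
I = [[1, 0], [0, 1]]  # 2x2 identity

print(matmul(A, B))  # [[2, 1], [4, 3]]
print(matmul(B, A))  # [[3, 4], [1, 2]] -- different: not commutative
print(matmul(A, I))  # [[1, 2], [3, 4]] -- A unchanged: identity
```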
Why This Matters
Matrix multiplication is the single most important operation in:
- AI/ML: Every neural network layer is a matrix multiply. Training a model like GPT-4 means performing these by the billions.
- Computer graphics: Every 3D rotation, scaling, and projection is a $4 \times 4$ matrix multiply.
- Physics simulations: Solving systems of linear equations boils down to matrix operations.
- Signal processing: Fourier transforms can be expressed as matrix multiplications.
If you understand this one operation deeply, you understand the computational core of modern technology.