Similar to the eigendecomposition method, we can approximate our original matrix \(\mathbf{A}\) by summing the terms which have the highest singular values. So we can use the first \(k\) terms in the SVD equation, using the \(k\) highest singular values:
We know that the set \(\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_r\}\) forms a basis for \(\mathbf{A}\mathbf{x}\text{.}\) So when we pick \(k\) vectors from this set, \(\mathbf{A}_k\mathbf{x}\) is written as a linear combination of \(\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k\text{.}\) So they span \(\mathbf{A}_k\mathbf{x}\) and since they are linearly independent they form a basis for \(\mathbf{A}_k\mathbf{x}\) (or col \(\mathbf{A}_k\)). So the rank of \(\mathbf{A}_k\) is \(k\text{,}\) and by picking the first \(k\) singular values, we approximate \(\mathbf{A}\) with a rank-\(k\) matrix.
SVD is the best way to approximate \(\mathbf{A}\) with a rank-\(k\) matrix. If \(m \times n\) matrix \(\mathbf{A}_k\) is the approximated rank-\(k\) matrix by SVD, then for any \(m \times n\) rank-\(k\) matrix \(\mathbf{B}\text{:}\)