Symmetric Matrices
A symmetric matrix is a square matrix \(A\) such that
\[A^T= A\]
For example, in the context of least-squares solutions, \(\hat{\beta}\) satisfies the normal equations \(X^TX\hat{\beta} = X^TY\).
In this expression, \(X^TX\) is always symmetric. To confirm, let \(X\) be an \(m \times n\) matrix. Then
\[(X^TX)^T = X^T(X^T)^T = X^TX.\]
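As a quick numerical check, here is a minimal NumPy sketch confirming that \(X^TX\) equals its own transpose (the matrix \(X\) below is an arbitrary example, not taken from the text):

```python
import numpy as np

# Arbitrary example matrix (any m x n matrix works).
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

G = X.T @ X                  # Gram matrix X^T X
print(np.allclose(G, G.T))   # True: X^T X is symmetric
```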
An \(n \times n\) matrix \(A\) is said to be orthogonally diagonalizable if there exists
an orthogonal matrix \(P\) and a diagonal matrix \(D\) such that:
\[
A = PDP^T = PDP^{-1}
\]
For this diagonalization, \(A\) must have \(n\) linearly independent and orthonormal eigenvectors.
Importantly, in such cases, \(A\) is symmetric because
\[
\begin{align*}
A^T = (PDP^T)^T &= (P^T)^T D^T P^T \\\\
&= PDP^T \qquad \text{(since } (P^T)^T = P \text{ and } D^T = D\text{)} \\\\
&= A.
\end{align*}
\]
The converse also holds: if \(A\) is symmetric, it is guaranteed to be orthogonally diagonalizable.
Theorem 1:
An \(n \times n\) matrix \(A\) is orthogonally diagonalizable if and only if \(A\) is
a symmetric matrix.
On the Eigenvalues & Eigenvectors page, we considered the example:
\[
\begin{align*}
A &= \begin{bmatrix} 4 & 1 & 1\\ 1 & 4 & 1 \\ 1 & 1 & 4 \\ \end{bmatrix} \\\\
&= PDP^{-1} \\\\
&= \begin{bmatrix} -1 & -1 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \\ \end{bmatrix}
\begin{bmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 6 \\ \end{bmatrix}
\begin{bmatrix} \frac{-1}{3} & \frac{-1}{3} & \frac{2}{3} \\
\frac{-1}{3} & \frac{2}{3} & \frac{-1}{3} \\
\frac{1}{3} & \frac{1}{3} & \frac{1}{3} \end{bmatrix}.
\end{align*}
\]
Since \(A\) is symmetric, it must be orthogonally diagonalizable. To make \(P\) orthogonal, we need
orthonormal eigenvectors, which we can obtain by applying the Gram-Schmidt algorithm to the columns of \(P\).
Only the pair \(v_1, v_2\) needs adjustment, since \(v_1 \cdot v_2 \neq 0\) while \(v_3\) is already orthogonal to both:
\[
\begin{align*}
v'_2 &= v_2 - \frac{v_2 \cdot v_1}{v_1 \cdot v_1}v_1 \\\\
&= \begin{bmatrix} -1\\ 1\\ 0\\ \end{bmatrix} - \frac{1}{2}\begin{bmatrix} -1\\ 0\\ 1\\ \end{bmatrix} \\\\
&= \begin{bmatrix} \frac{-1}{2}\\ 1 \\ \frac{-1}{2} \end{bmatrix}.
\end{align*}
\]
After normalization, the orthonormal eigenvectors are:
\[
u_1 = \begin{bmatrix} \frac{-1}{\sqrt{2}} \\ 0\\ \frac{1}{\sqrt{2}} \\ \end{bmatrix}
,\,
u_2 = \begin{bmatrix} \frac{-1}{\sqrt{6}}\\ \frac{2}{\sqrt{6}}\\ \frac{-1}{\sqrt{6}}\\ \end{bmatrix},
\, \text{and }
u_3 = \begin{bmatrix} \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{3}} \\ \end{bmatrix}
\]
Therefore,
\[
\begin{align*}
A &= PDP^T \\\\
&= \begin{bmatrix} \frac{-1}{\sqrt{2}} & \frac{-1}{\sqrt{6}} & \frac{1}{\sqrt{3}} \\
0 & \frac{2}{\sqrt{6}} & \frac{1}{\sqrt{3}}\\
\frac{1}{\sqrt{2}} & \frac{-1}{\sqrt{6}} & \frac{1}{\sqrt{3}} \end{bmatrix}
\begin{bmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 6 \\ \end{bmatrix}
\begin{bmatrix} \frac{-1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\
\frac{-1}{\sqrt{6}} & \frac{2}{\sqrt{6}} & \frac{-1}{\sqrt{6}} \\
\frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} \end{bmatrix} \\\\
&= PDP^{-1}
\end{align*}
\]
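A short NumPy sketch can confirm this decomposition numerically; `np.linalg.eigh` is specialized for symmetric matrices and returns orthonormal eigenvectors, so its columns may differ from the \(P\) above by sign or ordering:

```python
import numpy as np

A = np.array([[4.0, 1.0, 1.0],
              [1.0, 4.0, 1.0],
              [1.0, 1.0, 4.0]])

# eigh returns eigenvalues in ascending order with orthonormal eigenvectors.
eigvals, P = np.linalg.eigh(A)
D = np.diag(eigvals)

print(eigvals)                          # [3. 3. 6.]
print(np.allclose(P @ D @ P.T, A))      # True: A = P D P^T
print(np.allclose(P.T @ P, np.eye(3)))  # True: P is orthogonal
```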
Quadratic Forms
A quadratic form on \(\mathbb{R}^n\) is a function
\[
Q(x) = x^TAx \, \in \mathbb{R}
\]
where \(A\) is an \(n\times n\) symmetric matrix.
Let's use our symmetric matrix again:
\[
A = \begin{bmatrix} 4 & 1 & 1\\ 1 & 4 & 1 \\ 1 & 1 & 4 \\ \end{bmatrix}
\]
Then:
\[
\begin{align*}
Q(x) &= x^TAx \\\\
&= 4x_1^2 + 4x_2^2 + 4x_3^2 + 2x_1x_2 + 2x_2x_3 + 2x_3x_1
\end{align*}
\]
We can transform this quadratic form into one with no cross-product terms.
Let \(x = Py\) be a change of variable where \(P\) is an invertible matrix and \(y\) is a
new variable in \(\mathbb{R}^n\). Since \(A\) is symmetric, by Theorem 1, \(A\) is
orthogonally diagonalizable. Thus
\[
\begin{align*}
x^TAx &= (Py)^TA(Py) \\\\
&= y^TP^TAPy \\\\
&= y^T(P^TAP)y \\\\
&= y^T(P^TPDP^TP)y \\\\
&= y^TDy
\end{align*}
\]
where \(y = P^{-1}x = P^Tx\).
Since \(P^T A P = D\), where \(D\) is a diagonal matrix, we get:
\[
\begin{align*}
x^T A x &= y^T D y \\\\
&= \lambda_1 y_1^2 + \lambda_2 y_2^2 + \cdots + \lambda_n y_n^2
\end{align*}
\]
where \(\lambda_1, \lambda_2, \ldots, \lambda_n\) are the eigenvalues of \(A\).
For our example, \(Q(x)\) becomes:
\[
3y_1^2 + 3y_2^2 + 6 y_3^2
\]
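The following sketch checks this numerically for a random vector \(x\), using the orthogonal \(P\) from the eigendecomposition above (so \(y = P^Tx\)):

```python
import numpy as np

A = np.array([[4.0, 1.0, 1.0],
              [1.0, 4.0, 1.0],
              [1.0, 1.0, 4.0]])
eigvals, P = np.linalg.eigh(A)   # orthonormal eigenvectors, eigenvalues [3, 3, 6]

x = np.random.default_rng(0).standard_normal(3)
y = P.T @ x                      # change of variable y = P^T x

q_x = x @ A @ x                  # original quadratic form x^T A x
q_y = eigvals @ y**2             # lambda_1 y_1^2 + lambda_2 y_2^2 + lambda_3 y_3^2
print(np.allclose(q_x, q_y))     # True: no cross-product terms needed in y
```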
Thus, the eigenvalues of \(A\) are directly connected to its quadratic form. First,
we define the classification of quadratic forms.
A quadratic form \(Q(x)\) is classified as:
- Positive definite:
\(\qquad \qquad A \succ 0\) if \(\, \forall x \neq 0, \, Q(x) > 0\).
- Positive semi-definite:
\(\qquad \qquad A \succeq 0\) if \(\, \forall x, \, Q(x) \geq 0\).
- Negative definite:
\(\qquad \qquad A \prec 0\) if \(\, \forall x \neq 0, \, Q(x) < 0\).
- Negative semi-definite:
\(\qquad \qquad A \preceq 0\) if \(\, \forall x, \, Q(x) \leq 0\).
- Indefinite:
\(\qquad \qquad\) otherwise.
Theorem 2:
Let \(A\) be an \(n \times n\) symmetric matrix. Then a quadratic form \(x^TAx\) is
- positive definite iff the eigenvalues of \(A\) are all positive.
- negative definite iff the eigenvalues of \(A\) are all negative.
As mentioned above, an orthogonal change of variable \(x = Py\) transforms \(x^TAx\) into
\[
Q(x) = y^T D y = \lambda_1 y_1^2 + \cdots + \lambda_n y_n^2.
\]
where \(\lambda_1, \ldots, \lambda_n\) are the eigenvalues of \(A\).
Because \(P\) is invertible, there is a one-to-one correspondence between nonzero \(x\) and nonzero \(y\).
So the sign behavior of \(Q(x)\) is determined by the signs of the eigenvalues (the spectrum) of \(A\).
The columns of \(P\) are called the principal axes of the quadratic form \(x^TAx\).
Note: To check whether a matrix \(A\) is positive definite, an effective test is to verify the existence of a
Cholesky factorization \(A = R^TR\), where \(R\) is an upper triangular matrix with positive diagonal entries;
such a factorization exists if and only if \(A\) is symmetric positive definite. This factorization can be viewed
as a modified version of the LU factorization.
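A practical sketch of this test in NumPy: `np.linalg.cholesky` raises `LinAlgError` when the matrix is not positive definite (NumPy returns the lower triangular factor \(L = R^T\)):

```python
import numpy as np

def is_positive_definite(A: np.ndarray) -> bool:
    """Check positive definiteness by attempting a Cholesky factorization."""
    try:
        np.linalg.cholesky(A)   # succeeds iff A is symmetric positive definite
        return True
    except np.linalg.LinAlgError:
        return False

A = np.array([[4.0, 1.0, 1.0],
              [1.0, 4.0, 1.0],
              [1.0, 1.0, 4.0]])
print(is_positive_definite(A))    # True  (eigenvalues 3, 3, 6 are all positive)
print(is_positive_definite(-A))   # False (negative definite)
```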
Singular Value Decomposition (SVD)
Let \(A\) be an \(m \times n\) matrix. The symmetric matrix \(A^TA\) is orthogonally diagonalizable.
Let \(\{v_1, \cdots, v_n\}\) be an orthonormal basis for \(\mathbb{R}^n\) consisting of eigenvectors
of \(A^TA\) with corresponding eigenvalues \(\lambda_1, \cdots, \lambda_n\).
For \(1 \leq i \leq n\), the following holds:
\[
\begin{align*}
\| Av_i \|^2 &= (Av_i)^TAv_i = v_i^T(A^TAv_i) \\\\
&= v_i^T(\lambda_iv_i) \\\\
&= \lambda_i v_i^Tv_i = \lambda_i \quad (\text{since } v_i \text{ is a unit vector}).
\end{align*}
\]
Since \(\| Av_i \|^2 \geq 0\), we conclude that \(\lambda_i \geq 0 \).
We assume the eigenvalues are arranged in descending order:
\[
\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n
\]
For \(1 \leq i \leq n\), the singular values of \(A\) are defined as
\[
\sigma_i = \sqrt{\lambda_i} = \| Av_i \| \geq 0.
\]
If \(A\) has \(r\) nonzero singular values, then \(\{Av_1, \cdots, Av_r\}\) forms an orthogonal basis
for \(\text{Col }A\) and
\[\text{rank } A = \dim \text{ Col }A = r.
\]
Theorem 3: Singular Value Decomposition
Let \(A\) be an \(m \times n\) matrix with rank \(r\). Then there exist an \(m \times n\)
matrix \(\Sigma = \begin{bmatrix} D & 0 \\ 0 & 0 \\ \end{bmatrix}\), where \(D\) is an \(r \times r\)
diagonal matrix whose diagonal entries are the \(r\) nonzero singular values of \(A\), \(\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > 0\),
an \(m \times m\) orthogonal matrix \(U\), and an \(n \times n\) orthogonal matrix \(V\) such that
\[A = U\Sigma V^T\]
Proof:
Normalizing the orthogonal basis \(\{Av_1, \cdots, Av_r\}\) for \(\text{Col }A\), we obtain an orthonormal basis
\(\{u_1, \cdots, u_r\}\), where
\[
u_i = \frac{1}{\| Av_i \|}Av_i = \frac{1}{\sigma_i}Av_i \quad 1 \leq i \leq r
\]
or,
\[
Av_i = \sigma_i u_i \tag{1}
\]
We extend \(\{u_1, \cdots, u_r\}\) to an orthonormal basis \(\{u_1, \cdots, u_r, u_{r+1}, \cdots, u_m\}\) for \(\mathbb{R}^m\).
Let \(U = \begin{bmatrix}u_1 & \cdots & u_m \end{bmatrix}\) and \(V = \begin{bmatrix}v_1 & \cdots & v_n \end{bmatrix}\).
Clearly, by construction, \(U\) and \(V\) are orthogonal matrices. Also, by Equation (1),
\[AV = \begin{bmatrix} \sigma_1u_1 & \cdots & \sigma_ru_r & 0 & \cdots & 0 \end{bmatrix}.\]
Let \(D\) be a diagonal matrix with diagonal entries \(\sigma_1, \cdots, \sigma_r\) and
let \(\Sigma =\begin{bmatrix} D & 0 \\ 0 & 0 \\ \end{bmatrix}\). Then
\[
\begin{align*}
U\Sigma &= \begin{bmatrix}\sigma_1u_1 & \cdots & \sigma_ru_r & 0 & \cdots & 0\end{bmatrix} \\\\
&= AV
\end{align*}
\]
Finally, since \(V\) is an orthogonal matrix,
\[(U\Sigma ) V^T = (AV)V^T = A(VV^T) = AI = A.\]
Steps for SVD:
- Compute the orthogonal diagonalization of \(A^TA\):
Compute the eigenvalues and eigenvectors of \(A^TA\).
Note: In practice, avoid directly computing \(A^TA\) to prevent numerical errors.
- Construct \(V\) and \(\Sigma\):
Arrange the eigenvalues in descending order; their square roots are the singular values, which form the diagonal of \(\Sigma\).
Use the corresponding unit eigenvectors as the columns of \(V\).
- Construct \(U\):
For nonzero singular values, compute \(u_k = \frac{1}{\sigma_k}Av_k\).
If there are not enough nonzero singular values, extend the orthonormal basis.
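A minimal sketch of these steps in NumPy, compared against the library routine `np.linalg.svd` (the matrix \(A\) is an arbitrary example, and forming \(A^TA\) explicitly is done here only for illustration, as cautioned above):

```python
import numpy as np

A = np.array([[4.0, 11.0, 14.0],
              [8.0,  7.0, -2.0]])

# Steps 1-2: eigendecomposition of A^T A, sorted in descending order.
eigvals, V = np.linalg.eigh(A.T @ A)
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]
sigma = np.sqrt(np.clip(eigvals, 0.0, None))   # singular values

# Step 3: u_k = (1/sigma_k) A v_k for the nonzero singular values.
r = np.sum(sigma > 1e-12)
U = (A @ V[:, :r]) / sigma[:r]

Sigma = np.zeros_like(A)
Sigma[:r, :r] = np.diag(sigma[:r])
print(np.allclose(U @ Sigma[:r, :] @ V.T, A))  # True: A = U Sigma V^T

# Library routine for comparison (columns may differ by sign).
U_lib, s_lib, Vt_lib = np.linalg.svd(A)
print(np.allclose(s_lib, sigma[:r]))           # True: same singular values
```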
The singular values give us a powerful tool to diagnose the stability of a linear system. The condition number of a matrix \(A\), denoted \(\kappa(A)\), measures how sensitive the output \(Ax\) is to small changes (like floating-point errors) in the input \(x\).
For a general \(m \times n\) matrix \(A\) with rank \(r\), the condition number is defined as the ratio of the largest to the smallest nonzero singular value:
\[
\kappa(A) = \frac{\sigma_1}{\sigma_r}
\]
If \(A\) is a square, invertible matrix, this simplifies to \(\kappa(A) = \sigma_1 / \sigma_n\).
A matrix with a very large condition number (\(\kappa(A) \gg 1\)) is called ill-conditioned. This means the matrix is "almost singular." From a CS and engineering perspective, this is a critical numerical red flag. Solving \(Ax=b\) for an ill-conditioned \(A\) can lead to highly inaccurate results, as small rounding errors in \(b\) or \(A\) get magnified into large errors in the solution \(x\).
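A quick sketch of computing the condition number from the singular values; for the 2-norm this agrees with `np.linalg.cond` (the nearly singular matrix below is an arbitrary example):

```python
import numpy as np

# A nearly singular (ill-conditioned) example matrix.
A = np.array([[1.0, 1.0],
              [1.0, 1.0 + 1e-10]])

s = np.linalg.svd(A, compute_uv=False)          # singular values, descending
kappa = s[0] / s[-1]                            # sigma_1 / sigma_n
print(kappa)                                    # ~4e10: ill-conditioned
print(np.isclose(kappa, np.linalg.cond(A, 2)))  # True
```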
Pseudo-inverse via SVD
The Moore-Penrose Pseudo-inverse (\(A^\dagger\)) is most stably defined and computed using the Singular Value Decomposition (SVD).
Given the SVD of \(A = U\Sigma V^T\), where \(U\) is \(m \times m\), \(\Sigma\) is \(m \times n\), and \(V\) is \(n \times n\), the pseudo-inverse is:
\[
A^\dagger = V\Sigma^\dagger U^T
\]
Here, \(\Sigma^\dagger\) is an \(n \times m\) matrix formed by:
- Transposing \(\Sigma\) to get \(\Sigma^T\) (which is \(n \times m\)).
- Inverting every nonzero singular value \(\sigma_i\) on the diagonal to get \(1/\sigma_i\).
If \(\Sigma\) has diagonal entries \(\sigma_1, \dots, \sigma_r > 0\), then \(\Sigma^\dagger\) will have diagonal entries \(1/\sigma_1, \dots, 1/\sigma_r\). All other entries are zero.
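Here is a minimal sketch constructing \(A^\dagger\) from the SVD and comparing it with `np.linalg.pinv` (the example matrix is arbitrary):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 1.0]])   # m = 3, n = 2, rank 2

U, s, Vt = np.linalg.svd(A)            # U: 3x3, s: (2,), Vt: 2x2

# Build Sigma^dagger (n x m): invert nonzero singular values, zero elsewhere.
tol = 1e-12
Sigma_pinv = np.zeros((A.shape[1], A.shape[0]))
Sigma_pinv[:len(s), :len(s)] = np.diag([1/x if x > tol else 0.0 for x in s])

A_pinv = Vt.T @ Sigma_pinv @ U.T                # A^dagger = V Sigma^dagger U^T
print(np.allclose(A_pinv, np.linalg.pinv(A)))   # True
```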
SVD, Pseudo-inverse, and Fundamental Subspaces
This definition has a beautiful geometric interpretation. Recall that \(\hat{x} = A^\dagger b\) is the minimum-norm least-squares solution.
- A least-squares solution \(\hat{x}\) must map to the projection of \(b\) onto \(\text{Col}(A)\).
- The minimum-norm requirement forces the solution \(\hat{x}\) to lie entirely within the Row Space of \(A\), i.e., \(\hat{x} \in \text{Row}(A)\). (Any component in \(\text{Nul}(A)\) would be orthogonal to \(\text{Row}(A)\), adding to the norm of \(\hat{x}\) without changing the result \(A\hat{x}\).)
The SVD beautifully handles this. It decomposes \(A\) into its fundamental subspaces. The pseudo-inverse essentially "inverts" the mapping from \(\text{Row}(A)\) to \(\text{Col}(A)\) and ignores the \(\text{Nul}(A)\).
You can also see that \(A^\dagger A = V\Sigma^\dagger U^T U\Sigma V^T = V\Sigma^\dagger\Sigma V^T\). The matrix \(\Sigma^\dagger\Sigma\) is an \(n \times n\) diagonal matrix with 1s for the first \(r\) entries and 0s after. This makes \(A^\dagger A = V \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} V^T\), which is the orthogonal projection matrix onto the Row Space of \(A\). Similarly, \(A A^\dagger\) is the orthogonal projection matrix onto the Column Space of \(A\).
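We can verify numerically that \(A^\dagger A\) and \(AA^\dagger\) behave as orthogonal projections, i.e., they are symmetric and idempotent (same arbitrary example matrix as above):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 1.0]])
A_pinv = np.linalg.pinv(A)

P_row = A_pinv @ A      # n x n: orthogonal projection onto Row(A)
P_col = A @ A_pinv      # m x m: orthogonal projection onto Col(A)

for P in (P_row, P_col):
    # An orthogonal projection matrix is symmetric and idempotent.
    print(np.allclose(P, P.T), np.allclose(P @ P, P))   # True True
```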
Practical Note: Truncation and Regularization
In real-world applications (CS/ML), we rarely have singular values that are exactly zero. Instead, we have a spectrum of values, some of which are very small.
\[ \sigma_1 \geq \sigma_2 \geq \dots \gg \dots \geq \sigma_n \approx 0 \]
Inverting these tiny singular values (\(1/\sigma_i\)) would introduce massive instability, as we'd be amplifying components related to noise.
The practical computation of the pseudo-inverse involves setting a tolerance \(\epsilon\). We treat any \(\sigma_i < \epsilon\) as zero.
\[
(\sigma_i^\dagger) = \begin{cases} 1/\sigma_i & \text{if } \sigma_i \geq \epsilon \\ 0 & \text{if } \sigma_i < \epsilon \end{cases}
\]
This truncated SVD is the basis for Principal Component Regression (PCR) and is a powerful form of regularization. It explicitly discards the "directions" (singular vectors) that contribute least to the transformation and are most likely to be noise, building a more robust, lower-rank model.
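A sketch of a truncated pseudo-inverse with an explicit tolerance \(\epsilon\), similar in spirit to the `rcond` cutoff used by `np.linalg.pinv` (the nearly rank-deficient matrix below is an arbitrary example):

```python
import numpy as np

def truncated_pinv(A: np.ndarray, eps: float) -> np.ndarray:
    """Pseudo-inverse that treats singular values below eps as exactly zero."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_inv = np.zeros_like(s)
    mask = s >= eps
    s_inv[mask] = 1.0 / s[mask]            # invert only the retained singular values
    return Vt.T @ np.diag(s_inv) @ U.T

# A matrix with one tiny singular value (nearly rank 1).
A = np.array([[1.0, 1.0],
              [1.0, 1.0 + 1e-12]])
print(np.linalg.svd(A, compute_uv=False))   # ~[2.0, 5e-13]
print(truncated_pinv(A, eps=1e-8))          # stable: the noisy direction is discarded
print(truncated_pinv(A, eps=1e-16))         # unstable: huge entries from inverting the tiny singular value
```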
Interactive SVD Visualization
This interactive demo visualizes the Singular Value Decomposition process for 2×2 matrices. SVD expresses any matrix \(A\) as a product \(A = U\Sigma V^T\), which can be understood geometrically as a composition of three transformations:
- \(V^T\): Rotation/reflection using right singular vectors (orthogonal)
- \(\Sigma\): Scaling along the axes (singular values)
- \(U\): Rotation/reflection using left singular vectors (orthogonal)
Use the visualization to see how each component of SVD contributes to the overall transformation:
Try different matrices to observe how SVD works with various transformations:
- Identity: No transformation occurs
- Scaling: Only the singular values differ from 1
- Rotation: U and V are identical rotation matrices
- Symmetric: U = V (special case for symmetric matrices)
- Shear: Combination of rotation and non-uniform scaling
SVD with Fundamental Subspaces
Consider an \(m \times n\) matrix \(A = U\Sigma V^T\).
Let \(u_1, \cdots, u_m\) be the left singular vectors, \(v_1, \cdots, v_n\) be
the right singular vectors, \(\sigma_1, \cdots, \sigma_n\) be the singular values, and \(r\) be the rank of \(A\).
\(\{Av_1, \cdots, Av_r\} = \{\sigma_1u_1, \cdots, \sigma_ru_r\} \) is an orthogonal basis for \(\text{Col }A\) and then
\(\{u_1, \cdots, u_r\}\) is an orthonormal basis for \(\text{Col }A\). Also, \((\text{Col }A)^{\perp} = \text{Nul }A^T \).
Thus, \(\{u_{r+1}, \cdots, u_m\}\) is an orthonormal basis for \(\text{Nul }A^T\).
Since \(\| Av_i \| = \sigma_i = 0\) if and only if \(i > r\), the vectors \(v_{r+1}, \cdots, v_n\) lie in \(\text{Nul }A\)
and span an \((n-r)\)-dimensional subspace of it. Since \(\dim \text{Nul }A = n - \text{rank } A = n - r\),
the set \(\{v_{r+1}, \cdots, v_n\}\) is an orthonormal basis for \(\text{Nul }A\).
Finally, since \((\text{Nul }A)^{\perp} = \text{Col }A^T = \text{Row }A \), \(\{v_1, \cdots, v_r\}\) is an orthonormal basis for \(\text{Row }A\).
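As a sketch, orthonormal bases for all four fundamental subspaces can be read directly off the factors returned by `np.linalg.svd` (the example matrix is arbitrary, with rank 2):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])           # rank 2

U, s, Vt = np.linalg.svd(A)
r = np.sum(s > 1e-12 * s[0])              # numerical rank

col_A  = U[:, :r]      # orthonormal basis for Col A
nul_At = U[:, r:]      # orthonormal basis for Nul A^T = (Col A)^perp
row_A  = Vt[:r, :].T   # orthonormal basis for Row A
nul_A  = Vt[r:, :].T   # orthonormal basis for Nul A

print(r)                                   # 2
print(np.allclose(A @ nul_A, 0))           # True: columns of nul_A lie in Nul A
print(np.allclose(A.T @ nul_At, 0))        # True: columns of nul_At lie in Nul A^T
```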
Now assume \(A\) is an \(n \times n\) matrix. From the facts above, we obtain additional equivalent conditions for the Invertible Matrix Theorem (See: invertible matrix):
- \((\text{Col }A)^{\perp} = \{0\}\).
- \((\text{Nul }A)^{\perp} = \mathbb{R}^n\).
- \(\text{Row }A = \mathbb{R}^n\).
- \(A\) has \(n\) nonzero singular values.