Eigenvalues & Eigenvectors

Contents: Eigenvectors and Eigenvalues · Characteristic Equations · Diagonalization · Complex Eigenvalues and Eigenvectors

Eigenvectors and Eigenvalues

An eigenvector of an \(n \times n\) matrix \(A\) is a "nonzero" vector \(x\) such that \[Ax = \lambda x \tag{1}\] for some scalar \(\lambda\). A scalar \(\lambda\) is called an eigenvalue of \(A\) if there is a nontrivial solution \(x\) to equation (1); such an \(x\) is referred to as an eigenvector corresponding to \(\lambda\). Equation (1) can be written as: \[(A - \lambda I)x = 0 \tag{2}\] The scalar \(\lambda\) is an eigenvalue of \(A\) iff (2) has a nontrivial solution. The set of all solutions of (2) is the null space \(\text{Nul}(A - \lambda I) \subseteq \mathbb{R}^n\) and is called the eigenspace of \(A\) corresponding to the eigenvalue \(\lambda\).
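As a quick numerical illustration of equation (1) (a minimal sketch using NumPy; the matrix below is an arbitrary example, not one from this section), we can compute an eigen-decomposition and verify \(Ax = \lambda x\) directly:

```python
import numpy as np

# An arbitrary example matrix, chosen only for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# np.linalg.eig returns the eigenvalues and a matrix whose
# columns are corresponding (unit-norm) eigenvectors.
eigenvalues, eigenvectors = np.linalg.eig(A)

for k in range(len(eigenvalues)):
    lam = eigenvalues[k]
    x = eigenvectors[:, k]
    # Verify A x = lambda x up to floating-point tolerance.
    assert np.allclose(A @ x, lam * x)
    print(f"lambda = {lam:.4f}, eigenvector = {x}")
```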

Theorem 1: The eigenvalues of a triangular matrix are its main diagonal entries.
Proof: Suppose \(A \in \mathbb{R}^{3 \times 3} \) is a lower triangular matrix, and \(\lambda\) is an eigenvalue of \(A\). Then, \[A - \lambda I = \begin{bmatrix} a_{11} - \lambda & 0 & 0 \\ a_{21} & a_{22} - \lambda & 0 \\ a_{31} & a_{32} & a_{33} - \lambda \\ \end{bmatrix}. \] Since \(\lambda\) is an eigenvalue of \(A\), \((A-\lambda I)x =0\) has a nontrivial solution; in other words, the equation has a free variable. This occurs if and only if at least one of the main diagonal entries of \(A - \lambda I\) is zero, that is, if and only if \(\lambda\) equals one of the main diagonal entries of \(A\). The same reasoning applies to upper triangular matrices and higher-dimensional cases.
For example, \[A = \begin{bmatrix} 0 & 1 & 8 \\ 0 & 2 & 7 \\ 0 & 0 & 3 \\ \end{bmatrix}\] has eigenvalues \(0, 2, \text{and } 3\). Since \(A\) has a zero eigenvalue, equation (1) with \(\lambda = 0\) becomes the homogeneous equation \(Ax = 0\), which must have a nontrivial solution. This happens iff \(A\) is a singular matrix (NOT invertible).
We can verify this by computing the determinant (cofactor expansion along the first row): \[ \det A = 0(6-0) - 1(0-0) + 8(0-0) = 0. \] Since \(\det A = 0\), \(A\) is indeed not invertible.
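Both claims are easy to confirm numerically. A small sketch (note that np.linalg.eigvals may return the eigenvalues in a different order, so we sort them):

```python
import numpy as np

# The triangular matrix from the example above.
A = np.array([[0.0, 1.0, 8.0],
              [0.0, 2.0, 7.0],
              [0.0, 0.0, 3.0]])

# The eigenvalues of a triangular matrix are its diagonal entries.
print(np.sort(np.linalg.eigvals(A)))   # [0. 2. 3.]

# A zero eigenvalue means A is singular: det A = 0.
print(np.linalg.det(A))                # 0.0
```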
Theorem 2: If \(v_1, \cdots, v_n\) are eigenvectors corresponding to "distinct" eigenvalues \(\lambda_1, \cdots, \lambda_n\) of a square matrix \(A\), then the set \(\{v_1, \cdots, v_n\}\) is linearly independent.
Proof: Assume the set of eigenvectors \(\{v_1, \cdots, v_n\}\) is linearly "dependent." By definition, each eigenvector \(v_i\) is nonzero. If the set is dependent, at least one eigenvector can be written as a linear combination of the preceding eigenvectors. Let \(i\) be the least index such that \(v_{i+1}\) is a linear combination of the preceding (linearly independent) eigenvectors \(v_1, \cdots, v_i\). Then there exist scalars \(c_1, \cdots, c_i\) such that \[ v_{i+1} = c_1v_1 + \cdots + c_iv_i, \quad c_1, \cdots, c_i \in \mathbb{R}. \tag{3} \] Multiplying both sides of (3) by \(A\), we get: \[ c_1Av_1 + \cdots + c_iAv_i = Av_{i+1}. \] By the definition of eigenvalues and eigenvectors, this becomes \[ c_1 \lambda_1 v_1 + \cdots + c_i \lambda_i v_i = \lambda_{i+1} v_{i+1}. \tag{4} \] Subtracting \(\lambda_{i+1}\) times (3) from (4), we get: \[ c_1(\lambda_1 - \lambda_{i+1}) v_1 + \cdots + c_i(\lambda_i - \lambda_{i+1} )v_i = 0. \tag{5} \] All coefficients in (5) must be zero since \(\{ v_1, \cdots, v_i\}\) is linearly independent. However, \((\lambda_1 - \lambda_{i+1}), \cdots, (\lambda_i - \lambda_{i+1} )\) are nonzero because the eigenvalues are distinct. Hence, \(c_1 = c_2 = \cdots = c_i = 0\). But then equation (3) gives \(v_{i+1} = 0\), contradicting the fact that eigenvectors are nonzero. Therefore, the set \(\{v_1, \cdots, v_n\}\) must be linearly independent.
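A hedged numerical check of Theorem 2 (a sketch with an arbitrary matrix that happens to have distinct eigenvalues): if we stack the computed eigenvectors as columns, the resulting matrix should have full rank.

```python
import numpy as np

# An arbitrary matrix with distinct eigenvalues (1, 2, and 3).
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 3.0]])

eigenvalues, V = np.linalg.eig(A)   # columns of V are eigenvectors
print(np.round(eigenvalues, 6))     # distinct: [1. 2. 3.]

# Full rank <=> the eigenvectors are linearly independent (Theorem 2).
print(np.linalg.matrix_rank(V))     # 3
```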

Characteristic Equations

A scalar \(\lambda\) is an eigenvalue of an \(n \times n\) matrix \(A\) iff \(\lambda\) satisfies the characteristic equation \[\det (A - \lambda I) = 0.\] For example, consider the matrix \[A = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 4 \\ 5 & 6 & 0 \\ \end{bmatrix}\] Then, expanding along the first column, \[\det(A- \lambda I) = (1-\lambda)(\lambda^2 -\lambda -24) - 0 + 5(5 +3\lambda) = 0 \] Expanding this, we obtain the characteristic polynomial: \[-\lambda^3 +2\lambda^2 + 38\lambda +1 = 0\] Solving this equation for \(\lambda\), we get the eigenvalues of \(A\): \(\lambda \approx -5.230, -0.026, 7.256\). As this example suggests, in practice we typically approximate eigenvalues using numerical methods.
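This is straightforward to reproduce numerically (a sketch using NumPy: np.roots finds the roots of a polynomial from its coefficients, and np.linalg.eigvals approximates the eigenvalues directly from the matrix):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 4.0],
              [5.0, 6.0, 0.0]])

# Coefficients of -lambda^3 + 2 lambda^2 + 38 lambda + 1,
# highest degree first.
coeffs = [-1, 2, 38, 1]
print(np.sort(np.roots(coeffs)))       # approx [-5.230 -0.026  7.256]

# The same values come directly from the matrix.
print(np.sort(np.linalg.eigvals(A)))   # approx [-5.230 -0.026  7.256]
```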

The algebraic multiplicity of an eigenvalue is its multiplicity as a root of the characteristic equation.

For example, if the characteristic polynomial of a matrix is \((\lambda -1)^2 (\lambda -2)\), then the eigenvalue 1 has algebraic multiplicity 2 and the eigenvalue 2 has algebraic multiplicity 1.

Next, we introduce the concept of similarity of matrices. Suppose \(A\) and \(B\) are \(n \times n\) matrices. Then, \(A\) is said to be similar to \(B\) if there exists an invertible matrix \(P\) such that \[ P^{-1}AP = B, \quad \text{or equivalently,} \quad A=PBP^{-1}. \]

Theorem 3: "If" \(n \times n\) matrices \(A\) and \(B\) are similar, "then" they have the same characteristic polynomial and thus the same eigenvalues with the same multiplicities.
Note: the converse of this theorem is not true. Having the same eigenvalues does not imply the matrices are similar. For example, \(\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}\) and the identity matrix \(I_2\) have the same characteristic polynomial \((\lambda - 1)^2\), but they are not similar, since \(P^{-1}I_2P = I_2\) for every invertible \(P\).
Proof: If \(B = P^{-1}AP\), then \[ \begin{align*} B - \lambda I &= P^{-1}AP - \lambda P^{-1}P \\\\ &= P^{-1}(A - \lambda I)P \end{align*} \] By the multiplicative property of determinants, we have \[ \begin{align*} \det (B-\lambda I) &= \det (P^{-1}) \det (A-\lambda I) \det (P)\\\\ &= \det (A-\lambda I). \end{align*} \] Note: \(\det (P^{-1}P) = \det (I) = 1\).
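A quick numerical sanity check of Theorem 3 (a sketch with a randomly chosen \(A\) and a random \(P\), which is invertible with probability 1):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
P = rng.standard_normal((4, 4))        # almost surely invertible

B = np.linalg.inv(P) @ A @ P           # B is similar to A

# Similar matrices share the same eigenvalues (with multiplicities).
print(np.sort(np.linalg.eigvals(A)))
print(np.sort(np.linalg.eigvals(B)))   # same values, up to rounding
```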

Diagonalization

A square matrix \(A\) is said to be diagonalizable if it is similar to a diagonal matrix \(D\), that is, if \[A = PDP^{-1}\] for some invertible matrix \(P\).

Theorem 4 (Diagonalization): An \(n \times n\) matrix \(A\) is diagonalizable iff \(A\) has \(n\) linearly independent eigenvectors. In that case, the columns of \(P\) are \(n\) linearly independent eigenvectors of \(A\), and the diagonal entries of \(D\) are the eigenvalues of \(A\) corresponding to the eigenvectors in \(P\). Note: \(P\) and \(D\) are not unique because the order of the diagonal entries in \(D\) can be changed.
Proof: Let \(P\) be a square matrix with columns \(v_1, \cdots, v_n\) and \(D\) be a diagonal matrix with diagonal entries \(\lambda_1, \cdots, \lambda_n \). Then, \[AP = \begin{bmatrix}Av_1 & \cdots & Av_n \\\end{bmatrix}\] \[PD = \begin{bmatrix}\lambda_1 v_1 & \cdots & \lambda_n v_n \\\end{bmatrix}\] Assume \(A\) is diagonalizable, so \(A = PDP^{-1}\). Multiplying both sides of this equation on the right by \(P\), we get: \[AP = PD.\] Thus, equating the columns of \(AP\) and \(PD\), we have \[ Av_1 = \lambda_1 v_1, \cdots, Av_n = \lambda_n v_n. \tag{6} \] Since \(P\) is invertible, its columns are linearly independent and must be nonzero. From equation (6), \(\lambda_1, \cdots, \lambda_n \) are eigenvalues, and \(v_1, \cdots, v_n\) are corresponding eigenvectors.

Conversely, consider any \(n\) eigenvectors \(v_1, \cdots, v_n\) of \(A\) with corresponding eigenvalues \(\lambda_1, \dots, \lambda_n\). We can construct \(P\) and \(D\) from these eigenvectors and eigenvalues. If the eigenvectors are linearly independent, then \(P\) is invertible, and we obtain \(A = P D P^{-1}\).
For example, given \[A = \begin{bmatrix} 4 & 1 & 1\\ 1 & 4 & 1 \\ 1 & 1 & 4 \\ \end{bmatrix},\] computing the characteristic equation: \[\det(A- \lambda I) = (4-\lambda)((4-\lambda)^2 - 1) -((4-\lambda )-1)+(1-(4 -\lambda)) = 0 \] Simplifying this, we get: \[-\lambda^3 +12\lambda^2 -45\lambda +54 = 0,\] which, after multiplying through by \(-1\), factors as \[(\lambda - 3)^2(\lambda -6) = 0\] Thus, the eigenvalues are \(\lambda_1 = 3\), \(\lambda_2 = 3\), and \(\lambda_3 = 6\). Next, we need to find three linearly independent eigenvectors corresponding to these eigenvalues.
For \(\lambda_1 = 3\) and \(\lambda_2 = 3\): \[A-3I = \begin{bmatrix} 1 & 1 & 1\\ 1 & 1 & 1 \\ 1 & 1 & 1 \\ \end{bmatrix} \xrightarrow{\text{rref}} \begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\\end{bmatrix} \] We choose the eigenvectors \(v_1 = \begin{bmatrix} -1 \\ 0 \\ 1 \\ \end{bmatrix}\) and \(v_2 = \begin{bmatrix} -1 \\ 1 \\ 0 \\ \end{bmatrix}\).
For \(\lambda_3 = 6\): \[ A-6I = \begin{bmatrix} -2 & 1 & 1\\ 1 & -2 & 1 \\ 1 & 1 & -2 \\ \end{bmatrix} \xrightarrow{\text{rref}} \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \\\end{bmatrix} \] We choose the eigenvector \(v_3 = \begin{bmatrix} 1 \\ 1 \\ 1 \\ \end{bmatrix}\).
Therefore, \[ \begin{align*} &D = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 6 \\ \end{bmatrix}, \\\\ &P = \begin{bmatrix} -1 & -1 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \\ \end{bmatrix}, \\\\ &P^{-1} = \begin{bmatrix} \frac{-1}{3} & \frac{-1}{3} & \frac{2}{3} \\ \frac{-1}{3} & \frac{2}{3} & \frac{-1}{3}\\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \\ \end{bmatrix}. \end{align*} \]
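We can verify this factorization numerically (a minimal check of the worked example above):

```python
import numpy as np

A = np.array([[4.0, 1.0, 1.0],
              [1.0, 4.0, 1.0],
              [1.0, 1.0, 4.0]])
P = np.array([[-1.0, -1.0, 1.0],
              [ 0.0,  1.0, 1.0],
              [ 1.0,  0.0, 1.0]])
D = np.diag([3.0, 3.0, 6.0])

# Check A = P D P^{-1} up to floating-point tolerance.
print(np.allclose(P @ D @ np.linalg.inv(P), A))   # True
```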

As this example shows, it is "not necessary" that an \(n \times n\) matrix have \(n\) "distinct" eigenvalues in order to be diagonalizable. (Note: if the matrix has \(n\) distinct eigenvalues, it must be diagonalizable, since by Theorem 2 the corresponding eigenvectors are linearly independent.)
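Conversely, a repeated eigenvalue does not guarantee diagonalizability. For instance (a standard illustration, not drawn from the example above), \(\begin{bmatrix} 3 & 1 \\ 0 & 3 \end{bmatrix}\) has eigenvalue \(3\) with algebraic multiplicity \(2\) but only one linearly independent eigenvector, so it is not diagonalizable. A minimal NumPy sketch:

```python
import numpy as np

# A standard example of a non-diagonalizable (defective) matrix.
N = np.array([[3.0, 1.0],
              [0.0, 3.0]])

eigenvalues, V = np.linalg.eig(N)
print(eigenvalues)                 # [3. 3.], a repeated eigenvalue

# The eigenvector matrix is rank-deficient (up to floating-point
# tolerance), so N has no basis of eigenvectors.
print(np.linalg.matrix_rank(V))    # 1
```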

Complex Eigenvalues and Eigenvectors

It is possible for the characteristic equation to have complex roots. A complex scalar \(\lambda\) satisfies the characteristic equation \(\det(A -\lambda I) = 0\) iff there is a nonzero vector \(x \in \mathbb{C}^n\) such that \(Ax = \lambda x\). In this case, \(\lambda\) is called a complex eigenvalue and \(x\) is its corresponding complex eigenvector.

Rotation matrices are widely used in applications such as computer graphics to represent rotations in 2D or higher-dimensional spaces. Analyzing their eigenvalues and eigenvectors can provide insight into their geometric behavior, including scaling and rotation angles.
In general, rotation matrices are not always diagonalizable in \(\mathbb{R}^n\), but in \(\mathbb{C}^n\), real rotation matrices are always diagonalizable. (See Orthogonality and Symmetry.)

Consider the 2D rotation matrix \(R = \begin{bmatrix} a & -b \\ b & a \end{bmatrix}\), where \(a\) and \(b\) are real and not both zero. Then the "complex" eigenvalues of \(R\) can be found by solving the characteristic equation: \[ \begin{align*} \det (R - \lambda I)=0 &\Longrightarrow \lambda^2 -2a\lambda +(a^2 + b^2) = 0 \\\\ &\Longrightarrow (\lambda -(a+bi))(\lambda -(a-bi)) = 0 \end{align*} \] Thus, we get complex eigenvalues \(\lambda = a \pm bi\), and we can find complex eigenvectors: \[ Rv_1 = \begin{bmatrix} a & -b \\ b & a \\ \end{bmatrix} \begin{bmatrix} 1 \\ -i \\ \end{bmatrix} = \begin{bmatrix} a + bi \\ b - ai\\ \end{bmatrix} = (a+bi)\begin{bmatrix} 1 \\ -i \\ \end{bmatrix} \] Hence, \(v_1 = \begin{bmatrix} 1 \\ -i \\ \end{bmatrix}\) is an eigenvector corresponding to the eigenvalue \(\lambda = a + bi\). Moreover, the complex conjugate of \(\lambda\), denoted \(\bar{\lambda} = a - bi\), is an eigenvalue with corresponding eigenvector \(v_2 = \begin{bmatrix} 1 \\ i \\ \end{bmatrix}\). Therefore, we can diagonalize \(R\) in \(\mathbb{C}^2 \,\): \[ \begin{align*} R &= PDP^{-1} \\\\ &= \begin{bmatrix} 1 & 1 \\ -i & i\end{bmatrix} \begin{bmatrix} a+bi & 0 \\ 0 & a-bi \end{bmatrix} \frac{1}{2}\begin{bmatrix} 1 & i \\ 1 & -i\end{bmatrix} \end{align*} \]
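The same computation can be reproduced numerically (a sketch with arbitrary illustrative values of \(a\) and \(b\); note that NumPy may return the conjugate pair in either order):

```python
import numpy as np

a, b = 1.0, 2.0                    # arbitrary illustrative values
R = np.array([[a, -b],
              [b,  a]])

eigenvalues, V = np.linalg.eig(R)
print(eigenvalues)                 # a +/- bi, e.g. [1.+2.j 1.-2.j]

# Each column of V is a complex eigenvector: R v = lambda v.
for k in range(2):
    assert np.allclose(R @ V[:, k], eigenvalues[k] * V[:, k])
```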
Now, we can map this representation back to \(\mathbb{R}^2\) as the polar decomposition of \(R\) into its rotation angle \(\varphi\) and scaling factor \(r\).
Let \(r = \sqrt{a^2 + b^2}\) be the modulus of \(\lambda\), so \(|\lambda | = r \), and let \(\varphi = \arg \lambda\) be the angle such that \[\cos \varphi = \frac{a}{r}, \quad \sin \varphi = \frac{b}{r}.\] Then \(R\) can be written in polar form as: \[ \begin{align*} R &= r \begin{bmatrix} \frac{a}{r} & \frac{-b}{r} \\ \frac{b}{r} & \frac{a}{r} \\ \end{bmatrix} \\\\ &= \begin{bmatrix} r & 0 \\ 0 & r \\ \end{bmatrix} \begin{bmatrix} \cos \varphi & -\sin \varphi \\ \sin \varphi & \cos \varphi \\ \end{bmatrix} \end{align*} \]
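Numerically, the scaling factor and rotation angle can be read off the complex eigenvalue (a sketch continuing the same illustrative values of \(a\) and \(b\)):

```python
import numpy as np

a, b = 1.0, 2.0
lam = complex(a, b)                # the eigenvalue a + bi

r = abs(lam)                       # scaling factor |lambda| = sqrt(a^2 + b^2)
phi = np.angle(lam)                # rotation angle arg(lambda)

# Reconstruct R as (scaling) @ (rotation by phi).
scaling = r * np.eye(2)
rotation = np.array([[np.cos(phi), -np.sin(phi)],
                     [np.sin(phi),  np.cos(phi)]])
R = np.array([[a, -b], [b, a]])
print(np.allclose(scaling @ rotation, R))   # True
```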

Note: In general, a nonzero complex number \(z\) corresponds to a point \((a, b)\) in the complex plane, where \( \, a = |z| \cos \varphi , \quad b = |z| \sin \varphi, \, \) and so \[z = a + bi = |z| (\cos \varphi + i\sin \varphi ) = |z|e^{i\varphi} \] where \(\varphi = \arg z \, \) and \(|z| = \sqrt{a^2 + b^2}\), since \(z\cdot \bar{z} = a^2 + b^2 = |z|^2\).
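For completeness, a one-line check of the polar form using Python's built-in complex type (an arbitrary \(z\) chosen for illustration):

```python
import cmath

z = 3 + 4j
r, phi = abs(z), cmath.phase(z)    # |z| = 5.0, phi = arg z

# Verify z = |z| e^{i phi} up to floating-point tolerance.
print(abs(r * cmath.exp(1j * phi) - z) < 1e-12)   # True
```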