Linear Equations

System of Linear Equations The Matrix Equation Linear Independence Nonhomogeneous System vs Homogeneous System

System of Linear Equations

A linear equation can be expressed in the form \[a_1x_1 + a_2x_2 + \cdots + a_nx_n = b\] where \(x_1, \cdots, x_n \) are variables and \(b\) and the coefficients \(a_1, \cdots, a_n\) are real or complex numbers. In practice, \(n\) may be in the thousands or more.
We refer to a collection of linear equations involving the same variables as a system of linear equations. A system is called consistent if it has either exactly one solution or infinitely many solutions. If a system of linear equations has no solution, it is called inconsistent.

We want to store the essential information of a system such as the values of each coefficient, the number of equations, and the number of variables into a rectangular array called a matrix.
For example, consider the linear system: \[ \begin{align} x_1\phantom{+a_20x_2}-3x_3 + \phantom{0}x_4 &= 10 \\ 4x_1 -5x_2 + 6x_3 -2x_4 &= -8 \\ \phantom{a_3x_1+} 2x_2 + 2x_3 +5x_4 &=6 \end{align} \] This system can be represented as the coefficient matrix: \[ \begin{bmatrix} 1 & 0 & -3 & 1\\ 4 & -5 & 6 & -2\\ 0 & 2 & 2 & 5 \end{bmatrix} \] This is a \(3 \times 4\) (read "3 by 4") matrix. Note: The number of rows always comes first.
Alternatively, we can represent it as the augmented matrix: \[ \begin{bmatrix} 1 & 0 & -3 & 1 & 10\\ 4 & -5 & 6 & -2 & -8\\ 0 & 2 & 2 & 5 & 6 \end{bmatrix} \] This is a \(3 \times 5\) matrix which includes the constants from the right-hand side of the equations. Here, the number of rows is 3, and the number of columns is 5.
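These two matrices are easy to build and compare with NumPy; the following is a minimal sketch using the numbers above:

```python
import numpy as np

# Coefficient matrix: one row per equation, one column per variable
A = np.array([[1,  0, -3,  1],
              [4, -5,  6, -2],
              [0,  2,  2,  5]])

# Right-hand-side constants as a column vector
b = np.array([[10], [-8], [6]])

# The augmented matrix appends b as a fifth column
augmented = np.hstack([A, b])
print(A.shape, augmented.shape)  # (3, 4) (3, 5)
```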

How can we solve the linear system? The key idea is to replace the system with an equivalent system that is easier to solve. What does "equivalent" mean here? Two matrices are called row equivalent if there is a sequence of elementary row operations that transforms one matrix into the other; if the augmented matrices of two linear systems are row equivalent, the two systems have the same solution set. The three elementary row operations are as follows.

  1. Replacing one row by the sum of itself and a multiple of another row.
  2. Swapping two rows.
  3. Multiplying all entries in a row by a nonzero scalar.

First, we want to transform our augmented matrix into row echelon form, which satisfies the following conditions:
  1. All nonzero rows are above any rows of all zeros.
  2. The first nonzero entry in each row appears further to the right than the first nonzero entry in the row directly above it.
  3. All entries in a column below a leading entry are zero.

We apply elementary row operations to the matrix: \[ \begin{align*} &\begin{bmatrix} 1 & 0 & -3 & 1 & 10 \\ 4 & -5 & 6 & -2 & -8 \\ 0 & 2 & 2 & 5 & 6 \end{bmatrix} \\\\ &\begin{bmatrix} 1 & 0 & -3 & 1 & 10 \\ 0 & -5 & 18 & -6 & -48 \\ 0 & 2 & 2 & 5 & 6 \end{bmatrix} \quad (\text{Row2 $\rightarrow$ Row2 - 4*Row1}) \\\\ &\begin{bmatrix} 1 & 0 & -3 & 1 & 10 \\ 0 & -5 & 18 & -6 & -48 \\ 0 & 0 & \frac{46}{5} & \frac{13}{5} & \frac{-66}{5} \end{bmatrix} \quad (\text{Row3 $\rightarrow$ Row3 + (2/5)*Row2}) \end{align*} \]
The matrix is now in row echelon form. Next, we transform it into reduced row echelon form (RREF). A matrix in row echelon form is in RREF if it additionally satisfies the following conditions:
  1. The leading entry in each nonzero row is 1.
  2. Each leading 1 is the only nonzero entry in its column.
\[ \begin{align*} &\begin{bmatrix} 1 & 0 & -3 & 1 & 10\\ 0 & 1 & \frac{-18}{5} & \frac{6}{5} & \frac{48}{5}\\ 0 & 0 & \frac{46}{5} & \frac{13}{5}& \frac{-66}{5}\\ \end{bmatrix} \quad (\text{Row2 $\rightarrow$ (Row2 / -5)}) \\\\ &\begin{bmatrix} 1 & 0 & -3 & 1 & 10\\ 0 & 1 & \frac{-18}{5} & \frac{6}{5} & \frac{48}{5}\\ 0 & 0 & 1 & \frac{13}{46}& \frac{-33}{23}\\ \end{bmatrix} \quad (\text{Row3 $\rightarrow$ (Row3 * 5/46)}) \\\\ &\begin{bmatrix} 1 & 0 & 0 & \frac{85}{46} & \frac{131}{23}\\ 0 & 1 & \frac{-18}{5} & \frac{6}{5} & \frac{48}{5}\\ 0 & 0 & 1 & \frac{13}{46}& \frac{-33}{23}\\ \end{bmatrix} \quad (\text{Row1 $\rightarrow$ (Row1 + 3*Row3)}) \\\\ &\begin{bmatrix} 1 & 0 & 0 & \frac{85}{46} & \frac{131}{23}\\ 0 & 1 & 0 & \frac{51}{23} & \frac{102}{23}\\ 0 & 0 & 1 & \frac{13}{46}& \frac{-33}{23}\\ \end{bmatrix} \quad (\text{Row2 $\rightarrow$ (Row2 + 18/5*Row3)}) \end{align*} \]
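The same sequence of row operations can be replayed in exact arithmetic with Python's `fractions` module; the helper names `add_multiple` and `scale` below are my own, not standard API:

```python
from fractions import Fraction as F

# Augmented matrix of the example system, with exact rational entries
M = [[F(1), F(0),  F(-3), F(1),  F(10)],
     [F(4), F(-5), F(6),  F(-2), F(-8)],
     [F(0), F(2),  F(2),  F(5),  F(6)]]

def add_multiple(M, dst, src, c):
    """Row replacement: M[dst] <- M[dst] + c * M[src]."""
    M[dst] = [a + c * b for a, b in zip(M[dst], M[src])]

def scale(M, r, c):
    """Scaling: multiply every entry of row r by the nonzero scalar c."""
    M[r] = [c * a for a in M[r]]

# The six operations used above, in order
add_multiple(M, 1, 0, F(-4))       # Row2 -> Row2 - 4*Row1
add_multiple(M, 2, 1, F(2, 5))     # Row3 -> Row3 + (2/5)*Row2
scale(M, 1, F(-1, 5))              # Row2 -> Row2 / -5
scale(M, 2, F(5, 46))              # Row3 -> Row3 * 5/46
add_multiple(M, 0, 2, F(3))        # Row1 -> Row1 + 3*Row3
add_multiple(M, 1, 2, F(18, 5))    # Row2 -> Row2 + 18/5*Row3

print(M)  # the RREF, with 85/46, 51/23, 13/46 in the fourth column
```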
Finally, we get the solution of the linear system as follows. \[ \begin{align} x_1 &= -\frac{85}{46}x_4 + \frac{131}{23}\\\\ x_2 &= -\frac{51}{23}x_4 + \frac{102}{23}\\\\ x_3 &= -\frac{13}{46}x_4 - \frac{33}{23}\\\\ x_4 & \text{ is a free variable.} \end{align} \]
In this system, any solution is determined by a choice of the free variable \(x_4\).

The Matrix Equation

To get a new perspective on the system of linear equations, we would like to describe it using the concept of vectors.
Given vectors \(v_1, v_2,\cdots, v_p \in \mathbb{R}^n \) and scalars \(c_1, c_2,\cdots, c_p\), the vector \(y\) defined by \[y = c_1v_1 + c_2v_2 + \cdots + c_pv_p\] is called a linear combination of \(v_1, v_2,\cdots, v_p\) with weights \(c_1, c_2,\cdots, c_p\).

Let's reconsider the augmented matrix \[\begin{bmatrix} 1 & 0 & -3 & 1 & 10\\ 4 & -5 & 6 & -2 & -8\\ 0 & 2 & 2 & 5 & 6 \end{bmatrix}.\] Define its column vectors: \(a_1 = \begin{bmatrix} 1 \\ 4 \\ 0 \end{bmatrix}, a_2 = \begin{bmatrix} 0 \\ -5 \\ 2 \end{bmatrix}, a_3 = \begin{bmatrix} -3 \\ 6 \\ 2 \end{bmatrix}, a_4 = \begin{bmatrix} 1 \\ -2 \\ 5 \end{bmatrix}\), and \(b =\begin{bmatrix} 10 \\ -8 \\ 6 \end{bmatrix} \in \mathbb{R}^3\).

Then \(b\) is generated ("spanned") by a linear combination of \(a_1, \cdots, a_4\) because there exist weights \(x_1, \cdots, x_4\) such that \[x_1a_1 + x_2a_2 + x_3a_3 + x_4a_4 = b.\] These weights are exactly the solutions we found earlier using the augmented matrix. In other words, the vector \(b\) lies in the span of \(\{a_1, a_2, a_3, a_4\}\), which is a subset of \( \mathbb{R}^3\): \[b \in \text{Span } \{a_1, a_2, a_3, a_4\}\]
Here, we reinterpret the linear combination of vectors as the product of a matrix \(A \in \mathbb{R}^{3\times4}\) and a vector \(x\in\mathbb{R}^4\). \[ Ax = \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = x_1a_1 + x_2a_2 + x_3a_3 + x_4a_4 = b \] Hence, the matrix equation \(Ax = b\) has a solution if and only if \(b\) is a linear combination of the columns of \(A\).
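A quick numerical check that \(Ax\) really equals the column combination \(x_1a_1 + \cdots + x_4a_4\); this sketch uses the particular solution with \(x_4 = 0\) found earlier:

```python
import numpy as np

A = np.array([[1., 0., -3., 1.],
              [4., -5., 6., -2.],
              [0., 2., 2., 5.]])
b = np.array([10., -8., 6.])

# One solution of the system (x4 = 0), read off from the RREF
x = np.array([131/23, 102/23, -33/23, 0.])

# Matrix-vector product vs. the explicit linear combination of columns
combo = sum(x[j] * A[:, j] for j in range(4))
print(np.allclose(A @ x, combo), np.allclose(A @ x, b))  # True True
```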

Linear Independence

A set of vectors \(\{v_1, v_2, \cdots, v_p\}\) in \( \mathbb{R}^n \) is called linearly independent if \[x_1v_1 + x_2v_2 + \cdots + x_pv_p = 0\] has "only" the trivial solution \(x_1 = x_2 = \cdots = x_p = 0\).

Consider a matrix \(A\) with columns \(\{v_1, v_2, \cdots, v_p\}\). The matrix equation \(Ax =0\) is called the homogeneous equation, and it always has at least one solution: the trivial solution \(x=0\). By definition, the columns of \(A\) are linearly independent if and only if \(Ax =0\) has "only" the trivial solution.

On the other hand, the equation \(Ax =0\) has a nontrivial solution if and only if the columns of \(A\) are linearly dependent. When are the columns of \(A\) linearly dependent? Remember, if a linear system has a free variable, basic variables "depend" on the choice of the free variable. Thus, the homogeneous equation \(Ax =0\) has a nontrivial solution if and only if the equation has at least one free variable, which means the columns of \(A\) are linearly dependent.
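One way to detect this numerically: if the rank of \(A\) (the number of pivot columns) is smaller than the total number of columns, then \(Ax = 0\) has a free variable and the columns are dependent. A sketch using `numpy.linalg.matrix_rank`:

```python
import numpy as np

A = np.array([[1., 0., -3., 1.],
              [4., -5., 6., -2.],
              [0., 2., 2., 5.]])

rank = np.linalg.matrix_rank(A)
n_cols = A.shape[1]

# 3 pivots for 4 columns: one free variable, so the columns are dependent
print(rank, n_cols, rank < n_cols)  # 3 4 True
```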

Warning:
Consider the vectors \[ u = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \qquad v = \begin{bmatrix} 2 \\ 4 \\ 6 \end{bmatrix} \qquad w = \begin{bmatrix} 7 \\ -9 \\ 7 \end{bmatrix} \] The set \(\{u, v, w\}\) is a linearly "dependent" set because \(v = 2u\), but the vector \(w\) is NOT a linear combination of \(u\) and \(v\): every linear combination of \(u\) and \(v\) is a multiple of \(u\), and \(w\) is not. In other words, \(w \notin \text{Span}\{u, v\}\). Even in a linearly dependent set, some vectors might not be expressible as a combination of the others.
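The same rank test exposes both facts at once: the set is dependent, yet appending \(w\) raises the rank, so the system \(c_1u + c_2v = w\) is inconsistent and \(w \notin \text{Span}\{u, v\}\). A sketch:

```python
import numpy as np

u = np.array([1., 2., 3.])
v = np.array([2., 4., 6.])
w = np.array([7., -9., 7.])

UV  = np.column_stack([u, v])      # rank 1, since v = 2u
UVW = np.column_stack([u, v, w])   # rank 2 < 3 columns: {u, v, w} dependent

# The rank grows when w is appended, so w is not in Span{u, v}
print(np.linalg.matrix_rank(UV), np.linalg.matrix_rank(UVW))  # 1 2
```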

Nonhomogeneous System vs Homogeneous System

You might wonder about the relationship between a homogeneous equation \(Ax = 0\) and a nonhomogeneous equation \(Ax = b\), where \(b\) is a nonzero vector. Indeed, there is an important connection between them. Let's revisit the solution of our nonhomogeneous system: \[\begin{align} x_1 &= -\frac{85}{46}x_4 + \frac{131}{23}\\\\ x_2 &= -\frac{51}{23}x_4 + \frac{102}{23}\\\\ x_3 &= -\frac{13}{46}x_4 - \frac{33}{23}\\\\ x_4 & \text{ is a free variable.} \end{align}\] As a "vector", the general solution of \(Ax = b\) can be written in parametric vector form as: \[ x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} \frac{131}{23}\\ \frac{102}{23}\\ - \frac{33}{23}\\ 0 \end{bmatrix} + x_4\begin{bmatrix} -\frac{85}{46}\\ -\frac{51}{23} \\ -\frac{13}{46}\\ 1 \end{bmatrix} = p + x_4v \] where \(p\) is a particular solution to \(Ax = b\) (when \(x_4 =0\)), and \(x_4v\) represents the general solution to the corresponding homogeneous system \(Ax = 0\).
Let's solve the homogeneous system \(Ax = 0\) using the same matrix \(A\). \[ \begin{bmatrix} 1 & 0 & -3 & 1 & 0\\ 4 & -5 & 6 & -2 & 0\\ 0 & 2 & 2 & 5 & 0 \end{bmatrix} \xrightarrow{\text{rref}} \begin{bmatrix} 1 & 0 & 0 & \frac{85}{46} & 0\\ 0 & 1 & 0 & \frac{51}{23} & 0\\ 0 & 0 & 1 & \frac{13}{46}& 0 \end{bmatrix} \]
Then the solution is \[ \begin{align} x_1 &= -\frac{85}{46}x_4 \\\\ x_2 &= -\frac{51}{23}x_4 \\\\ x_3 &= -\frac{13}{46}x_4 \\\\ x_4 & \text{ is a free variable.} \end{align} \] In parametric vector form, this solution becomes: \[x = x_4\begin{bmatrix} -\frac{85}{46}\\ -\frac{51}{23} \\ -\frac{13}{46}\\ 1 \end{bmatrix} = x_4v \] This is the general solution to \(Ax = 0\) where \(p = 0\).
So, the solutions of the nonhomogeneous system \(Ax =b\) can be obtained by adding the particular solution \(p\) to the general solution of the homogeneous system \(Ax =0\). Geometrically, you can imagine that the solution set of \(Ax = 0\) and \(Ax = b\) are "parallel." This relationship leads us to an important result:
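The \(p + x_4v\) structure is easy to verify numerically: \(Av = 0\), and \(A(p + tv) = b\) for every choice of \(t\). A minimal sketch:

```python
import numpy as np

A = np.array([[1., 0., -3., 1.],
              [4., -5., 6., -2.],
              [0., 2., 2., 5.]])
b = np.array([10., -8., 6.])

p = np.array([131/23, 102/23, -33/23, 0.])    # particular solution of Ax = b
v = np.array([-85/46, -51/23, -13/46, 1.])    # homogeneous direction

print(np.allclose(A @ v, 0))                  # True: v solves Ax = 0
for t in (-2.0, 0.0, 3.5):
    print(np.allclose(A @ (p + t * v), b))    # True for every t
```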

Theorem 1: Suppose the system \(Ax = b\) is consistent for some vector \(b\). Let \(p\) be a solution of the system. Then the solution set of \(Ax = b\) consists of the set of all vectors of the form \(w = p + v_h\), where \(v_h\) is any solution of the homogeneous system \(Ax = 0\).
Proof: Let \(p\) be a particular solution of the nonhomogeneous system \(Ax = b\) and let \(v_h\) be any solution of the homogeneous system \(Ax = 0\). Also, let \(w = p + v_h\). \[ Aw = A(p+v_h) = Ap + Av_h = b + 0 = b \] Thus \(w = p + v_h\) is a solution of \(Ax = b\).
Conversely, let \(w\) be any solution of \(Ax = b\), and define \(v_h = w -p\). Then \[ Av_h = A(w-p) = Aw - Ap = b -b = 0.\] Thus \(v_h = w -p\) satisfies \(Ax = 0\).
Therefore, every solution of \(Ax = b\) has the form \(w = p + v_h\).
This theorem is not only crucial in understanding linear systems but is also highly useful in solving differential equations.