Kronecker Product


Vectorization

The vectorization operation, denoted by \(\text{vec }(\cdot)\), takes a matrix and stacks its columns into a single column vector.
For example, let \(x \in \mathbb{R}^n\), and consider the outer product \(xx^\top \in \mathbb{R}^{n \times n}\). \[ \begin{align*} \text{vec }(xx^\top) &= \begin{bmatrix}x_1x_1 \\ x_2x_1 \\ \vdots \\ x_nx_1 \\ x_1x_2 \\ x_2x_2 \\ \vdots \\ x_nx_2 \\ \vdots \\ x_1x_n \\ x_2x_n \\ \vdots \\ x_nx_n \end{bmatrix} \\ \\ &= x \otimes x \end{align*} \] The operation \(\otimes\) is called the Kronecker product (more generally, the tensor product). It is widely used in machine learning, statistics, and optimization to represent higher-order interactions and matrix manipulations in a compact form.
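As a quick check, here is a minimal NumPy sketch (the vector values are arbitrary) confirming that stacking the columns of \(xx^\top\) gives \(x \otimes x\):

                import numpy as np

                x = np.array([1.0, 2.0, 3.0])
                outer = np.outer(x, x)               # x x^T, shape (3, 3)

                vec = outer.flatten(order='F')       # stack the columns of x x^T
                kron = np.kron(x, x)                 # x ⊗ x

                print(np.allclose(vec, kron))        # True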

Note: Conversely, we can reshape a vector into a matrix:
Consider a vector \[ a = \begin{bmatrix} 1 & 2 & 3 & 4 & 5 & 6 \end{bmatrix}^\top \]

                import numpy as np

                vector = np.array([1, 2, 3, 4, 5, 6])

                # In NumPy, you can choose between two orderings (the default is row-major):

                matrix_r = vector.reshape((2, 3), order='C')  # 'C' = row-major order, as in the C language
                matrix_c = vector.reshape((2, 3), order='F')  # 'F' = column-major order, as in Fortran

                # matrix_r = [[1, 2, 3], [4, 5, 6]]   (fills rows first)
                # matrix_c = [[1, 3, 5], [2, 4, 6]]   (fills columns first)

Kronecker Product

Let \(A \in \mathbb{R}^{m \times n}\) and \(B \in \mathbb{R}^{p \times q}\). In general, the Kronecker product \(A\otimes B\) is the \((mp) \times (nq)\) matrix: \[ A\otimes B = \begin{bmatrix} a_{11}B & a_{12}B & \cdots & a_{1n} B \\ a_{21}B & a_{22}B & \cdots & a_{2n} B \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1}B & a_{m2}B & \cdots & a_{mn} B \\ \end{bmatrix}. \] Each element \(a_{ij}\) of \(A\) is multiplied by the entire matrix \(B\), resulting in blocks of size \(p \times q\).
For example, \[ \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ \end{bmatrix} \otimes \begin{bmatrix} 5 & 6 \\ 7 & 8 \\ \end{bmatrix} = \begin{bmatrix} 5 & 6 & 10 & 12 \\ 7 & 8 & 14 & 16 \\ 15 & 18 & 20 & 24 \\ 21 & 24 & 28 & 32 \\ \end{bmatrix}. \]
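This example can be reproduced with NumPy's built-in np.kron; a minimal sketch of the computation above:

                import numpy as np

                A = np.array([[1, 2], [3, 4]])
                B = np.array([[5, 6], [7, 8]])

                print(np.kron(A, B))
                # [[ 5  6 10 12]
                #  [ 7  8 14 16]
                #  [15 18 20 24]
                #  [21 24 28 32]]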

Useful Properties of the Kronecker Product
  1. Mixed-product property (verified numerically in the sketch after this list): \[ (A \otimes B)(C \otimes D) = (AC) \otimes (BD) \] whenever the products \(AC\) and \(BD\) are defined.
  2. Transpose: \[ (A \otimes B)^\top = A^\top \otimes B^\top \]
  3. Inverse (when \(A\) and \(B\) are invertible): \[ (A \otimes B)^{-1} = A^{-1} \otimes B^{-1} \]
  4. Trace: \[ \text{Tr }(A \otimes B) = \text{Tr }(A) \text{Tr }(B) \]
  5. Determinant: \[ \det (A \otimes B) = \det(A)^n \det(B)^m \] where \(A \in \mathbb{R}^{m \times m}\) and \(B \in \mathbb{R}^{n \times n}\).
  6. Eigenvalues: Suppose \(A \in \mathbb{R}^{m \times m}\) and \(B \in \mathbb{R}^{n \times n}\) have eigenvalues \(\lambda_i \, (i = 1, \cdots, m)\) and \(\mu_j \, (j = 1, \cdots, n)\) respectively. Then the eigenvalues of \(A \otimes B\) are the \(mn\) products \(\lambda_i \mu_j\).
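Here is a minimal NumPy sketch that numerically checks a few of these properties (the matrices are random; \(A\) and \(B\) are symmetrized only so that their eigenvalues are real and easy to sort):

                import numpy as np

                rng = np.random.default_rng(0)
                A = rng.normal(size=(3, 3)); A = (A + A.T) / 2   # symmetric, so eigenvalues are real
                B = rng.normal(size=(2, 2)); B = (B + B.T) / 2
                C = rng.normal(size=(3, 3))
                D = rng.normal(size=(2, 2))

                # Mixed-product property: (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD)
                print(np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D)))   # True

                # Trace: Tr(A ⊗ B) = Tr(A) Tr(B)
                print(np.isclose(np.trace(np.kron(A, B)), np.trace(A) * np.trace(B)))      # True

                # Eigenvalues of A ⊗ B are the pairwise products λ_i μ_j
                eig_kron = np.sort(np.linalg.eigvalsh(np.kron(A, B)))
                eig_pairs = np.sort(np.outer(np.linalg.eigvalsh(A), np.linalg.eigvalsh(B)).ravel())
                print(np.allclose(eig_kron, eig_pairs))                                    # True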
Now, we are ready to discuss an important concept in machine learning.

Tensor

A tensor is a generalization of a matrix (a 2-D array) to more than two dimensions. So far we have seen the following tensors in mathematics:

  1. A scalar is a 0th-order tensor (a single number, no axes).
  2. A vector is a 1st-order tensor (one axis).
  3. A matrix is a 2nd-order tensor (two axes).

We can apply this notion to higher-order tensors. For example, image data is in fact a 3rd-order tensor, because real-world images usually include color information, which requires multiple channels. So, for any image \(I\), \[ I \in \mathbb{R}^{H \times W \times C} \] where \(H\) is the height, \(W\) is the width, and \(C\) is the number of channels (e.g., RGB: \(C = 3\)) of the image.
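For instance, a color image can be stored as a 3rd-order NumPy array (the 224 × 224 resolution below is just an illustrative assumption):

                import numpy as np

                H, W, C = 224, 224, 3                        # height, width, RGB channels
                image = np.zeros((H, W, C), dtype=np.uint8)  # placeholder pixel values

                print(image.shape)   # (224, 224, 3)
                print(image.ndim)    # 3 axes -> a 3rd-order tensor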

Formally, a tensor is an element of a tensor product of vector spaces. Consider vector spaces \(V\) and \(W\) over a field \(\mathbb{F}\). The tensor product of \(V\) and \(W\), denoted \(V \otimes W\), is a new vector space (the tensor product space) whose elements are linear combinations of tensor products of vectors: \[ v \otimes w \quad \text{where } v \in V, \, w \in W. \] A rank-\(r\) tensor \(T\) is an element of a tensor product of \(r\) vector spaces: \[ T \in V_1 \otimes V_2 \otimes \cdots \otimes V_r. \]
In this context, the Kronecker product is just a specific case of the tensor product applied to matrices, while the tensor product itself is a more general mathematical operation that applies to vector spaces and multilinear maps.
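To make the connection concrete, here is a minimal NumPy sketch (the vectors and matrices below are arbitrary): an elementary 3rd-order tensor \(u \otimes v \otimes w\) built with np.einsum, and a check that the Kronecker product of two matrices is the same outer-product construction followed by a reshape into a block matrix.

                import numpy as np

                # Elementary 3rd-order tensor: T[i, j, k] = u[i] * v[j] * w[k]
                u = np.array([1.0, 2.0])
                v = np.array([3.0, 4.0, 5.0])
                w = np.array([6.0, 7.0])
                T = np.einsum('i,j,k->ijk', u, v, w)
                print(T.shape)        # (2, 3, 2)

                # Kronecker product = outer product of two matrices, reshaped into blocks
                A = np.array([[1, 2], [3, 4]])
                B = np.array([[5, 6], [7, 8]])
                blocks = np.einsum('ij,kl->ikjl', A, B)                   # 4th-order tensor
                print(np.allclose(blocks.reshape(4, 4), np.kron(A, B)))   # True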