Riemann Integration

We've encountered many integrals in statistics, and you might have noticed that some of them differ from the ones we learned in early calculus classes. While we often rely on numerical methods to compute these integrals, gaining a deeper understanding of their foundations can provide valuable insights. To begin, let's quickly review one of the most familiar types of integrals: the Riemann Integral.

Let \(f: [a, b] \to \mathbb{R} \) be a bounded function defined on a closed interval \([a, b]\) where \(a, b \in \mathbb{R}\).

A partition \(\mathcal{P}\) of \([a, b]\) is a finite sequence of points: \[ \mathcal{P} = \{x_0, x_1, x_2, \cdots, x_n\} \] where \(a = x_0 < x_1 < x_2 < \cdots < x_n = b\) and \(n \in \mathbb{N}\).

The subintervals formed by \(\mathcal{P}\) are: \[ [x_{i-1}, x_i] \quad \text{for } i = 1, 2, \cdots, n. \]
On each subinterval, we denote the infimum (greatest lower bound) and the supremum (least upper bound) of \(f\), respectively, by: \[ \begin{align*} &m_i = \inf_{x\in [x_{i-1}, x_i]} f(x), \\\\ &M_i = \sup_{x\in [x_{i-1}, x_i]} f(x). \end{align*} \]
We define the lower sum and the upper sum, respectively, as: \[ \begin{align*} &L(f, \mathcal{P}) = \sum_{i=1}^n m_i (x_i - x_{i-1}), \\\\ &U(f, \mathcal{P}) = \sum_{i=1}^n M_i (x_i - x_{i-1}). \end{align*} \]

The lower Riemann integral and the upper Riemann integral of \(f\) over \([a, b]\) are, respectively: \[ \begin{align*} &\underline{\int_a^b} f(x)dx = \sup_{\mathcal{P}} L(f, \mathcal{P}), \\\\ &\overline{\int_a^b} f(x)dx = \inf_{\mathcal{P}} U(f, \mathcal{P}), \end{align*} \] where the supremum and the infimum are taken over all partitions \(\mathcal{P}\) of \([a, b]\).

A function \(f\) is said to be Riemann integrable on \([a, b]\) if \[ \underline{\int_a^b} f(x)dx = \overline{\int_a^b} f(x)dx = \alpha. \] When this holds, the common value \(\alpha\) is called the Riemann integral of \(f\) over \([a, b]\): \[ \int_a^b f(x)dx = \alpha. \]

Note: Equivalently, we can use the norm of the partition \(\mathcal{P}\), that is, the length of its longest subinterval: \[ \| \mathcal{P} \| = \max_{1 \leq i \leq n} (x_i - x_{i-1}). \] Then \(f\) is Riemann integrable on \([a, b]\) if and only if the lower and upper sums share a common limit as the norm tends to zero: \[ \lim_{ \| \mathcal{P} \| \to 0} L(f, \mathcal{P}) = \lim_{ \| \mathcal{P} \| \to 0} U(f, \mathcal{P}) = \alpha. \]
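As a rough numerical illustration of the lower and upper sums, the sketch below evaluates \(L(f, \mathcal{P})\) and \(U(f, \mathcal{P})\) on uniform partitions. It assumes \(f\) is nondecreasing on \([a, b]\), so that the infimum and supremum on each subinterval are attained at the left and right endpoints. The helper name `darboux_sums` and the test function \(f(x) = x^2\) on \([0, 1]\) are illustrative choices, not part of the definitions above.

```python
def darboux_sums(f, a, b, n):
    """Lower and upper sums of a nondecreasing f on a uniform partition of [a, b].

    Assumption: f is nondecreasing, so on [x_{i-1}, x_i] the infimum is
    f(x_{i-1}) and the supremum is f(x_i).
    """
    width = (b - a) / n
    points = [a + i * width for i in range(n + 1)]
    lower = sum(f(points[i - 1]) * width for i in range(1, n + 1))
    upper = sum(f(points[i]) * width for i in range(1, n + 1))
    return lower, upper


def square(x):
    return x * x


for n in (10, 100, 1000):
    lower, upper = darboux_sums(square, 0.0, 1.0, n)
    print(f"n={n:5d}  L={lower:.6f}  U={upper:.6f}  U-L={upper - lower:.6f}")
```

For this choice of \(f\), the gap telescopes to \(U - L = (f(b) - f(a))(b - a)/n = 1/n\), so both sums are squeezed toward the common value \(1/3\) as the partition is refined.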

Some important facts:

If a function \(f\) is Riemann integrable on \([a, b]\), then \(f\) is bounded on \([a, b]\).
A continuous function on \([a, b]\) is Riemann integrable on \([a, b]\).
A monotonic function on \([a, b]\) is Riemann integrable on \([a, b]\).

Improper Riemann Integration

In many cases, especially in statistics, a function may fail to be Riemann integrable due to unbounded intervals or singularities. However, such functions can often still be integrated using the improper Riemann integral, which extends the classical Riemann integral by incorporating limits to handle these issues. Fortunately, the improper Riemann integral covers a wide range of integrals commonly encountered in statistics and machine learning.

Case 1: One side of the interval is unbounded

Let \(f: [a, \infty) \to \mathbb{R} \) be Riemann integrable on \([a, b]\) for every \(b > a\). The improper integral of \(f\) over \([a, \infty)\) is defined as \[ \int_a^{\infty} f(x)dx = \lim_{b \to \infty} \int_a^b f(x)dx, \] provided the limit exists and is finite. Similarly, for \(f: (-\infty, b] \to \mathbb{R} \), \[ \int_{-\infty}^b f(x)dx = \lim_{a \to -\infty} \int_a^b f(x)dx, \] again provided the limit exists and is finite.

Example: \[ \int_0^{\infty} e^{-x} dx = \lim_{b \to \infty} \int_0^b e^{-x} dx = \lim_{b \to \infty} \left(1 - e^{-b}\right) = 1. \]
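The limit in this example can also be watched numerically. The sketch below approximates the proper integral over \([0, b]\) with a midpoint Riemann sum for growing \(b\) and compares it with the closed form \(1 - e^{-b}\). The helper names `midpoint_sum` and `truncated_integral` and the number of subintervals are assumptions made for illustration only.

```python
import math


def midpoint_sum(f, a, b, n):
    """Midpoint Riemann sum of f on [a, b] with n equal subintervals."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


def truncated_integral(b, n=100_000):
    """Approximate the proper integral of exp(-x) over [0, b]."""
    return midpoint_sum(lambda x: math.exp(-x), 0.0, b, n)


for b in (1.0, 5.0, 10.0, 20.0):
    approx = truncated_integral(b)
    exact = 1.0 - math.exp(-b)  # closed form of the truncated integral
    print(f"b={b:5.1f}  midpoint sum={approx:.8f}  1 - e^(-b)={exact:.8f}")
```

As \(b\) grows, the truncated integrals approach 1, which is the value assigned to the improper integral.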

Case 2: Both sides of the interval are unbounded

For a function \(f: \mathbb{R} \to \mathbb{R}\) that is Riemann integrable on every bounded interval \([a, b]\), \[ \int_{-\infty}^{\infty} f(x)dx = \lim_{a \to -\infty}\lim_{b \to \infty} \int_a^b f(x)dx \] if the two limits, taken independently of each other, exist and are finite.

Example: A classic example is the Gaussian integral: \[ \int_{-\infty}^{\infty} e^{-x^2} dx = \sqrt{\pi}. \]
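A quick numerical sanity check of this value is sketched below: a midpoint Riemann sum of \(e^{-x^2}\) over \([-a, a]\) for growing \(a\). The symmetric truncation is used only for convenience, and it is legitimate here only because the integral is known to converge; the function name `gaussian_truncated` and the grid size are illustrative assumptions.

```python
import math


def gaussian_truncated(a, n=200_000):
    """Midpoint Riemann sum of exp(-x^2) over [-a, a]."""
    width = 2.0 * a / n
    total = 0.0
    for i in range(n):
        x = -a + (i + 0.5) * width
        total += math.exp(-x * x)
    return total * width


for a in (1.0, 2.0, 4.0, 8.0):
    print(f"a={a:3.1f}  approx={gaussian_truncated(a):.8f}")
print(f"sqrt(pi)      = {math.sqrt(math.pi):.8f}")
```

The tails of \(e^{-x^2}\) decay so quickly that the truncated integrals settle near \(\sqrt{\pi}\) already for moderate \(a\).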
Note: Suppose \(f(x)\) is an odd function. \(\int_{-\infty}^{\infty}f(x)dx \) is NOT defined by the symmetric limit \(\lim_{a \to \infty}\int_{-a}^{a}f(x)dx = 0\); the two limits must be taken independently. For example:
\[ \begin{align*} \int_{-\infty}^{\infty} \frac{x}{\pi(1+x^2)}dx &= \lim_{a \to -\infty}\lim_{b \to \infty} \int_a^b \frac{x}{\pi(1+x^2)}dx \\\\ &= \frac{1}{\pi}\lim_{a \to -\infty}\lim_{b \to \infty} \frac{1}{2}(\ln(1+b^2) - \ln(1+a^2)) \end{align*} \] For any fixed \(a\), the inner limit as \(b \to \infty\) is already \(+\infty\), and letting both bounds run off produces the indeterminate form \(\infty - \infty\). So, the integral does not converge. This is why the mean of the Cauchy distribution is undefined.
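To see why the way the bounds are sent to infinity matters for the Cauchy example, the sketch below evaluates the truncated integral exactly from the antiderivative \(\ln(1+x^2)/(2\pi)\) under three different truncation schemes. The function name `cauchy_mean_truncated` and the particular schemes (symmetric, upper bound growing twice as fast, lower bound fixed at 0) are illustrative assumptions.

```python
import math


def cauchy_mean_truncated(a, b):
    """Exact value of the integral of x / (pi * (1 + x^2)) over [a, b].

    Uses the antiderivative ln(1 + x^2) / (2 * pi).
    """
    return (math.log1p(b * b) - math.log1p(a * a)) / (2.0 * math.pi)


for t in (1e1, 1e3, 1e6):
    symmetric = cauchy_mean_truncated(-t, t)       # bounds tied together: a = -b
    skewed = cauchy_mean_truncated(-t, 2.0 * t)    # upper bound grows twice as fast
    one_sided = cauchy_mean_truncated(0.0, t)      # lower bound fixed, b grows
    print(f"t={t:9.0e}  symmetric={symmetric:.6f}  "
          f"skewed={skewed:.6f}  one-sided={one_sided:.6f}")

print(f"ln(2)/pi = {math.log(2.0) / math.pi:.6f}")
```

The symmetric scheme returns 0, the skewed scheme tends to \(\ln(2)/\pi\), and the one-sided integral grows without bound. Since the answer depends on how the bounds are coupled, no single value can be assigned to the improper integral.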

Case 3: A singularity somewhere on the interval

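If a function \(f\) has a singularity at \(c \in (a, b)\), meaning it is not defined at \(c\) or is unbounded near \(c\), the improper integral is defined as: \[ \int_a^b f(x)dx = \lim_{\epsilon_1 \to 0^+} \int_a^{c-\epsilon_1}f(x)dx + \lim_{\epsilon_2 \to 0^+} \int_{c+\epsilon_2}^b f(x)dx \] provided both limits on the right-hand side exist and are finite. The two limits are taken independently; using a single \(\epsilon\) on both sides would give the Cauchy principal value instead, which can exist even when the improper integral does not.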

Similarly, if the singularity occurs at an endpoint of the interval, we define: \[ \int_a^b f(x)dx = \lim_{\epsilon \to 0^+} \int_{a + \epsilon}^b f(x)dx \] if \(f\) has a singularity at \(a\), or \[ \int_a^b f(x)dx = \lim_{\epsilon \to 0^+} \int_a^{b - \epsilon} f(x)dx \] if \(f\) has a singularity at \(b\), in each case provided the limit exists and is finite.

Example: \[ \int_0^1 \frac{1}{\sqrt{x}}dx = \lim_{\epsilon \to 0^+} \int_{\epsilon}^1 \frac{1}{\sqrt{x}} dx = \lim_{\epsilon \to 0^+} \left(2 - 2\sqrt{\epsilon}\right) = 2. \]
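The same example can be checked numerically by integrating over \([\epsilon, 1]\), where the integrand is bounded, and shrinking \(\epsilon\). The sketch below compares a midpoint Riemann sum against the closed form \(2 - 2\sqrt{\epsilon}\); the helper names `midpoint_sum` and `inv_sqrt` and the grid size are assumptions for illustration.

```python
import math


def midpoint_sum(f, a, b, n):
    """Midpoint Riemann sum of f on [a, b] with n equal subintervals."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


def inv_sqrt(x):
    return 1.0 / math.sqrt(x)


for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    approx = midpoint_sum(inv_sqrt, eps, 1.0, 200_000)
    exact = 2.0 - 2.0 * math.sqrt(eps)  # antiderivative 2 * sqrt(x) on [eps, 1]
    print(f"eps={eps:.0e}  midpoint sum={approx:.6f}  2 - 2*sqrt(eps)={exact:.6f}")
```

Each truncated integral is an ordinary Riemann integral, and the values climb toward 2 as \(\epsilon\) shrinks, even though the integrand itself blows up at 0.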

Limitations of (Improper) Riemann Integration

Consider the Dirichlet function: \[ f(x)= \begin{cases} 1 &\text{if \(x \in \mathbb{Q}\)} \\ 0 &\text{if \(x \in \mathbb{R} \setminus \mathbb{Q}\)} \end{cases} \]
Can we compute \(\int_0^1 f(x)dx\) using the Riemann integral or the improper Riemann integral?

Both rational and irrational numbers are dense in the interval \([0, 1]\), meaning every subinterval of positive length contains infinitely many of both. For any partition \(\mathcal{P}\), the supremum on each subinterval is \(M_i = 1\) because it contains rationals, and the infimum is \(m_i = 0\) because it also contains irrationals. Hence \(L(f, \mathcal{P}) = 0\) and \(U(f, \mathcal{P}) = \sum_{i=1}^n (x_i - x_{i-1}) = 1\) for every partition, so the lower Riemann integral is 0 while the upper Riemann integral is 1. Since the two differ, the function is not Riemann integrable.

Moreover, the Dirichlet function is discontinuous at every point of \([0, 1]\). The improper Riemann integral does not help either: it only takes limits to get around unbounded intervals or isolated singularities, whereas the Dirichlet function is bounded and fails to be Riemann integrable on every subinterval of \([0, 1]\).

To integrate such functions, we need a more powerful framework. This is where measure theory and the Lebesgue integral come into play.

Note: Formally, a set \(A \subset \mathbb{R}\) is said to be dense in \(\mathbb{R}\) if for any \(x, y \in \mathbb{R}\) with \(x < y\), there exists \(a \in A\) such that \(x < a < y\).