Riemann Integrals
When we were looking at differentiable functions, the motivation was to find the slope of a function at a point, which resulted in the derivative of a function.
For integrals there are two main motivations. The first one is that given a function \(f\) we want to find the area between the function and the x-axis for a given interval \([a,b]\).
The second motivation is that given a derivative of a function \(f'\) we want to find the original function \(F\), which is also often called the antiderivative or the primitive function. So for example if we had \(f'(x)=1\) then we know that \(F(x)=x\) is a potential primitive function. This is not the only one though, we could also have \(F(x)=x+3\) or \(F(x)=x-5\). We will later on see why this is the case and how we can find the general form of the primitive function.
Riemann Sums
Let’s start with the first motivation. We want to find the area between a function \(f\) and the x-axis for a given interval \([a,b]\). The idea starts by looking at how this is done for the simplest function, the constant function \(f(x)=c\).
For a constant function the area between the function and the x-axis is simply the height of the function times the width of the interval. So for a constant function \(f(x)=c\) on the interval \([a,b]\) we have the area \(A\) given by:
\[A = c \cdot (b-a) \]
Notice that if \(c=0\) then the area is also 0, which makes sense because it would just be a line on the x-axis. If \(c\) is negative then the area is also negative, because the width \(b-a\) is always non-negative (as \(a \leq b\)) and a positive number times a negative number is negative.
We extend this idea to a function that is constant on a finite number of subintervals, a so-called step function. For example, if we have a function that is constant on the intervals \([a_1,b_1]\), \([a_2,b_2]\), …, \([a_n,b_n]\) then the area is given by:
\[A = c_1 \cdot (b_1-a_1) + c_2 \cdot (b_2-a_2) + ... + c_n \cdot (b_n-a_n) \]Where \(c_i\) is the height, i.e. the value of the function on the interval \([a_i,b_i]\). We can also write such a function using summation notation and an indicator function. First let us define the intervals as \(I_n = [a_n,b_n]\). We then define the indicator function \(\chi_{I_n}(x)\) as follows:
\[\chi_{I_n}(x) = \begin{cases} 1 & \text{if } x \in I_n \\ 0 & \text{otherwise} \end{cases} \]So if \(x\) is in the interval \(I_n\) then the indicator function is 1, otherwise it is 0. We can then write the step function as follows:
\[g(x) = \sum_{i=1}^{n} c_i \cdot \chi_{I_i}(x) \]and the area under the curve of the step function \(g\) on the interval \([a,b]\) is given by:
\[A = \sum_{i=1}^{n} c_i \cdot (b_i - a_i) \]This is what leads us to the Riemann sum and the integral. The idea is that we can approximate a function \(f\) by a step function. More specifically, we partition the interval \([a,b]\) into \(n\) subintervals \(I_i\) and define a new step function \(g\) that replaces the original function values on each subinterval with some constant value \(c_i\).
More formally we are given a function \(f: [a,b] \to \mathbb{R}\) where \(I=[a,b]\) is the interval we want to integrate over. We then define a partition \(P\) of the interval \(I\) as a finite subset defined as follows:
\[P = \{a=x_0 < x_1 < ... < x_n = b\} \subset I \]Importantly we also have \(\{a,b\} \subseteq P\). So a partition is just a finite set of points that must include the endpoints of the interval \([a,b]\). The fact that we have a finite set of points is also why we don’t say that \(P \subseteq I\) but rather \(P \subset I\) as we want to exclude the case where \(P\) is the whole interval \(I\) and therefore an infinite set of points. We can then also define the set of all partitions of an interval \(I\) as follows:
\[\mathcal{P}(I) = \{P \subseteq I \mid P \text{ is a finite set and } \{a,b\} \subseteq P\} \]We define the length of a subinterval \(I_i = [x_{i-1},x_i]\) as the difference of its endpoints:
\[\Delta x_i = x_i - x_{i-1} \]The so-called norm of the partition \(P\) is defined as the maximum length of the subintervals in the partition:
\[\|P\| = \max_{1 \leq i \leq n} \Delta x_i = \max_{1 \leq i \leq n} (x_i - x_{i-1}) \]So now if we go back to our initial idea and split the interval \([a,b]\) into \(n\) subintervals \(I_i = [x_{i-1},x_i]\), we pick in each subinterval a sample point \(\delta_i \in I_i\), which fulfills the following condition:
\[x_{i-1} \leq \delta_i \leq x_i \]We can then define the Riemann sum as follows where \(\delta = \{\delta_1, \delta_2, ..., \delta_n\}\) is a set of points from each interval \(I_i\), so \(\delta_i \in I_i\). These points are often called the sample or evaluation points. The Riemann sum is then defined as follows:
\[S(f, P, \delta) = \sum_{i=1}^{n} f(\delta_i) \cdot (x_i - x_{i-1}) \]Where \(f(\delta_i)\) is the value of the function \(f\) at the sample point \(\delta_i\) in the interval \([x_{i-1},x_i]\) where we evaluate the function \(f\). As we have seen this is just a step function that approximates the original function \(f\) and what we are doing is summing up the areas of the rectangles that are formed by the function values and the width of the intervals. As the number of intervals \(n\) increases, the Riemann sum approaches the area under the curve of the function \(f\) on the interval \([a,b]\).
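To make this concrete, here is a minimal numeric sketch in Python (the function, partition and sample points are our own illustrative choices) that evaluates \(S(f,P,\delta)\) for an arbitrary partition and choice of sample points:

```python
# A minimal sketch of the Riemann sum S(f, P, delta): P is a list of
# partition points a = x_0 < x_1 < ... < x_n = b and samples is a list
# with one sample point per subinterval, samples[i] in [x_i, x_{i+1}].

def riemann_sum(f, P, samples):
    assert len(samples) == len(P) - 1, "one sample point per subinterval"
    return sum(
        f(s) * (right - left)
        for s, left, right in zip(samples, P, P[1:])
    )

# Example: f(x) = x^2 on [0, 1], uniform partition, midpoint samples.
n = 1000
P = [i / n for i in range(n + 1)]
mids = [(P[i] + P[i + 1]) / 2 for i in range(n)]
print(riemann_sum(lambda x: x * x, P, mids))  # ~0.333333, close to 1/3
```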
Once we have a partition \(P\) the choice of the sample points \(\delta_i\) can have an impact on the value of the Riemann sum. However, as long as the function \(f\) is bounded on the interval \([a,b]\), we still obtain a valid approximation of the area under the curve. There are 5 key choices for the sample points that are commonly used. These choices are often referred to as Riemann rules:
- Left rule: Also known as the left Riemann sum, this rule chooses the left-hand endpoint of each sub-interval \(I_i\) as the sample point \(\delta_i\).
- Right rule: Also known as the right Riemann sum, this rule chooses the right-hand endpoint of each sub-interval \(I_i\) as the sample point \(\delta_i\).
- Midpoint rule: Also known as the middle Riemann sum, this rule chooses the midpoint of each sub-interval \(I_i\) as the sample point \(\delta_i\). The midpoint is calculated as \(m_i = \frac{x_{i-1} + x_i}{2}\).
These three prescriptions generate step functions whose rectangles touch the graph of \(f\) at the left edge, the right edge, or the midpoint of each \(I_{i}\) respectively.
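The three prescriptions are easy to compare numerically. A small sketch (the function, interval and \(n\) are our own illustrative choices):

```python
# Left, right and midpoint Riemann sums on a uniform partition of [a, b].

def riemann_rule(f, a, b, n, rule="mid"):
    h = (b - a) / n  # width of every subinterval
    if rule == "left":
        points = (a + i * h for i in range(n))
    elif rule == "right":
        points = (a + (i + 1) * h for i in range(n))
    else:  # midpoint rule
        points = (a + (i + 0.5) * h for i in range(n))
    return h * sum(f(x) for x in points)

f = lambda x: x * x  # exact area on [0, 1] is 1/3
for rule in ("left", "mid", "right"):
    print(rule, riemann_rule(f, 0.0, 1.0, 100, rule))
# left underestimates and right overestimates (f is increasing),
# the midpoint rule is the closest of the three
```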
However the most important rules definition-wise are the so-called upper rule or upper Riemann sum and the lower rule or lower Riemann sum. The upper Riemann sum is defined as follows:
\[S_\text{upper}(f, P) = \sum_{i=1}^{n} (\sup_{x \in I_i} f(x)) \cdot (x_i - x_{i-1}) \]Where \(\sup_{x \in I_i} f(x)\) is the supremum of the function \(f\) on the interval \(I_i\). The lower Riemann sum is defined as follows:
\[S_\text{lower}(f, P) = \sum_{i=1}^{n} (\inf_{x \in I_i} f(x)) \cdot (x_i - x_{i-1}) \]These two can be interpreted as that for the upper Riemann sum we look at the maximum value of the function on the interval \(I_i\) and for the lower Riemann sum we look at the minimum value of the function on the interval \(I_i\) and use those values as our constant values for the step function.
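The upper and lower sums can also be approximated numerically. A sketch (our own setup; the exact supremum and infimum are approximated by sampling many points in each subinterval, which is good enough for well-behaved functions):

```python
# Approximate upper and lower (sup/inf) sums on a uniform partition.

def upper_lower_sums(f, a, b, n, k=50):
    h = (b - a) / n
    upper = lower = 0.0
    for i in range(n):
        xs = [a + i * h + j * h / (k - 1) for j in range(k)]
        values = [f(x) for x in xs]
        upper += max(values) * h  # stand-in for the supremum
        lower += min(values) * h  # stand-in for the infimum
    return lower, upper

lo, up = upper_lower_sums(lambda x: x * x, 0.0, 1.0, 200)
print(lo, up)  # lower <= 1/3 <= upper, both approach 1/3 as n grows
```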
It then follows that the area under the curve of the bounded function \(f\) on the interval \([a,b]\) is bounded by the upper and lower Riemann sums:
\[S_\text{lower}(f, P) \leq S(f, P, \delta) \leq S_\text{upper}(f, P) \]Why does the function need to be bounded? Because if the function is not bounded then the supremum and infimum of the function on the interval \(I_i\) can be infinite and therefore the Riemann sums would also be infinite. In addition because the function \(f\) is bounded, i.e. there exists a constant \(M\) such that \(|f(x)| \leq M\) for all \(x \in [a,b]\), then there is an upper and lower bound for the values of the function on the interval \(I_i\):
\[-M \leq \inf_{x \in I_i} f(x) \leq \sup_{x \in I_i} f(x) \leq M \]and therefore we can also define an upper and lower bound for the area under the curve of the function \(f\) on the interval \([a,b]\):
\[-M \cdot (b-a) \leq S_\text{lower}(f, P) \leq S(f, P, \delta) \leq S_\text{upper}(f, P) \leq M \cdot (b-a) \]You can think of these upper and lower bounds as the area of the rectangles that are formed by the maximum and minimum values of the function on the interval \(I_i\) multiplied by the width of the interval.
We can also refine a partition \(P\) by adding more points to it. So we say that a partition \(Q\) is a refinement of a partition \(P\) if \(P \subset Q\). From this it also logically follows that the union of two partitions \(P\) and \(Q\) is again a partition, i.e. \(P \cup Q \in \mathcal{P}(I)\), and this union is a common refinement of both \(P\) and \(Q\).
We can now define new Riemann sums for the refined partition \(Q\). If the function \(f\) is bounded on the interval \([a,b]\) then we get the following inequality for the Riemann sums:
\[S_\text{lower}(f, P) \leq S_\text{lower}(f, Q) \leq S(f, Q, \delta) \leq S_\text{upper}(f, Q) \leq S_\text{upper}(f, P) \]
The idea behind this is that the more points we add to the partition, the more accurate our approximation of the area under the curve becomes. From this it also follows that we can define “the highest” lower Riemann sum and “the lowest” upper Riemann sum as follows:
\[S_\text{lower}(f) = \sup_{P \in \mathcal{P}(I)} S_\text{lower}(f, P) \quad \text{and} \quad S_\text{upper}(f) = \inf_{P \in \mathcal{P}(I)} S_\text{upper}(f, P) \]This “highest” lower Riemann sum is the Lower Riemann Integral and the “lowest” upper Riemann sum is the Upper Riemann Integral. It obviously follows that the lower Riemann integral is less than or equal to the upper Riemann integral:
\[\sup_{P \in \mathcal{P}(I)} S_\text{lower}(f, P) \leq \inf_{P \in \mathcal{P}(I)} S_\text{upper}(f, P) \Rightarrow S_\text{lower}(f) \leq S_\text{upper}(f) \]
We therefore say that a function \(f\) is Riemann integrable on the interval \([a,b]\) if the lower and upper Riemann integrals are equal:
\[S_\text{lower}(f) = S_\text{upper}(f) \]This value is then called the Riemann Integral of the function \(f\) on the interval \([a,b]\) and is denoted by:
\[\int_{a}^{b} f(x) dx = S_\text{lower}(f) = S_\text{upper}(f) \]You can think of the symbol \(\int\) as a stylized letter “S” for “Sum”, which is what we are doing here, we are summing up the areas of the rectangles that are formed by the function values and the width of the intervals. The symbol \(dx\) is just like the notation in the derivative that indicates a small change in the variable \(x\), here between the two points \(x_{i-1}\) and \(x_i\) in the partition \(P\). We call the function \(f\) the integrand, the interval \([a,b]\) the integration interval, \(a\) the lower limit of integration, and \(b\) the upper limit of integration and \(x\) the variable of integration. Because the lower and upper limit of integration are defined we also call this a definite integral.
Let’s check if the constant function \(f(x)=c\) is Riemann integrable on the interval \([a,b]\) and compute the integral.
So we want to show that the upper and lower Riemann sums are equal for all partitions \(P\) of the interval \([a,b]\). We take an arbitrary partition \(P=\{x_0,\dots ,x_n\}\) of \([a,b]\) and any choice of sample points \(\delta=\{\delta_1,\dots ,\delta_n\}\). Because \(f(x)=c\) is constant we get:
\[\sup_{x\in I_i} f(x) = \inf_{x\in I_i} f(x) = c \quad \text{for each sub-interval } I_i=[x_{i-1},x_i]. \]Therefore:
\[S_\text{upper}(f,P)=\sum_{i=1}^{n} c\,(x_i-x_{i-1})=c\sum_{i=1}^{n} (x_i-x_{i-1}) = c(b-a) = S_\text{lower}(f,P) \]Thus \(S_\text{upper}(f,P)=S_\text{lower}(f,P)\) for every \(P\). This also means that if we take \(\inf_P S_\text{upper}(f,P)\) and \(\sup_P S_\text{lower}(f,P)\) we get the same value:
\[S_\text{upper}(f)=S_\text{lower}(f)=c(b-a). \]Since they are equal, \(f\) is Riemann integrable on \([a,b]\) and we get the Riemann integral:
\[\int_{a}^{b} c\,dx = c(b-a). \]This matches our initial intuition that the area under a constant function is just the height of the function times the width of the interval.
Next we look at Dirichlet’s function on the interval \([0,1]\) which is defined as follows:
\[g(x)=\begin{cases} 1,&x\in\mathbb{Q}\\ 0,&x\notin\mathbb{Q} \end{cases} \]We have already seen that this function is nowhere continuous, as it oscillates between 0 and 1 for every point in the interval \([0,1]\) but it is bounded, as it only takes the values 0 and 1. We now want to check if this function is Riemann integrable on the interval \([0,1]\).
For any partition \(P=\{x_0,\dots ,x_n\}\) of the interval \([0,1]\) each interval \(I_i=[x_{i-1},x_i]\) contains both rational and irrational numbers. Hence we have:
\[\sup_{x\in I_i} g(x)=1,\qquad\inf_{x\in I_i} g(x)=0. \]Therefore if we compute the upper and lower Riemann sums for any partition \(P\) we get:
\[\begin{align*} S_\text{upper}(g,P)&=\sum_{i=1}^{n} 1\,(x_i-x_{i-1}) = 1\cdot(1-0)=1 \\ S_\text{lower}(g,P)&=\sum_{i=1}^{n} 0\,(x_i-x_{i-1}) = 0. \end{align*} \]Both values are independent of the choice of \(P\) so when we take the infimum and supremum over all partitions we get:
\[\begin{align*} S_\text{upper}(g)&=\inf_{P}S_\text{upper}(g,P)=1, \\ S_\text{lower}(g)&=\sup_{P}S_\text{lower}(g,P)=0. \end{align*} \]Since \(S_\text{upper}(g)\neq S_\text{lower}(g)\), the function \(g\) is not Riemann integrable on the interval \([0,1]\). This example underlines why continuity (or at least “almost everywhere” continuity) is crucial for Riemann integrability.
Riemann’s Criterion
We have seen a way to determine whether a bounded function is Riemann integrable by checking the equality of the upper and lower Riemann sums. However, just like we have Cauchy’s criterion for convergence of sequences, we can also define a criterion for Riemann integrability, this is called Riemann’s criterion.
The criterion states that a bounded function \(f:[a,b]\to\mathbb R\) is Riemann integrable if and only if the following condition holds:
\[\forall\;\varepsilon>0\;\exists\;P\in\mathcal P([a,b])\;:\; S_{\text{upper}}(f,P)-S_{\text{lower}}(f,P)<\varepsilon. \]Intuitively this means that you can make the gap between the upper and lower Riemann sums arbitrarily small by choosing a suitable partition \(P\). This criterion is equivalent to the condition that the upper and lower Riemann integrals are equal, i.e. we have the following equivalence:
\[S_{\text{lower}}(f)=S_{\text{upper}}(f)\quad\Longleftrightarrow\quad \bigl(\forall\varepsilon>0\;\exists P\text{ with }S_{\text{upper}}(f,P)-S_{\text{lower}}(f,P)<\varepsilon\bigr). \]We can prove this equivalence by showing both directions. First we assume \(f\) is integrable, so \(S_{\text{lower}}(f)=S_{\text{upper}}(f)=A\) where \(A\) is the Riemann integral of \(f\) on the interval \([a,b]\), and we take some \(\varepsilon>0\). By definition of the infimum there exists a partition \(P_1\) such that
\[S_{\text{upper}}(f,P_1)<A+\tfrac{\varepsilon}{2}, \]and by definition of the supremum a partition \(P_2\) with
\[S_{\text{lower}}(f,P_2)>A-\tfrac{\varepsilon}{2}. \]If we then take the common refinement \(P = P_1 \cup P_2\) of these two partitions, the gap between the upper and lower Riemann sums becomes arbitrarily small:
\[S_{\text{lower}}(f,P)\ge S_{\text{lower}}(f,P_2)>A-\tfrac{\varepsilon}{2},\qquad S_{\text{upper}}(f,P)\le S_{\text{upper}}(f,P_1)<A+\tfrac{\varepsilon}{2}, \]so the gap is smaller than \(\varepsilon\). For the converse direction, suppose the gap can be made arbitrarily small, and define:
\[L:=\sup_{P}S_{\text{lower}}(f,P),\qquad U:=\inf_{P}S_{\text{upper}}(f,P). \]Because all lower sums are less than or equal to all upper sums, we have:
\[L=\sup_{P}S_{\text{lower}}(f,P)\leq\inf_{P}S_{\text{upper}}(f,P)=U. \]Given \(\varepsilon>0\), choose \(P\) with \(S_{\text{upper}}(f,P)-S_{\text{lower}}(f,P)<\varepsilon\). Then \(0 \leq U-L\le S_{\text{upper}}(f,P)-S_{\text{lower}}(f,P)<\varepsilon\) for every \(\varepsilon>0\), so \(U=L\), completing the proof.
This idea as mentioned is similar to Cauchy’s criterion for convergence of sequences, where we can make the difference between the upper and lower sums arbitrarily small by refining the partition. This is also what leads to another common way to think about the Riemann integral, which is that it is the limit of the Riemann sums as the number of intervals \(n\) goes to infinity, i.e. as the norm of the partition \(\|P_n\|\) goes to 0:
\[\int_a^b f(x)\,dx =\lim_{n\to\infty}S(f,P_n,\delta^{(n)}) \quad\text{for any choice of sample points }\delta^{(n)}. \]The criterion above guarantees that the limit exists and is the same for all ways of picking the samples, provided the upper-lower gap tends to \(0\).
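We can watch this convergence numerically. In the following sketch (the function, interval and sample strategy are our own picks) the sample points are even chosen at random within each subinterval:

```python
import random

# The Riemann sums converge for ANY choice of sample points as the
# partition gets finer; here each sample point is drawn at random
# from its subinterval.

def random_riemann_sum(f, a, b, n):
    h = (b - a) / n
    return h * sum(
        f(random.uniform(a + i * h, a + (i + 1) * h)) for i in range(n)
    )

f = lambda x: x  # exact integral over [1, 3] is (3^2 - 1^2)/2 = 4
for n in (10, 100, 1000, 10000):
    print(n, random_riemann_sum(f, 1.0, 3.0, n))
# the sums approach 4 no matter how the samples are chosen
```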
Let’s look at the linear function \(f(x)=x\) on the interval \([a,b]\) and check if it is Riemann integrable using Riemann’s criterion. First we will take a uniform partition \(P_n=\{a + i\cdot \frac{b-a}{n} \mid i=0,1,\dots,n\}\) of the interval \([a,b]\) with \(n\) subintervals. This results in each subinterval \(I_i\) having the same width \(h = \frac{b-a}{n}\).
Then let us use this partition to calculate the lower and upper Riemann sums. Note that we are using the left-endpoint for the lower sum and the right-endpoint for the upper sum. In this case this makes sense because the function \(f(x)=x\) is increasing on the interval \([a,b]\). The lower Riemann sum is given by:
\[\begin{align*} S_{\text{lower}}(f,P_n) &=\sum_{i=1}^{n} f(x_{i-1})\,(x_i-x_{i-1}) \\ &=\sum_{i=1}^{n}\Bigl(a+(i-1)h\Bigr)\,h \\ &=h\Bigl[na+h\sum_{i=1}^{n}(i-1)\Bigr] \\ &=h\Bigl[na+h\,\frac{(n-1)n}{2}\Bigr] \\ &=\frac{b-a}{n}\left[na+\frac{b-a}{n}\,\frac{(n-1)n}{2}\right] \\ &=(b-a)a+\frac{(b-a)^2}{2}\left(\frac{\,n-1\,}{n}\right). \end{align*} \]Analogously for the upper Riemann sum using the right endpoint:
\[\begin{align*} S_{\text{upper}}(f,P_n) &=\sum_{i=1}^{n} f(x_i)\,(x_i-x_{i-1}) \\ &=\sum_{i=1}^{n}\Bigl(a+i\,h\Bigr)\,h \\ &=h\Bigl[na+h\sum_{i=1}^{n}i\Bigr] \\ &=h\Bigl[na+h\,\frac{n(n+1)}{2}\Bigr] \\ &=\frac{b-a}{n}\left[na+\frac{b-a}{n}\,\frac{n(n+1)}{2}\right] \\ &=(b-a)a+\frac{(b-a)^2}{2}\left(\frac{\,n+1\,}{n}\right). \end{align*} \]Now if we subtract the lower sum from the upper sum, we get:
\[\begin{align*} S_{\text{upper}}(f,P_n)-S_{\text{lower}}(f,P_n) &=\frac{(b-a)^2}{2}\left(\frac{n+1}{n}-\frac{n-1}{n}\right) \\ &=\frac{(b-a)^2}{2}\left(\frac{2}{n}\right) \\ &=\dfrac{(b-a)^2}{n}. \end{align*} \]So for a given \(\varepsilon > 0\), we can choose \(n\) such that \(\frac{(b-a)^2}{n} < \varepsilon\). This means that for any \(\varepsilon > 0\), we can find a partition \(P_n\) such that the difference between the upper and lower Riemann sums is less than \(\varepsilon\). Therefore, by Riemann’s criterion, the function \(f(x)=x\) is Riemann integrable on the interval \([a,b]\) and we can compute the integral by taking the limit as \(n\) goes to infinity for the lower or upper Riemann sums:
\[\begin{align*} \lim_{n\to\infty}S_{\text{lower}}(f,P_n) &= \lim_{n\to\infty} (b-a)a + \frac{(b-a)^2}{2} \left(\frac{n-1}{n}\right) \\ &= (b-a)a + \frac{(b-a)^2}{2} \\ &= (b-a)\left(a + \frac{b-a}{2}\right) \\ &= \frac{(b-a)(a+b)}{2} \\ &= \frac{b^2-a^2}{2} \end{align*} \]
Properties of Riemann Integrals
Integrals of Monotonic and Bounded Functions
There are many properties of Riemann integrals that are useful to know. The first one we have already seen in effect when solving the example of the identity function \(f(x)=x\) on the interval \([a,b]\). The first property is a bit similar to the monotone convergence theorem for sequences, which states that if a sequence is monotonic and bounded, then it converges. Something similar holds for Riemann integrals, which is that if a function \(f\) is monotonic and bounded on the interval \([a,b]\), then it is Riemann integrable on that interval.
The proof is rather intuitive. If \(f\) is monotonically increasing, then on each subinterval the infimum is attained at the left endpoint (the left rule) and the supremum at the right endpoint (the right rule). So if we take a uniform partition of the interval \([a,b]\) we get the following:
\[\begin{align*} S_{\text{upper}}(f,P_n) - S_{\text{lower}}(f,P_n) &= \sum_{i=1}^{n} (\sup_{x \in I_i} f(x))(x_i - x_{i-1}) - \sum_{i=1}^{n} (\inf_{x \in I_i} f(x))(x_i - x_{i-1}) \\ &= \sum_{i=1}^{n} f(x_i) \frac{b-a}{n} - \sum_{i=1}^{n} f(x_{i-1}) \frac{b-a}{n} \\ &= \frac{b-a}{n} \sum_{i=1}^{n} (f(x_i) - f(x_{i-1})) \\ &= \frac{b-a}{n} (f(x_{n}) - f(x_{n-1}) + f(x_{n-1}) - f(x_{n-2}) + ... + f(x_1) - f(x_0)) \\ &= \frac{b-a}{n} (f(x_{n}) - f(x_0)) \\ &= \frac{b-a}{n} (f(b) - f(a)). \end{align*} \]So for any \(\varepsilon > 0\), we can choose \(n\) such that \(\frac{b-a}{n} (f(b) - f(a)) < \varepsilon\). Another way to see this is that as \(n\) goes to infinity, the difference between the upper and lower Riemann sums goes to 0, showing that the function is Riemann integrable on the interval \([a,b]\).
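The derived gap formula is easy to check numerically, here for the monotonically increasing function \(f(x)=e^x\) (a sketch; the function and interval are our own choices):

```python
import math

# For monotonically increasing f on a uniform partition, the upper sum
# uses right endpoints and the lower sum uses left endpoints, and the
# gap collapses telescopically to (b - a) * (f(b) - f(a)) / n.

def upper_lower_gap(f, a, b, n):
    h = (b - a) / n
    upper = h * sum(f(a + (i + 1) * h) for i in range(n))  # right rule
    lower = h * sum(f(a + i * h) for i in range(n))        # left rule
    return upper - lower

f, a, b = math.exp, 0.0, 1.0
for n in (10, 100, 1000):
    predicted = (b - a) * (f(b) - f(a)) / n
    print(n, upper_lower_gap(f, a, b, n), predicted)  # columns agree
```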
Linearity of Riemann Integrals
When looking at differentiable functions we saw that we could extract some rules for the derivative of a function such as the product rule, the quotient rule, the chain rule and others. In the motivation for the integral it was also mentioned that the integral is the inverse operation of the derivative. This means that we can also extract some rules for the integral of a function based on the rules for the derivative.
Let \(f\) and \(g\) be bounded and Riemann integrable on the interval \(I=[a,b]\subset\mathbb R\) then we often group these functions together and denote them as \(f,g\in\mathcal R(I)\), the set of all bounded and Riemann integrable functions on the interval \(I\).
The summation rule for Riemann integrals states that if \(f,g\in\mathcal R([a,b])\) then the sum \(f+g\) is also Riemann integrable on \([a,b]\) and the integral of the sum is equal to the sum of the integrals:
\[\int_a^b\!\bigl(f(x)+g(x)\bigr)\,dx \;=\; \int_a^b f(x)\,dx\;+\;\int_a^b g(x)\,dx. \]For any partition \(P\) with points \(\delta\) we have:
\[S(f+g,P,\delta)=S(f,P,\delta)+S(g,P,\delta). \]Also just like with the derivative we can also scale the function by a real scalar \(\alpha\in\mathbb R\) and still obtain a Riemann integrable function. This is called scalar multiplication and is defined as follows:
\[\int_a^b \alpha\,f(x)\,dx \;=\; \alpha \!\int_a^b f(x)\,dx. \]A Riemann sum of \(\alpha f\) is:
\[S(\alpha f,P,\delta)=\sum_i \alpha f(\delta_i)\,\Delta x_i =\alpha\,S(f,P,\delta). \]Together the scalar multiplication and the summation rule imply full linearity so we have:
\[\forall\,\alpha,\beta\in\mathbb R:\qquad \int_{a}^{b}\bigl(\alpha f(x)+\beta g(x)\bigr)\,dx \;=\;\alpha\!\int_{a}^{b}f(x)\,dx \;+\;\beta\!\int_{a}^{b}g(x)\,dx. \]This also means that the following linear mapping exists and that we can define a vector space of Riemann integrable functions on the interval \([a,b]\):
\[\mathcal I_{[a,b]}:\;f\longmapsto\int_a^b f(x)\,dx \]We also have a product rule for Riemann integrals, which states that if \(f,g\in\mathcal R([a,b])\) then the point-wise product \(fg\) is also Riemann integrable on \([a,b]\):
\[\int_{a}^{b} f(x)g(x)\,dx \quad\text{exists.} \]The proof of this is more technical: one first shows that the square \(f^2\) of an integrable function is integrable (using the boundedness \(|f| \leq M\) and the estimate \(\sup f^2 - \inf f^2 \leq 2M(\sup f - \inf f)\) on each subinterval) and then applies linearity to the identity \(fg = \tfrac{1}{4}\bigl((f+g)^2 - (f-g)^2\bigr)\).
If we have the functions \(f,g\in\mathcal R([a,b])\) and \(g(x) \neq 0\) for all \(x\in[a,b]\), then we can also say that the quotient \(f/g\) is Riemann integrable on \([a,b]\):
\[\int_a^b \frac{f(x)}{g(x)}\,dx \quad\text{exists.} \]Strictly speaking we need \(|g|\) to be bounded away from zero, i.e. \(|g(x)| \geq m > 0\) for some constant \(m\), so that \(1/g\) stays bounded. One then shows that \(1/g\) is integrable and applies the product rule above to \(f \cdot \frac{1}{g}\).
A good example that strengthens this intuition is the integrability of polynomials.
From the previously mentioned rules we can easily derive that polynomials are also integrable. Because a polynomial \(p(x)\) can be written as the sum of monomials, i.e. \(p(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n\), where \(a_i\) are real coefficients, we can apply the summation rule and the scalar multiplication rule to each monomial separately. So every polynomial is Riemann integrable due to the fact that the integral is linear.
\[\int_a^b p(x)\,dx = \int_a^b a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n\,dx \quad \text{exists.} \]If \(q(x)\) is a polynomial and \(q(x) \neq 0\) on \([a,b]\), so it doesn’t have any roots in the interval, then we can also compute the integral of any rational function \(f(x) = \frac{p(x)}{q(x)}\) where \(p(x)\) is also a polynomial. This is because a continuous \(q\) without roots on the compact interval \([a,b]\) is automatically bounded away from zero, so the quotient rule from above applies.
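As a small worked example of this linearity (the polynomial is our own illustrative pick, reusing \(\int_0^1 x\,dx = \tfrac{1}{2}\) and \(\int_0^1 x^2\,dx = \tfrac{1}{3}\) from the examples in this section):
\[\int_0^1 (3x^2 + 2x + 1)\,dx = 3\int_0^1 x^2\,dx + 2\int_0^1 x\,dx + \int_0^1 1\,dx = 3\cdot\tfrac{1}{3} + 2\cdot\tfrac{1}{2} + 1\cdot(1-0) = 3. \]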
Additivity over Subintervals
If we have an integrable function \(f\) on an interval \([a,b]\), we can split the integral over the interval into two integrals over sub-intervals \([a,c]\) and \([c,b]\) where \(c\) is any point in the interval \((a,b)\) so \(a < c < b\). This is known as the additivity property of integrals:
\[\int_{a}^{b} f(x)\,dx \;=\;\int_{a}^{c} f(x)\,dx +\int_{c}^{b} f(x)\,dx. \]The proof of this follows from the definition of the Riemann integral being a sum. We already know from the rules of the sum operator that we can split sums into two parts, we just need to make sure that the partitions we use for the two integrals are compatible, i.e. that \(c\) is a partition point.
This is especially useful when we have a piecewise continuous function, i.e. a function that is continuous on each sub-interval but may have discontinuities at the endpoints of the sub-intervals. Let’s look at the following example:
\[f(x)=\begin{cases} x, & -1 \leq x\le 2,\\ 2, & 2 < x \leq 3. \end{cases} \]We then take the interval \([-1,3]\) and split it at the point \(c=2\):
\[\begin{align*} \int_{-1}^{3}f(x)\,dx &=\int_{-1}^{2}x\,dx+\int_{2}^{3}2\,dx \\ &=\Bigl[\tfrac{x^{2}}{2}\Bigr]_{-1}^{2}+2(3-2) \\ &=\tfrac{3}{2}+2 \\ &=\tfrac{7}{2}. \end{align*} \]Notice the first part of the graph lies partly below the \(x\)-axis, but the additivity formula still holds.
Zero-width and Limit Orientation
We can create a so-called degenerate interval by taking the same point as both the lower and upper limit of integration, i.e. \(a=b\). In this case, the integral is defined as follows:
\[\int_{a}^{b}f(x)\,dx=0. \]The reason is that every partition only has one point, so the Riemann sum is always zero, as there is no width to the interval.
The integral also has some orientation rules, which are important to know. Specifically if we swap the lower and upper limit of integration, the sign of the integral changes. This is because the Riemann sums are defined in such a way that the order of the limits matters. If we exchange all \(x_{i-1}\) and \(x_i\) in the Riemann sums, we change the sign of each summand, which leads to the following:
\[\int_{a}^{b}f(x)\,dx=-\int_{b}^{a}f(x)\,dx. \]
Uniform Continuity
Continuity is a fundamental property of functions that we have encountered in calculus. Continuity also has some implications for the Riemann integral.
First we recall the definition of continuity at a point, which is also called point-wise continuity using the epsilon–delta definition. A function \(f:I\to\mathbb R\) is continuous at a point \(x_0\in I\) if we have the following:
\[\forall\varepsilon>0\;\exists\delta>0: |x-x_0|<\delta\;\Longrightarrow\;|f(x)-f(x_0)|<\varepsilon. \]We have also seen that if this condition holds for every point \(x_0\) in the domain \(D\) of the function, then we say that the function is continuous. The same holds if the domain is an interval \(I\), i.e. the function is continuous on the interval \(I\) if it is continuous at every point in \(I\). More formally, we can write \(f\) is continuous on \(I\) if the following holds:
\[\forall x_0\in[a,b]\;\forall\varepsilon>0\;\exists\delta>0: |x-x_0|<\delta \;\Longrightarrow\; |f(x)-f(x_0)|<\varepsilon. \]Ordinary continuity already tells us that for each point \(x_0\) we can shrink the input window \((x_0-\delta,x_0+\delta)\) until all outputs stay within an \(\varepsilon\)-band of \(f(x_0)\).
Uniform continuity insists on something stronger. If we pick an \(\varepsilon\) then we need to be able to pick a single \(\delta\) that works everywhere: for every pair of points in the interval that are within \(\delta\) of each other, their outputs must stay within an \(\varepsilon\)-band of each other.
Geometrically, uniform continuity means that a single rectangle of width \(2\delta\) and height \(2\varepsilon\) can be slid along the entire graph of \(f\) in such a way that the graph never exits the rectangle through its top or bottom edge.
Because the same \(\delta\) serves everywhere, the logical quantifiers flip for a function \(f:I\to\mathbb R\) on an interval \(I\) to be uniformly continuous:
\[\forall\varepsilon>0\;\exists\delta>0\; \text{s.t. }|x-y|<\delta\;\Longrightarrow\;|f(x)-f(y)|<\varepsilon \quad\forall x,y\in I. \]So the key difference is that the chosen \(\delta\) does not depend on the particular point \(x_0\) in the interval, but only on the \(\varepsilon\) we choose.
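To make the difference concrete, here is a small numeric sketch (the function \(f(x)=1/x\), which we will meet again just below, and the tolerance are our own picks): for a fixed \(\varepsilon\) we estimate how large a \(\delta\) still works at different base points \(x_0\).

```python
# For f(x) = 1/x on (0, 1] and a fixed eps, estimate the largest delta
# that works at a base point x0: shrink delta until all sampled points
# within distance delta of x0 keep |f(x) - f(x0)| below eps.

def working_delta(f, x0, eps, start=0.5, shrink=0.9, samples=200):
    delta = start
    while delta > 1e-12:
        xs = [x0 + (2 * j / (samples - 1) - 1) * delta for j in range(samples)]
        xs = [x for x in xs if x > 0]  # stay inside the domain (0, 1]
        if all(abs(f(x) - f(x0)) < eps for x in xs):
            return delta
        delta *= shrink
    return 0.0

f = lambda x: 1 / x
for x0 in (0.5, 0.1, 0.01, 0.001):
    print(x0, working_delta(f, x0, eps=0.1))
# the usable delta shrinks towards 0 as x0 -> 0, so no single delta
# works for the whole interval: f is not uniformly continuous there
```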
Luckily it turns out that every continuous function on a compact interval is uniformly continuous. This is known as the Heine–Cantor theorem.
The proof idea: suppose \(f\) is continuous on a compact interval but not uniformly continuous. Then there is an \(\varepsilon>0\) and sequences \((x_n)\), \((y_n)\) with \(|x_n-y_n|<\tfrac{1}{n}\) but \(|f(x_n)-f(y_n)|\ge\varepsilon\). By the Bolzano–Weierstrass theorem a subsequence of \((x_n)\) converges to some point \(c\) in the interval, and the corresponding \(y_n\) converge to the same \(c\). Continuity of \(f\) at \(c\) then forces \(|f(x_n)-f(y_n)|\to 0\), a contradiction.
As an example, let \(f(x)=\frac{1}{x}\) on \((0,1]\). It is continuous on \((0,1]\) but not uniformly continuous there: as \(x_0\) approaches \(0\) the graph becomes arbitrarily steep, so the \(\delta\) that works at \(x_0\) must shrink to \(0\) and no single \(\delta\) works for all points.
On any compact interval \([c,1]\) with \(c>0\), however, \(f\) becomes uniformly continuous.
As mentioned above, every continuous function on a closed interval \([a,b]\) is uniformly continuous. This is a very important property of continuous functions and is often used in analysis. This is also the reason why we can conclude that every continuous function on a closed interval is Riemann integrable, so we have for every continuous function \(f:[a,b]\to\mathbb R\):
\[f:[a,b]\to\mathbb R\text{ is continuous} \quad\Longrightarrow\quad \int_a^b f(x)\,dx \text{ exists.} \]The proof combines Riemann’s criterion with uniform continuity: given \(\varepsilon>0\), uniform continuity yields a \(\delta>0\) with \(|f(x)-f(y)|<\frac{\varepsilon}{b-a}\) whenever \(|x-y|<\delta\). For any partition \(P\) with \(\|P\|<\delta\) the difference between supremum and infimum on each subinterval is then at most \(\frac{\varepsilon}{b-a}\), so \(S_{\text{upper}}(f,P)-S_{\text{lower}}(f,P)\le\frac{\varepsilon}{b-a}(b-a)=\varepsilon\).
However, the converse is not true, i.e. not every Riemann integrable function is continuous.
We revisit \(f(x)=x^{2}\) on the interval \([0,1]\). Every polynomial is continuous on \(\mathbb R\), hence also on any closed interval. By linearity we also know it is Riemann integrable on \([0,1]\). If we didn’t know that, we could still have found the integral with the knowledge that \(f\) is continuous on \([0,1]\) and hence uniformly continuous there.
To calculate the integral we use the uniform partition:
\[P_n=\Bigl\{x_k=\frac{k}{n}\,\Big|\,k=0,1,\dots,n\Bigr\},\qquad \|P_n\|=\frac1n. \]And because on the interval \([0,1]\) the function \(f(x)=x^{2}\) is increasing, we can use the right endpoints as sample points for our Riemann sums.
\[\begin{align*} S\bigl(f,P_n\bigr) &=\sum_{k=1}^{n} f(x_k)\,\Delta x_k =\sum_{k=1}^{n}\!\Bigl(\tfrac{k}{n}\Bigr)^2\!\cdot\!\frac1n \\ &=\frac{1}{n^{3}}\sum_{k=1}^{n}k^{2}. \end{align*} \]Using the slightly less well-known cousin of the Gaussian sum formula, the formula for the sum of squares:
\[\sum_{k=1}^{n}k^{2}=\frac{n(n+1)(2n+1)}{6} \]We get:
\[S\bigl(f,P_n\bigr)= \frac{1}{n^{3}}\cdot\frac{n(n+1)(2n+1)}{6} = \frac{(n+1)(2n+1)}{6n^{2}}. \]If we then take the limit as \(n\) goes to infinity, so the partition gets finer and finer and \(\|P_n\|\to 0\), we get the Riemann integral:
\[\begin{aligned} \int_{0}^{1}x^{2}\,dx &=\lim_{n\to\infty}S(f,P_n) \\ &=\lim_{n\to\infty}\frac{(n+1)(2n+1)}{6n^{2}} \\ &=\lim_{n\to\infty}\frac{2n^{2}+3n+1}{6n^{2}} \\ &=\frac{2}{6} =\frac{1}{3}. \end{aligned} \]For later on: An antiderivative of \(x^{2}\) is \(\tfrac{x^{3}}{3}\); evaluating it at the endpoints using the Fundamental Theorem of Calculus gives us:
\[\Bigl[\tfrac{x^{3}}{3}\Bigr]_{0}^{1}=\frac{1}{3}-0=\frac{1}{3}, \]confirming the Riemann-sum result.
We can define the following piecewise function:
\[h(x)=\begin{cases} \sin x,&0\le x<\pi,\\ 1,&\pi\le x\le 2\pi. \end{cases} \]The function \(h\) has a single jump at \(x=\pi\). However, it is still Riemann integrable on the interval \([0,2\pi]\) because the set of discontinuities (the single point \(\{\pi\}\)) has measure zero. We can compute the integral of \(h\) over the interval \([0,2\pi]\) by splitting it into two parts, one for each piece of the function:
\[\int_{0}^{2\pi}\!h(x)\,dx =\int_{0}^{\pi}\!\sin x\,dx +\int_{\pi}^{2\pi}\!1\,dx =\bigl[{-\cos x}\bigr]_{0}^{\pi}+(\pi) =(1+1)+\pi =2+\pi. \]A classical counter-example for the converse is Thomae’s function which is defined as follows:
\[t(x)=\begin{cases} \frac1q,&x=\frac{p}{q}\text{ in lowest terms},\\[6pt] 0,&x\notin\mathbb Q, \end{cases}\qquad x\in[0,1], \]The function is discontinuous at every rational point in the interval \([0,1]\) and continuous at every irrational point. However, it is still Riemann integrable on the interval \([0,1]\) because the set of discontinuities (the rationals in \([0,1]\)) has measure zero which leads to the following integral:
\[\int_0^1 t(x)\,dx = 0. \]
Useful Inequalities
When dealing with Riemann integrals, there are several useful inequalities that can help us understand the behavior of integrals and their relationships just like the properties of limits, derivatives and integrals mentioned above.
The first is the so called monotonicity inequality which states that if \(f\) and \(g\) are Riemann integrable functions on the interval \([a,b]\) and \(f(x)\le g(x)\) for every \(x\in[a,b]\), then the integral of \(f\) is less than or equal to the integral of \(g\):
\[f(x) \leq g(x)\quad\forall x\in[a,b] \quad\Longrightarrow\quad \int_{a}^{b}f(x)\,dx \;\le\;\int_{a}^{b}g(x)\,dx. \]The intuition behind this is rather clear. If both functions are positive, their graphs are stacked one above the other: every vertical slice at \(x\) has \(f\)‘s height below \(g\)‘s height, so each slender rectangle forming the “\(f\)-area” sits completely inside the corresponding “\(g\)-area”. A similar argument holds if the graphs are negative, or if one is positive and the other negative.
The next inequality is the absolute-value inequality. This inequality is similar to the triangle inequality for vectors and sums. It states that the absolute value of the integral of a function is less than or equal to the integral of the absolute value of the function:
\[\Bigl|\int_{a}^{b}f(x)\,dx\Bigr| \;\le\;\int_{a}^{b}|f(x)|\,dx. \]The intuition comes from applying the triangle inequality to the Riemann sums: in the integral of \(f\), rectangles below the x-axis can cancel rectangles above it, while in the integral of \(|f|\) every rectangle counts positively, so the right-hand side is at least as large.
The last inequality is a bit more complex. We know that the integrable functions form a vector space. For the real vector space there is the Cauchy–Schwarz inequality which states that for any two vectors \(\mathbf{v},\mathbf{w}\in\mathbb R^{n}\), the following holds:
\[\mathbf{v}\cdot\mathbf{w} \leq \|\mathbf{v}\| \|\mathbf{w}\|, \]where \(\mathbf{v}\cdot\mathbf{w}\) is the dot product of the two vectors, and \(\|\mathbf{v}\|\) and \(\|\mathbf{w}\|\) are the norms of the vectors. For the integrable functions, we can define an inner product as follows:
\[\mathbf{f}\cdot\mathbf{g} = \langle f,g\rangle = \int_{a}^{b}f(x)g(x)\,dx, \]So we can also define the norm of a function as follows:
\[\|\mathbf{f}\| = \sqrt{\int_{a}^{b}f(x)^{2}\,dx}. \]Then the Cauchy–Schwarz inequality for integrals states that for any two integrable functions \(f,g\in\mathcal R([a,b])\), the following holds:
\[\Bigl|\langle f,g\rangle\Bigr| \leq \|f\|\,\|g\|. \]This results in the following Cauchy-Schwarz inequality for integrals:
\[\Bigl|\int_{a}^{b}f(x)g(x)\,dx\Bigr| \leq \Bigl(\int_{a}^{b}f(x)^{2}\,dx\Bigr)^{1/2}\Bigl(\int_{a}^{b}g(x)^{2}\,dx\Bigr)^{1/2}. \]The proof carries over directly from the vector case, since the argument for the Cauchy–Schwarz inequality only uses the inner product properties, which \(\langle f,g\rangle\) as defined above satisfies.
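A quick numeric sanity check of this inequality (a sketch; \(f=\sin\) and \(g(x)=x\) on \([0,\pi]\) are our own picks), approximating all three integrals with midpoint sums:

```python
import math

# Numeric sanity check of Cauchy-Schwarz for integrals on [0, pi].

def midpoint_integral(f, a, b, n=10000):
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

f = math.sin
g = lambda x: x
a, b = 0.0, math.pi

lhs = abs(midpoint_integral(lambda x: f(x) * g(x), a, b))
rhs = math.sqrt(midpoint_integral(lambda x: f(x) ** 2, a, b)) \
    * math.sqrt(midpoint_integral(lambda x: g(x) ** 2, a, b))
print(lhs, "<=", rhs)  # ~3.14 <= ~4.03, the inequality holds
```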
Mean Value Theorem for Integrals
We have already seen the Mean Value Theorem for Derivatives, which states that if a function is continuous on a closed interval \([a,b]\) and differentiable on the open interval \((a,b)\), then there exists at least one point \(c\in(a,b)\) such that:
\[f'(c) = \frac{f(b) - f(a)}{b - a}. \]This is a very important theorem in calculus and has many applications. For integrals there is also a similar theorem known as the Mean Value Theorem for Integrals. It states that if a function \(f\) is continuous on a closed interval \([a,b]\), and therefore also integrable, then there exists at least one point \(c\in[a,b]\) such that:
\[\int_{a}^{b}f(x)\,dx = f(c)(b-a). \]In other words, we can find a point \(c\) in the interval \([a,b]\) such that the area under the curve of the function \(f\) over the interval \([a,b]\) is equal to the area of a rectangle with height \(f(c)\) and width \(b-a\). So not only can we approximate the integral with a single rectangle, the height of that rectangle is the value of the function at some point in the interval.
The proof is short: by the extreme value theorem the continuous function \(f\) attains a minimum \(m\) and a maximum \(M\) on \([a,b]\). From the monotonicity inequality we get \(m(b-a) \leq \int_a^b f(x)\,dx \leq M(b-a)\), so \(\frac{1}{b-a}\int_a^b f(x)\,dx\) lies between \(m\) and \(M\). By the intermediate value theorem \(f\) attains this value at some point \(c\in[a,b]\).
The condition that the function \(f\) is continuous on the interval \([a,b]\) is very important. We can easily show this with the following example for \(f: [0,1]\to\mathbb R\):
\[f(x)=\begin{cases} 0 & \text{if } 0 \leq x < \frac{1}{2}, \\ 1 & \text{if } \frac{1}{2} \leq x \leq 1. \end{cases} \]This function is not continuous at \(x=\frac{1}{2}\), but it is still Riemann integrable on the interval \([0,1]\) because the set of discontinuities (the single point \(\{\frac{1}{2}\}\)) has measure zero:
\[\int_{0}^{1}f(x)\,dx = \int_{0}^{\frac{1}{2}}0\,dx + \int_{\frac{1}{2}}^{1}1\,dx = 0 + \frac{1}{2} = \frac{1}{2}. \]However, there is no point \(c\in[0,1]\) such that \(\frac{1}{2} = f(c)(1-0) = f(c)\), because \(f(c)\) is either \(0\) or \(1\) for all \(c\in[0,1]\). So the Mean Value Theorem for Integrals does not hold for this function.
Cauchy also extended the Mean Value Theorem for Integrals to two functions. So if \(f,g: [a,b]\to\mathbb R\) are continuous and bounded (therefore also Riemann integrable) on the interval \([a,b]\) and \(g(x) \geq 0\) for all \(x\in[a,b]\), then there exists at least one point \(c\in[a,b]\) such that:
\[\int_{a}^{b}f(x)g(x)\,dx = f(c)\int_{a}^{b}g(x)\,dx. \]First notice that this is a generalization of the Mean Value Theorem for Integrals, because if we take \(g(x) = 1\) for all \(x\in[a,b]\), then we get the original theorem.
What is the intuition behind this? Because \(g(x) \geq 0\), the integral \(\int_a^b f(x)g(x)\,dx\) is a weighted average of the values of \(f\), with \(g\) acting as the weight. A weighted average of the values of a continuous function lies between its minimum \(m\) and maximum \(M\), so it must be attained at some point \(c\).
The proof follows the same pattern as before: from \(m \leq f(x) \leq M\) and \(g(x)\ge 0\) we get \(m\int_a^b g(x)\,dx \leq \int_a^b f(x)g(x)\,dx \leq M\int_a^b g(x)\,dx\). If \(\int_a^b g(x)\,dx = 0\) both sides are zero and any \(c\) works; otherwise \(\frac{\int_a^b f(x)g(x)\,dx}{\int_a^b g(x)\,dx}\) lies in \([m,M]\) and the intermediate value theorem gives the point \(c\).
Antiderivatives
As mentioned in the introduction there are two motivations for the integral. The first one was to find the area under a curve, which resulted in us defining the definite integral as the limit of Riemann sums. This is a very important concept in calculus and has many applications in physics, engineering, and other fields.
The second motivation is that given the derivative of a function, can we reconstruct the original function? This is what results in the antiderivative or primitive function. The antiderivative is a function that, when differentiated, gives us the original function. This is the inverse operation of differentiation and is also known as integration. More formally we say that a function \(F:I\to\mathbb R\) is an antiderivative of a function \(f:I\to\mathbb R\) on an interval \(I\) if it is continuous and differentiable on \(I\) and satisfies the following condition:
\[F'(x) = f(x)\quad\forall x\in I. \]This leads us to the main theorem of calculus, which states that if we define the antiderivative \(F\) of a continuous function \(f\) as:
\[F(x) = \int_{a}^{x}f(t)\,dt\quad \forall x\in[a,b]. \]then \(F\) is continuous and differentiable on the interval \([a,b]\) and the derivative of \(F\) is equal to the function \(f\):
\[F'(x) = f(x)\quad \forall x\in[a,b]. \]The intuition behind defining the antiderivative as the integral of the function \(f\) from a fixed point \(a\) to a variable upper limit \(x\) is that we are accumulating the values of the function \(f\) over the interval \([a,x]\) like a sliding window. As the window slides, i.e. as we move the upper limit \(x\) along the interval, we keep a running total of the area between the graph of \(f(t)\) and the t-axis.
So if we then take a tiny step from \(x\) to \(x + h\) then our sliding window scoops up a thin vertical strip whose width is approximately \(h\) and whose height is approximately \(f(x)\) (the function’s value doesn’t change much over such a small span). So the extra area you gain is roughly the rectangle’s area, \(f(x) h\). If we then let that step size \(h\) go to zero, the rate at which \(F\) grew over that tiny move is:
\[\lim_{h\to 0}\frac{F(x+h) - F(x)}{h} = \lim_{h\to 0}\frac{\int_{a}^{x+h}f(t)\,dt - \int_{a}^{x}f(t)\,dt}{h} = \lim_{h\to 0}\frac{\int_{x}^{x+h}f(t)\,dt}{h}. \]Which leads us to the definition of the derivative of \(F\) at the point \(x\), showing that the derivative of the antiderivative \(F\) is equal to the function \(f\).
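We can see this accumulation idea at work numerically. A sketch (the function \(f=\cos\) on \([0,\pi]\) is our own pick) that builds \(F\) as a running sum and then differentiates it by finite differences:

```python
import math

# Build F(x) = integral of f from a to x as a running midpoint sum,
# then check that the finite-difference slope of F reproduces f itself.

def cumulative_integral(f, a, b, n):
    h = (b - a) / n
    xs, F, total = [a], [0.0], 0.0
    for i in range(n):
        total += f(a + (i + 0.5) * h) * h  # area of one thin strip
        xs.append(a + (i + 1) * h)
        F.append(total)
    return xs, F

xs, F = cumulative_integral(math.cos, 0.0, math.pi, 10000)
i = 2500  # a grid point near pi/4
slope = (F[i + 1] - F[i - 1]) / (xs[i + 1] - xs[i - 1])
print(slope, math.cos(xs[i]))  # the two values nearly coincide
```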
To summarize, the main theorem of calculus states that the integral and the derivative are inverse operations and the following holds:
\[F(x) = \int_{a}^{x}f(t)\,dt\quad \forall x\in[a,b] \qquad\text{and} \qquad F'(x) = f(x)\quad \forall x\in[a,b]. \]To prove the main theorem of calculus, we need to show that the function \(F\) defined as the integral of \(f\) from a fixed point \(a\) to a variable upper limit \(x\) is continuous and differentiable on the interval \([a,b]\), and that its derivative is equal to the function \(f\). So we need to show the following limit exists:
\[\lim_{x \to x_0} \frac{F(x) - F(x_0)}{x - x_0} = f(x_0). \]First we can calculate the difference of the two points as follows:
\[F(x) - F(x_0) = \int_{a}^{x}f(t)\,dt - \int_{a}^{x_0}f(t)\,dt = \int_{x_0}^{x}f(t)\,dt. \]From the mean value theorem for integrals we know that there exists a point \(c\in[x_0,x]\) such that:
\[\int_{x_0}^{x}f(t)\,dt = f(c)(x - x_0). \]This means that we can rewrite the difference as follows:
\[\begin{align*} F(x) - F(x_0) = f(c)(x - x_0) \\ \frac{F(x) - F(x_0)}{x - x_0} = f(c). \end{align*} \]Now we can again take the limit as \(x\) approaches \(x_0\). Importantly as \(x\) approaches \(x_0\), the point \(c\) also approaches \(x_0\) because \(c\) is always between \(x_0\) and \(x\). So we can write:
\[\begin{align*} F'(x_0) &= \lim_{x \to x_0} \frac{F(x) - F(x_0)}{x - x_0} \\ &= \lim_{x \to x_0} f(c) \\ &= f(x_0). \end{align*} \]
Indefinite Integrals
Because of the main theorem of calculus we know that every continuous function \(f\) on the interval \([a,b]\) has at least one antiderivative \(F\). Specifically:
\[F(x) = \int_{a}^{x}f(t)\,dt\quad \forall x\in[a,b]. \]Now assume there are two antiderivatives \(F\) and \(G\) of \(f\) on the interval \([a,b]\), so \(F'(x) = G'(x) = f(x)\) for all \(x \in [a,b]\). Then we can define a new function \(H\) as the difference of the two antiderivatives:
\[H(x) = F(x) - G(x)\quad \forall x\in[a,b]. \]If we would take the derivative of this new function \(H\), we would get:
\[\begin{align*} H'(x) &= F'(x) - G'(x) \\ &= f(x) - f(x) \\ &= 0\quad \forall x\in[a,b]. \end{align*} \]This means that the derivative of \(H\) is zero for all \(x\in[a,b]\). Because the derivative exists we can apply the Mean Value Theorem for derivatives. So for any \(x_1,x_2\in[a,b]\) with \(x_1<x_2\), the MVT gives a point \(c\in(x_1,x_2)\) such that:
\[\begin{align*} H'(c) &= \frac{H(x_2) - H(x_1)}{x_2 - x_1} \\ &= 0. \end{align*} \]Since the choice of \(x_1,x_2\) is arbitrary for the above, \(H\) takes the same value everywhere on \([a,b]\). Hence \(H(x)=C\) for some constant \(C\in\mathbb R\) which results in the following:
\[\begin{align*} H(x) &= C \\ F(x) - G(x) &= C \\ F(x) &= G(x) + C \quad \forall x\in[a,b]. \end{align*} \]This shows that any two antiderivatives of a continuous function differ by nothing more than the additive constant \(C\). This constant is the so-called integration constant. This means that the antiderivative is not unique, but rather unique up to a constant. This matches with our idea of the Integral as the inverse operation of the derivative. As we have:
\[(2x)' = 2 = (2x + 1)' = (2x + 2)' = (2x + C)' \quad\forall C\in\mathbb R. \]So we actually have an infinite number of antiderivatives for a given function \(f\) on the interval \([a,b]\). Hence we can write the following:
\[\int f(x)\,dx = F(x) + C \]This is called the indefinite integral of the function \(f\) and represents the family of all antiderivatives of \(f\) on the interval \([a,b]\).
Fundamental Theorem of Calculus
We can now connect the two motivations for the integral, the area under a curve and the antiderivative to make the Fundamental Theorem of Calculus (FTC). The FTC states that if \(f\) is a continuous function on the interval \([a,b]\), then the following holds:
\[\int_{a}^{b}f(x)\,dx = F(b) - F(a) \]This means that the definite integral of a function \(f\) over the interval \([a,b]\) can be computed by just finding an antiderivative \(F\) of \(f\) and evaluating it at the endpoints of the interval. This is a very powerful result because it allows us to compute integrals without having to compute Riemann sums or limits and it combines all the different aspects of calculus in a single theorem.
For the proof of the Fundamental Theorem of Calculus we first note the following:
\[\begin{align*} F(x) = \int_{a}^{x}f(t)\,dt \\ F(b) = \int_{a}^{b}f(t)\,dt \\ F(a) = \int_{a}^{a}f(t)\,dt = 0. \end{align*} \]Therefore we can write the definite integral as follows:
\[\int_{a}^{b}f(x)\,dx = F(b) - 0 = F(b) - F(a). \]We also know that for any antiderivative \(G\) we have \(G = F + C\) for some constant \(C\). Using this we get:
\[\begin{align*} G(b) = F(b) + C \\ G(a) = F(a) + C \\ G(b) - G(a) = (F(b) + C) - (F(a) + C) = F(b) - F(a). \end{align*} \]Putting it all together we get for any antiderivative \(G\) of \(f\) on the interval \([a,b]\):
\[\int_{a}^{b}f(x)\,dx = G(b) - G(a). \]We can now easily calculate the integral of the function \(f(x)=x\) on the interval \([1,2]\) using the Fundamental Theorem of Calculus:
\[\begin{align*} \int_{1}^{2}x\,dx &= \Bigl[\tfrac{1}{2}x^{2}\Bigr]_{1}^{2} \\ &= \tfrac{1}{2}\cdot 2^{2} - \tfrac{1}{2}\cdot 1^{2} \\ &= \tfrac{4}{2} - \tfrac{1}{2} \\ &= \tfrac{3}{2}. \end{align*} \]The reason for the fraction \(\tfrac{1}{2}\) is the power rule: when differentiating, the exponent 2 comes down as a factor and the exponent is reduced by one, so the factor \(\tfrac{1}{2}\) is there to cancel it out.
We can also calculate the integral of the function \(f(x)=x^2\) on the interval \([0,1]\) using the Fundamental Theorem of Calculus and verify the result from the example above:
\[\begin{align*} \int_{0}^{1}x^{2}\,dx &= \Bigl[\tfrac{1}{3}x^{3}\Bigr]_{0}^{1} \\ &= \tfrac{1}{3}\cdot 1^{3} - \tfrac{1}{3}\cdot 0^{3} \\ &= \tfrac{1}{3} - 0 \\ &= \tfrac{1}{3}. \end{align*} \]We can also calculate more tricky integrals using the Fundamental Theorem of Calculus. For example, we can calculate the integral of the function \(f(x)=\cos(x)\) on the interval \([0, \frac{\pi}{2}]\):
\[\begin{align*} \int_{0}^{\frac{\pi}{2}} \cos(x) \, dx &= \Bigl[\sin(x)\Bigr]_{0}^{\frac{\pi}{2}} \\ &= \sin(\frac{\pi}{2}) - \sin(0) \\ &= 1 - 0 = 1. \end{align*} \]We can also look at the partner of cosine, the sine function. For example, we can calculate the integral of the function \(f(x)=\sin(x)\) on the interval \([0, \pi]\):
\[\begin{align*} \int_{0}^{\pi} \sin(x) \, dx &= \Bigl[-\cos(x)\Bigr]_{0}^{\pi} \\ &= -\cos(\pi) - (-\cos(0)) \\ &= -(-1) - (-1) \\ &= 1 - (-1) \\ &= 1 + 1 \\ &= 2. \end{align*} \]
Integration by Parts
As mentioned a lot of the integration rules and methods can be derived from the rules of differentiation. One such method is integration by parts. This method is based on the product rule of differentiation and is used to integrate the product of two functions which is why it is also known as partial integration or product integration. It is particularly useful when one of the functions is easy to differentiate and the other is easy to integrate for example \(h(x) = x \cdot e^x\). Or in general when the integrand can be expressed or rewritten as a product of two functions:
\[\int{h(x)\,dx}=\int{f(x) \cdot g(x)\,dx} \]where \(f(x)\) is a function that is easy to differentiate and \(g(x)\) is a function that is easy to integrate. The integration by parts formula is then given by:
\[\int{f(x)\cdot g'(x)\,dx} = f(x)\cdot g(x) - \int{f'(x)\cdot g(x)\,dx} \]This formula is derived from the product rule of differentiation. We can rewrite the product rule as follows:
\[\begin{align*} (f(x) \cdot g(x))' &= f'(x) \cdot g(x) + f(x) \cdot g'(x) \\ f'(x) \cdot g(x) &= (f(x) \cdot g(x))' - f(x) \cdot g'(x). \end{align*} \]If we then take the integral of both sides and using the fact that the integral of a derivative is the original function (up to a constant), we get:
\[\begin{align*} \int{f'(x) \cdot g(x)\,dx} &= \int\Bigl((f(x) \cdot g(x))' - f(x) \cdot g'(x)\Bigr)\,dx \\ &= \int{(f(x) \cdot g(x))'\,dx} - \int{f(x) \cdot g'(x)\,dx} \\ &= f(x) \cdot g(x) - \int{f(x) \cdot g'(x)\,dx} + C, \end{align*} \]For a definite integral, the corresponding formula is
\[\int_a^b{f(x)\cdot g'(x)\,dx} = \Bigl[f(x)\cdot g(x)\Bigr]_a^b - \int_a^b{f'(x)\cdot g(x)\,dx} \]To prove this we define the function \(H\) as follows:
\[H(x) = f(x) \cdot g(x). \]We then know that if we differentiate \(H\) using the product rule, we get:
\[H'(x) = f'(x) \cdot g(x) + f(x) \cdot g'(x). \]Because we know that \(H\) is an antiderivative of the above summation, we can write the following using the Fundamental Theorem of Calculus:
\[\begin{align*} \int_a^b{H'(x)\,dx} &= \int_a^b{\bigl(f'(x) \cdot g(x) + f(x) \cdot g'(x)\bigr)\,dx} \\ &= H(b) - H(a) \\ &= f(b) \cdot g(b) - f(a) \cdot g(a). \end{align*} \]If we then want to integrate just one of the two summands, say \(f' \cdot g\), we can use the above equation to rewrite it as follows:
\[\begin{align*} \int_a^b{f'(x) \cdot g(x)\,dx} + \int_a^b{f(x) \cdot g'(x)\,dx} &= H(b) - H(a) \\ \int_a^b{f'(x) \cdot g(x)\,dx} &= H(b) - H(a) - \int_a^b{f(x) \cdot g'(x)\,dx} \\ \int_a^b{f'(x) \cdot g(x)\,dx} &= f(b) \cdot g(b) - f(a) \cdot g(a) - \int_a^b{f(x) \cdot g'(x)\,dx}. \end{align*} \]Which gives us the integration by parts formula.
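Before working an example by hand, we can sanity-check the formula symbolically. A minimal sketch assuming the SymPy library is available (the integrand \(x\cos(x)\) is our own pick, different from the examples below):

```python
import sympy as sp

x = sp.symbols("x")
f, g_prime = x, sp.cos(x)      # integrate f * g' = x * cos(x) by parts
g = sp.integrate(g_prime, x)   # g(x) = sin(x)

# f*g minus the integral of f' * g, per the integration by parts formula
by_parts = f * g - sp.integrate(sp.diff(f, x) * g, x)
direct = sp.integrate(f * g_prime, x)

print(sp.simplify(by_parts - direct))  # 0: the two antiderivatives agree
```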
We want to solve the following problem which would be rather difficult to solve otherwise:
\[\int_{0}^{1}{x\cdot e^x \, dx} \]First we split the integrand as described above and get:
\[\begin{align*} f(x) &= x\\ f'(x) &= 1\\ g'(x) &= e^x\\ g(x) &= e^x \end{align*} \]Using the integration by parts formula we can now compute the integral:
\[\begin{align*} \int_{0}^{1}{x\cdot e^x \, dx} &= \Bigl[x\cdot e^x\Bigr]_{0}^{1} - \int_{0}^{1}{1\cdot e^x \, dx} \\ &= \Bigl[1\cdot e^1 - 0\cdot e^0\Bigr] - \Bigl[e^x\Bigr]_{0}^{1} \\ &= e - (e^1 - e^0) \\ &= e - (e - 1) \\ &= e - e + 1 \\ &= 1. \end{align*} \]Because multiplication is commutative, we can also swap the order of the factors in the integrand, i.e. we can also write:
\[\int{x\cdot e^x \, dx} = \int{e^x \cdot x \, dx} \]and then use the same method with:
\[\begin{align*} f(x) &= e^x\\ f'(x) &= e^x\\ g'(x) &= x\\ g(x) &= \frac{1}{2}x^2 \end{align*} \]However, we can see that this is a suboptimal choice because it quickly gets a bit complicated to find the antiderivative:
\[\begin{align*} \int_{0}^{1}{e^x \cdot x \, dx} &= \Bigl[e^x \cdot \frac{1}{2}x^2\Bigr]_{0}^{1} - \int_{0}^{1}{e^x \cdot \frac{1}{2}x^2 \, dx} \\ &= \Bigl[e^1 \cdot \frac{1}{2}\cdot 1^2 - e^0 \cdot \frac{1}{2}\cdot 0^2\Bigr] - \frac{1}{2}\int_{0}^{1}{e^x \cdot x^2 \, dx} \\ &= \frac{e}{2} - \frac{1}{2}\int_{0}^{1}{e^x \cdot x^2 \, dx}. \end{align*} \]The remaining integral now contains \(x^2\) instead of \(x\), so this choice made the problem harder rather than easier. A nice trick is to rewrite a function as a product so that we can use integration by parts. For example, we can rewrite the integral of the function \(f(x)=\ln(x)\) as follows:
\[\int \ln(x)\,dx = \int \ln(x) \cdot 1\,dx. \]This allows us to use integration by parts with the following choices:
\[\begin{align*} f(x) &= \ln(x)\\ f'(x) &= \frac{1}{x}\\ g'(x) &= 1\\ g(x) &= x. \end{align*} \]This gives us the following integral:
\[\begin{align*} \int \ln(x)\,dx &= \Bigl[\ln(x) \cdot x\Bigr] - \int \frac{1}{x} \cdot x\,dx \\ &= \Bigl[\ln(x) \cdot x\Bigr] - \int 1\,dx \\ &= \Bigl[\ln(x) \cdot x\Bigr] - x + C \\ &= x \cdot \ln(x) - x + C. \end{align*} \]
Integration by Substitution
Another important method for solving integrals is integration by substitution. This method is also known as u-substitution or change of variables. The method comes from the chain rule of differentiation and is used to simplify integrals by changing the variable of integration. It is particularly useful when the integrand is a composition of functions, i.e. when it can be expressed as \(f(g(x))\) where \(g(x)\) is a function whose derivative also appears in the integrand. So we can best use this method when we have an integral in the form:
\[\int{f(g(x))\cdot g'(x)\,dx} \]The idea is that with a suitable change of variable \(u = g(x)\), we can transform the integral into one that is easier to evaluate. The formula for definite integrals, where \(g:[a,b]\to\mathbb R\) is continuously differentiable with image \(g([a,b]) \subseteq I\) and \(f:I\to\mathbb R\) is continuous, is:
\[\int_{a}^{b}{f(g(x))\cdot g'(x)\,dx} = \int_{g(a)}^{g(b)}{f(u)\,du} \]and for indefinite integrals:
\[\int{f(g(x))\cdot g'(x)\,dx} = \int{f(u)\,du} = F(u) + C = F(g(x)) + C, \]You can think of the substitution as re-labelling the \(x\)-axis. The point \(x=a\) maps to \(u=g(a)\), the point \(x=b\) maps to \(u=g(b)\), and the differential \(dx\) transforms to \(du\) via the chain rule:
\[du = g'(x)\,dx \quad\Longrightarrow\quad \frac{du}{dx} = g'(x) \quad\Longrightarrow\quad dx = \frac{du}{g'(x)}. \]For the proof: because \(f\) is continuous, there exists an antiderivative \(H\). We also know that \(g\) is continuously differentiable, so we can apply the chain rule to get the following:
\[(H \circ g)'(x) = H'(g(x)) \cdot g'(x) = f(g(x)) \cdot g'(x). \]If we then also integrate both sides and use the Fundamental Theorem of Calculus, we get:
\[\begin{align*} \int_a^b{f(g(x))\cdot g'(x)\,dx} &= \int_a^b{(H \circ g)'(x)\,dx} \\ &= (H \circ g)(b) - (H \circ g)(a) \\ &= H(g(b)) - H(g(a)) \\ &= \int_{g(a)}^{g(b)}{f(u)\,du}. \end{align*} \]Where \(H\) is an antiderivative of \(f\) and \(u=g(x)\).
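The change-of-variables identity is also easy to check numerically. A small sketch (the choices \(f=\cos\) and \(g(x)=x^2\) are our own), approximating both sides with midpoint sums:

```python
import math

# Check the substitution rule numerically with f = cos and g(x) = x^2:
# the two sides of the formula should agree.

def midpoint_integral(f, a, b, n=100000):
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

g = lambda x: x * x
g_prime = lambda x: 2 * x

lhs = midpoint_integral(lambda x: math.cos(g(x)) * g_prime(x), 0.0, 1.0)
rhs = midpoint_integral(math.cos, g(0.0), g(1.0))  # integral of cos(u) on [g(0), g(1)]
print(lhs, rhs)  # both are ~sin(1) = 0.8414...
```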
Let’s look at an example of how to use integration by substitution to solve an integral. We want to solve the following integral:
\[\int_{0}^{1} (1+x^{2})^{2025}\,2x\,dx. \]We can set \(u = 1 + x^2\), then we have \(du = 2x\,dx\), which gives us \(dx = \frac{du}{2x}\).
\[\begin{align*} \int_{x=0}^{x=1} (1+x^{2})^{2025}\,2x\,dx &= \int_{u(0)}^{u(1)} u^{2025}\,du \\ &= \int_{1}^{2} u^{2025}\,du. \end{align*} \]We then take the antiderivative and evaluate it at the new limits to get the final result:
\[\int_{1}^{2} u^{2025}\,du = \Bigl[\frac{1}{2026}u^{2026}\Bigr]_{1}^{2} = \frac{1}{2026}\Bigl(2^{2026} - 1^{2026}\Bigr) = \frac{1}{2026}(2^{2026} - 1). \]We can also calculate the following integral:
\[\int{(2x^3 + 1)^7 x^2 \,dx}. \]We can use integration by substitution to solve this integral. We can set \(u = 2x^3 + 1\). If we take the derivative we get \(\frac{du}{dx} = 6x^2\), which gives us \(dx = \frac{du}{6x^2}\). Using this substitution we can rewrite the integral as follows:
\[\begin{align*} \int{(2x^3 + 1)^7 x^2 \,dx} &= \int{u^7 \cdot x^2 \cdot \frac{du}{6x^2}} \\ &= \int{\frac{1}{6}u^7 \,du} \\ &= \frac{1}{6} \cdot \frac{u^8}{8} + C \end{align*} \]We can then substitute back \(u = 2x^3 + 1\) to get the final result:
\[\begin{align*} \frac{1}{6} \cdot \frac{(2x^3 + 1)^8}{8} + C &= \frac{(2x^3 + 1)^8}{48} + C. \end{align*} \]We can also check our result by differentiating the result and see if we get the original integrand back:
\[\begin{align*} \Bigl(\frac{(2x^3 + 1)^8}{48} + C\Bigr)' &= \frac{1}{48} \cdot 8(2x^3 + 1)^7 \cdot (6x^2) \\ &= (2x^3 + 1)^7 x^2. \end{align*} \]Let’s look at another example of how to use integration by substitution to solve an integral. We want to solve the following integral:
\[\int_{0}^{1}{x\cdot \sqrt{1+x^2}\,dx} \]
Next we replace the limits. For the lower limit \(x = 0\), so \(u = 1+0^2 = 1\). For the upper limit \(x = 1\), so \(u = 1+1^2 = 2\). Now we substitute everything:
\[\int_{0}^{1}{x\cdot \sqrt{1+x^2}\,dx} =\int_{u=1}^{u=2}{x\cdot \sqrt{u}\,\frac{du}{2x}} =\int_{1}^{2}{\frac{1}{2}\cdot \sqrt{u}\,du} =\frac{1}{2}\int_{1}^{2}{u^{\frac{1}{2}}\,du} \]Because \(\bigl(\tfrac{2}{3}u^{\frac{3}{2}}\bigr)' = u^{\frac{1}{2}}\), we write
\[\frac{1}{2}\cdot\Bigl[\frac{2}{3}u^{\frac{3}{2}}\Bigr]_{1}^{2} = \frac{1}{3}\Bigl(2^{\frac{3}{2}} - 1^{\frac{3}{2}}\Bigr) = \frac{1}{3}\Bigl(2\sqrt{2} - 1\Bigr). \]We can also use integration by substitution to solve indefinite integrals. The only difference is that we don’t have to change the limits of integration. For example, we can solve the following integral:
\[\int{x\cdot e^{x^2}\,dx}. \]because it is not readily solvable otherwise. Here \(u(x)=x^2\), whose derivative is \(u'(x)=2x\), but we only have \(x\) and not \(2x\) in the integrand. The reason is the constant factor, which says we can factor out the 2; therefore we may ignore constants in the above condition.
We have already completed the first step by identifying the substitution variable \(u=x^2\). Next, we must replace everything involving the old variable, including the \(dx\). To do this we use the relation:
\[u = x^2 \quad\Longrightarrow\quad u' = 2x \quad\Longrightarrow\quad u' = \frac{du}{dx} \quad\Longrightarrow\quad du = 2x\,dx. \]Now we can substitute everywhere in the integral:
\[\int{x\cdot e^{x^2}\,dx}=\int{x\cdot e^u\, \frac{du}{2x}} \]
\[\int{\frac{e^u\,du}{2}}=\int{\frac{1}{2}\cdot e^u\,du}=\frac{1}{2}\int{e^u\,du} \]We now have a basic integral, and since the derivative/integral of \(e^u\) is \(e^u\), we can solve it:
\[\frac{1}{2}\int{e^u\,du}=\frac{1}{2}e^u + C \]Often we want to return to the original variable, so we perform a back-substitution:
\[\frac{1}{2}e^u + C=\frac{1}{2}e^{x^2} + C \]
Integration using Partial Fractions
Integration using partial fraction decomposition is a specialized technique developed for proper rational functions. If a rational function is improper, it must first be decomposed into a polynomial part and a proper rational part by polynomial division. This transformation is always possible.
A video on polynomial division is available here.
If the degree \(m\) of the denominator is greater than the degree \(n\) of the numerator, the rational function \(f(x)\) is called proper.
\[\text{proper rational function: }f(x)=\frac{x^3+x^2+x+1}{x^4+3x+3} \]\[\text{improper rational function: }f(x)=\frac{x^3+x^2+x+1}{x^2+5x+1} \]Moreover, the denominator should already be factored into linear factors. If this is not the case, it can be achieved quickly by finding the zeros.
A video on factoring into linear factors is available here.
After that, the partial fraction decomposition can be carried out easily, and the integral of each partial fraction can be computed thanks to the sum rule.
You can find a video on the entire process here.
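As a short worked example (our own, independent of the linked videos): the proper rational function \(\frac{1}{x^2-1}\) has the factored denominator \((x-1)(x+1)\) and decomposes as
\[\frac{1}{x^2-1} = \frac{1}{(x-1)(x+1)} = \frac{1/2}{x-1} - \frac{1/2}{x+1}, \]so by the sum rule and the scalar multiplication rule:
\[\int \frac{1}{x^2-1}\,dx = \frac{1}{2}\ln|x-1| - \frac{1}{2}\ln|x+1| + C. \]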
Improper Integrals
So far we have only looked at integrals over compact intervals \([a,b]\) where \(a\) and \(b\) are finite real numbers. However, there are also integrals over non-compact intervals, i.e. intervals that extend to infinity. These integrals are called improper integrals.
\[\int_{a}^{\infty}f(x)\,dx \quad\text{or}\quad \int_{-\infty}^{b}f(x)\,dx \quad\text{or}\quad \int_{-\infty}^{\infty}f(x)\,dx. \]Or also integrals over functions that are not bounded, i.e. where the function \(f\) goes to infinity at some point in the interval \([a,b]\). These integrals are also called improper integrals. For example, we can look at the following integral:
\[\int_{0}^{1}\frac{1}{\sqrt{x}} \,dx \quad\text{or}\quad \int_{0}^{\infty}\frac{1}{(1-x)^2} \,dx. \]As \(x\) approaches \(0\), the function \(\frac{1}{\sqrt{x}}\) goes to infinity, so the integral is improper. However, if we take any \(\epsilon>0\) and look at the interval \([\epsilon,1]\) we get a finite integral:
\[\begin{align*} \int_{\epsilon}^{1}\frac{1}{\sqrt{x}} \,dx &= \Bigl[\frac{x^{\frac{1}{2}}}{\frac{1}{2}}\Bigr]_{\epsilon}^{1} \\ &= \Bigl[2\sqrt{x}\Bigr]_{\epsilon}^{1} \\ &= 2\sqrt{1} - 2\sqrt{\epsilon} \\ &= 2 - 2\sqrt{\epsilon}. \end{align*} \]As \(\epsilon\) approaches \(0\), the integral approaches \(2\). So we then say that the improper integral converges to \(2\):
\[\int_{0}^{1}\frac{1}{\sqrt{x}} \,dx = \lim_{\epsilon\to 0}\int_{\epsilon}^{1}\frac{1}{\sqrt{x}} \,dx = 2. \]From this we can see that improper integrals can be defined as limits of proper integrals. This idea is what leads us also to the definition of improper integrals over non-compact intervals. We define the improper integral for a function \(f: [a, \infty) \to \mathbb R\) that is bounded and integrable on the interval \([a,b]\) for any \(b>a\) as follows:
\[\int_{a}^{\infty}f(x)\,dx = \lim_{b\to\infty}\int_{a}^{b}f(x)\,dx. \]We then say that \(f\) is integrable on \([a, \infty)\) if the improper integral converges. We can also define the improper integral for a function \(f: (-\infty, b] \to \mathbb R\) that is bounded and integrable on the interval \([a,b]\) for any \(a<b\) as follows:
\[\int_{-\infty}^{b}f(x)\,dx = \lim_{a\to-\infty}\int_{a}^{b}f(x)\,dx. \]We then say that \(f\) is integrable on \((-\infty, b]\) if the improper integral converges. If we have an integral over the whole real line, i.e. \((-\infty, \infty)\), we can split it into two parts and use the above definitions:
\[\int_{-\infty}^{\infty}f(x)\,dx = \int_{-\infty}^{a}f(x)\,dx + \int_{a}^{\infty}f(x)\,dx = \lim_{b\to-\infty}\int_{b}^{a}f(x)\,dx + \lim_{c\to\infty}\int_{a}^{c}f(x)\,dx. \]This means that the improper integral converges if both limits exist and therefore the integral converges.
Let’s look at an example of an improper integral. We want to solve the following integral:
\[\int_{0}^{\infty}e^{-x} \,dx. \]We can use the definition of improper integrals to solve this integral:
\[\begin{align*} \int_{0}^{\infty}e^{-x} \,dx &= \lim_{b\to\infty}\int_{0}^{b}e^{-x} \,dx \\ &= \lim_{b\to\infty}\Bigl[-e^{-x}\Bigr]_{0}^{b} \\ &= \lim_{b\to\infty}\Bigl[-e^{-b} - (-e^{0})\Bigr] \\ &= \lim_{b\to\infty}\Bigl[-e^{-b} + 1\Bigr] = 1. \end{align*} \]Another example of an improper integral is the following:
\[\int_{1}^{\infty}\frac{1}{x^c} \,dx. \]Again we can take the limit, but we need to be careful about the value of \(c\):
\[\begin{align*} \int_{1}^{\infty}\frac{1}{x^c} \,dx &= \lim_{b\to\infty}\int_{1}^{b}\frac{1}{x^c} \,dx \\ &= \lim_{b\to\infty} \int_{1}^{b}x^{-c} \,dx \\ &= \lim_{b\to\infty} \begin{cases} \Bigl[\ln(x)\Bigr]_{1}^{b} & \text{if } c=1 \\ \Bigl[\frac{x^{-c +1}}{-c+1}\Bigr]_{1}^{b} & \text{if } c\neq 1 \end{cases} \\ &= \lim_{b\to\infty} \begin{cases} \Bigl[\ln(b) - \ln(1)\Bigr] & \text{if } c=1 \\ \Bigl[\frac{b^{-c +1}}{-c+1} - \frac{1^{-c +1}}{-c+1}\Bigr] & \text{if } c\neq 1 \end{cases}\\ &= \lim_{b\to\infty} \begin{cases} \ln(b) & \text{if } c=1 \\ \frac{b^{-c +1}}{-c+1} - \frac{1}{-c+1} & \text{if } c\neq 1 \end{cases}. \end{align*} \]So if \(c=1\), the integral diverges to infinity. If \(c<1\), the integral also diverges to infinity because \(b^{-c+1}\) grows without bound as \(b\) goes to infinity. If \(c>1\), the integral converges because \(b^{-c+1}\) goes to \(0\) as \(b\) goes to infinity, giving the value \(0-\frac{1}{-c+1}=\frac{1}{c-1}\).
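We can watch this behaviour numerically by computing the truncated integrals \(\int_1^b x^{-c}\,dx\) for growing \(b\) (a sketch with our own choice of values):

```python
import math

# Truncated integrals of 1/x^c on [1, b] via midpoint sums, to watch
# convergence vs divergence as the upper limit b grows.

def integral_to(c, b, n=100000):
    h = (b - 1.0) / n
    return h * sum((1.0 + (i + 0.5) * h) ** (-c) for i in range(n))

for b in (10.0, 100.0, 1000.0):
    print(b, integral_to(1.0, b), integral_to(2.0, b))
# c = 1: the values keep growing like ln(b), the integral diverges
# c = 2: the values approach 1/(2-1) = 1, the integral converges
```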
Direct Comparison Test for Integrals
To see if an improper integral converges or diverges without computing it, we can use the direct comparison test.
Often we cannot give a concrete antiderivative, a classic example being \(e^{-x^2}\). In such cases we compare with a function we can integrate: if \(0 \leq f(x) \leq g(x)\) for all \(x \geq a\) and \(\int_a^\infty g(x)\,dx\) converges, then \(\int_a^\infty f(x)\,dx\) converges as well; conversely, if \(\int_a^\infty f(x)\,dx\) diverges, then so does \(\int_a^\infty g(x)\,dx\). For example, for \(x \geq 1\) we have \(e^{-x^2} \leq e^{-x}\), and since \(\int_1^\infty e^{-x}\,dx\) converges, the integral \(\int_1^\infty e^{-x^2}\,dx\) converges as well.
Integral Test
The integral test uses integrals to see if a series converges or diverges: if \(f:[1,\infty)\to\mathbb R\) is positive and monotonically decreasing, then the series \(\sum_{n=1}^{\infty} f(n)\) converges if and only if the improper integral \(\int_{1}^{\infty} f(x)\,dx\) converges. For example, \(\sum_{n=1}^{\infty} \frac{1}{n^2}\) converges because \(\int_{1}^{\infty}\frac{1}{x^2}\,dx = 1\) converges, while the harmonic series \(\sum_{n=1}^{\infty} \frac{1}{n}\) diverges because \(\int_{1}^{\infty}\frac{1}{x}\,dx\) diverges.
Double Integrals
A good video on how to do this can be found here.