18 Derivatives of Inverse Trig Functions, L'Hopital's Rule, Convex Functions

Lecture from: 29.04.2024 | Video: Video ETHZ

Review: Key Results from the Mean Value Theorem – What the Derivative Tells Us

Let’s quickly remember the main takeaways from the Mean Value Theorem (MVT): If we have a function $f$ that’s nice and smooth (continuous on $[a, b]$ and differentiable on $(a, b)$ ), its derivative $f^{'}$ gives us a lot of information about how $f$ behaves over the whole interval:

If $f^{'} (x)$ is always zero inside the interval, then $f (x)$ must be a constant function. It’s not going up or down.
If $f^{'} (x)$ is always non-negative ( $\geq 0$ ), then $f (x)$ is monotonically increasing (it might flatten out, but it never goes down).
If $f^{'} (x)$ is always strictly positive ( $> 0$ ), then $f (x)$ is strictly monotonically increasing (it’s always going up).
Similarly, if $f^{'} (x)$ is always non-positive ( $\leq 0$ ) or strictly negative ( $< 0$ ), then $f (x)$ is monotonically decreasing or strictly monotonically decreasing, respectively.

And for inverse functions: If $f$ is continuous and strictly monotonic, and differentiable at $x_{0}$ with $f^{'} (x_{0}) \neq 0$ , then its inverse $f^{- 1}$ is also differentiable at the corresponding point $y_{0} = f (x_{0})$ , and its derivative is just the reciprocal: $(f^{- 1})^{'} (y_{0}) = \frac{1}{f^{'} (x_{0})}$ .
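To see the reciprocal rule in numbers, here is a minimal sketch. The function $f (x) = x^{3} + x$ , the bisection-based inverse and the helper names are illustrative assumptions, not something from the lecture.

```python
def f(x):
    # Illustrative choice: strictly increasing, since f'(x) = 3x^2 + 1 > 0.
    return x**3 + x


def f_prime(x):
    return 3 * x**2 + 1


def f_inverse(y, lo=-10.0, hi=10.0, iters=200):
    """Invert f by bisection; this works because f is continuous and strictly increasing."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def central_difference(func, t, h=1e-5):
    """Symmetric finite-difference approximation of func'(t)."""
    return (func(t + h) - func(t - h)) / (2 * h)


x0 = 1.0
y0 = f(x0)                                   # y0 = f(1) = 2
numeric = central_difference(f_inverse, y0)  # slope of the inverse at y0
predicted = 1 / f_prime(x0)                  # reciprocal rule: 1 / f'(1) = 1/4
print(numeric, predicted)
```

Both printed numbers should agree up to finite-difference error.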

Example: Derivatives of Inverse Trigonometric Functions – Putting the Inverse Rule to Work

Let’s see how to find the derivatives of the “arc” functions, which are the inverses of our standard trigonometric functions (when restricted to suitable domains to make them invertible).

1. Arcsine Function ( $arcsin$ or $sin^{- 1}$ )

First, we need to make $sin (x)$ invertible. We do this by looking at it only on the interval $[- π /2, π /2]$ .

On this interval, $sin : [- π /2, π /2] \to [- 1, 1]$ does a few nice things:

It’s strictly monotonically increasing. Why? Because its derivative, $sin^{'} (x) = cos (x)$ , is strictly positive for $x$ between $- π /2$ and $π /2$ (think about the cosine graph in that region).
It hits all the values between $- 1$ (at $x = - π /2$ ) and $1$ (at $x = π /2$ ).

Because it’s strictly increasing and covers this range, it’s bijective (one-to-one and onto) from $[- π /2, π /2]$ to $[- 1, 1]$ .

This means we can define an inverse function, which we call arcsine: $arcsin : [- 1, 1] \to [- π /2, π /2]$ .

Now, for its derivative: Since $sin^{'} (x) = cos (x)$ is not zero on $(- π /2, π /2)$ (the interior of the domain where sine is invertible), arcsine will be differentiable on $(- 1, 1)$ (the interior of its domain).

Let $y = sin (x)$ , where $x \in (- π /2, π /2)$ . Then $x = arcsin (y)$ .

Using our rule for the derivative of an inverse function: $\arcsin^{'} (y) = \frac{1}{\sin^{'} (x)} = \frac{1}{\cos (x)}$ . This is correct, but we usually want the derivative in terms of $y$ . We need to express $\cos (x)$ using $y = \sin (x)$ .

The fundamental identity $\sin^{2} (x) + \cos^{2} (x) = 1$ comes to the rescue. So, $\cos^{2} (x) = 1 - \sin^{2} (x)$ . Since $x$ is in $(- π /2, π /2)$ , $\cos (x)$ is positive, so we take the positive square root: $\cos (x) = \sqrt{1 - \sin^{2} (x)}$ .

Substituting $y = \sin (x)$ , we get $\cos (x) = \sqrt{1 - y^{2}}$ . Plugging this back into our derivative formula: $\arcsin^{'} (y) = \frac{1}{\sqrt{1 - y^{2}}}$ for $y \in (- 1, 1)$ .
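As a quick sanity check, the formula $\arcsin^{'} (y) = 1/ \sqrt{1 - y^{2}}$ can be compared against a finite-difference slope of `math.asin`. The sample points and step size are arbitrary choices for this sketch.

```python
import math


def central_difference(func, t, h=1e-6):
    """Symmetric finite-difference approximation of func'(t)."""
    return (func(t + h) - func(t - h)) / (2 * h)


def arcsin_derivative(y):
    # Closed form from the inverse rule, valid for -1 < y < 1.
    return 1 / math.sqrt(1 - y * y)


samples = (-0.9, -0.5, 0.0, 0.5, 0.9)
max_error = max(
    abs(central_difference(math.asin, y) - arcsin_derivative(y)) for y in samples
)
print(max_error)  # small, limited only by finite-difference error
```

Note how the derivative grows as $y$ approaches $\pm 1$ , which matches the vertical tangents of the arcsine graph at the endpoints.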

2. Arccosine Function ( $arccos$ or $cos^{- 1}$ )

The story is very similar for arccosine. We restrict $cos (x)$ to the interval $[0, π]$ . On this interval, $cos : [0, π] \to [- 1, 1]$ is strictly monotonically decreasing (because its derivative $cos^{'} (x) = - sin (x)$ is negative for $x \in (0, π)$ ). It’s also bijective.

The inverse function is $arccos : [- 1, 1] \to [0, π]$ .

It’s differentiable on $(- 1, 1)$ . If $y = \cos (x)$ for $x \in (0, π)$ : $\arccos^{'} (y) = \frac{1}{\cos^{'} (x)} = \frac{1}{- \sin (x)}$ . Using $\sin^{2} (x) + \cos^{2} (x) = 1$ , and noting that $\sin (x) > 0$ for $x \in (0, π)$ , we have $\sin (x) = \sqrt{1 - \cos^{2} (x)} = \sqrt{1 - y^{2}}$ . So, $\arccos^{'} (y) = - \frac{1}{\sqrt{1 - y^{2}}}$ for $y \in (- 1, 1)$ .
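The same kind of check works for arccosine, whose derivative is $- 1/ \sqrt{1 - y^{2}}$ . The sketch below also looks at the sum $\arcsin (y) + \arccos (y)$ : since the two derivatives cancel, the sum must be constant, and it is in fact $π /2$ . That identity is not stated in the lecture and is used here only as a cross-check.

```python
import math


def central_difference(func, t, h=1e-6):
    """Symmetric finite-difference approximation of func'(t)."""
    return (func(t + h) - func(t - h)) / (2 * h)


def arccos_derivative(y):
    # Closed form from the inverse rule, valid for -1 < y < 1.
    return -1 / math.sqrt(1 - y * y)


samples = (-0.9, -0.5, 0.0, 0.5, 0.9)
max_error = max(
    abs(central_difference(math.acos, y) - arccos_derivative(y)) for y in samples
)

# The derivatives of arcsin and arccos add up to zero, so their sum is constant.
sums = [math.asin(y) + math.acos(y) for y in samples]
print(max_error, sums)
```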

3. Arctangent Function ( $arctan$ or $tan^{- 1}$ )

For the tangent function, $\tan (x) = \frac{\sin (x)}{\cos (x)}$ , we look at the interval $(- π /2, π /2)$ . Its derivative is $\tan^{'} (x) = \frac{1}{\cos^{2} (x)}$ . Since $\cos^{2} (x)$ is positive for $x \in (- π /2, π /2)$ , $\tan^{'} (x)$ is also positive. Thus, $\tan : (- π /2, π /2) \to R$ is strictly monotonically increasing.

As $x$ approaches $π /2$ from the left, $tan (x) \to \infty$ . As $x$ approaches $- π /2$ from the right, $tan (x) \to - \infty$ . This means $tan$ maps $(- π /2, π /2)$ bijectively onto the entire real line $R$ .

The inverse function is $\arctan : R \to (- π /2, π /2)$ . It’s differentiable everywhere on $R$ . If $y = \tan (x)$ for $x \in (- π /2, π /2)$ : $\arctan^{'} (y) = \frac{1}{\tan^{'} (x)} = \frac{1}{1/ \cos^{2} (x)} = \cos^{2} (x)$ .

How to write $cos^{2} (x)$ in terms of $y = tan (x)$ ?

Recall the identity: $1 + \tan^{2} (x) = 1 + \frac{\sin^{2} (x)}{\cos^{2} (x)} = \frac{\cos^{2} (x) + \sin^{2} (x)}{\cos^{2} (x)} = \frac{1}{\cos^{2} (x)}$ .

So, $\cos^{2} (x) = \frac{1}{1 + \tan^{2} (x)}$ .

Substituting $y = \tan (x)$ , we get $\cos^{2} (x) = \frac{1}{1 + y^{2}}$ . Therefore, $\arctan^{'} (y) = \frac{1}{1 + y^{2}}$ for all $y \in R$ .
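A small numerical sketch of the arctangent result, comparing $1/(1 + y^{2})$ with a finite-difference slope of `math.atan`. The sample points (including some large values of $y$ ) are arbitrary choices.

```python
import math


def central_difference(func, t, h=1e-6):
    """Symmetric finite-difference approximation of func'(t)."""
    return (func(t + h) - func(t - h)) / (2 * h)


def arctan_derivative(y):
    # Closed form from the inverse rule, valid for every real y.
    return 1 / (1 + y * y)


samples = (-100.0, -2.0, 0.0, 1.0, 50.0)
max_error = max(
    abs(central_difference(math.atan, y) - arctan_derivative(y)) for y in samples
)
print(max_error)
```

Unlike arcsine and arccosine, this derivative never blows up: it is largest at $y = 0$ (value $1$ ) and decays towards $0$ as $∣ y ∣$ grows.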

L’Hôpital’s Rule: A Clever Trick for Tricky Limits

Sometimes we encounter limits of fractions where both the numerator and the denominator head towards $0$ (a “ $\frac{0}{0}$ form”) or both head towards $\pm \infty$ (an “ $\frac{\infty}{\infty}$ form”). These are called “indeterminate forms” because we can’t tell the limit just by looking. L’Hôpital’s Rule gives us a way to tackle these.

Theorem: L’Hôpital’s Rule

Let $a < b$ . Suppose $f$ and $g$ are differentiable functions from $(a, b)$ to $R$ , and importantly, $g^{'} (x) \neq 0$ for all $x$ in $(a, b)$ .

Now, assume one of these two scenarios as $x$ approaches $b$ from the left (similar statements hold for $x \to a^{+}$ or $x \to x_{0}$ for some $x_{0} \in (a, b)$ ):

The “0/0” case: $lim_{x \to b^{-}} f (x) = 0$ AND $lim_{x \to b^{-}} g (x) = 0$ .
The " $\pm \infty/ \pm \infty$ " case: $lim_{x \to b^{-}} f (x) = \pm \infty$ AND $lim_{x \to b^{-}} g (x) = \pm \infty$ .

IF the limit of the ratio of their derivatives exists: $lim_{x \to b^{-}} \frac{f ^{'} ( x )}{g ^{'} ( x )} = λ$ (where $λ$ can be a finite number, or $\infty$ , or $- \infty$ ).

THEN, L’Hôpital’s Rule says that the limit of the original ratio also exists and is the same: $lim_{x \to b^{-}} \frac{f ( x )}{g ( x )} = λ$

Important Notes:

$λ$ can be a real number, $\infty$ , or $- \infty$ .
The interval $(a, b)$ can be infinite; for example, $a$ could be $- \infty$ or $b$ could be $\infty$ .
The rule also works for limits from the right ( $x \to a^{+}$ ) or two-sided limits ( $x \to x_{0}$ ).

A Quick Intuition for the “0/0” Case

Why should this rule work? Imagine we’re near a point $x_{0}$ where both $f (x_{0}) = 0$ and $g (x_{0}) = 0$ .

Near $x_{0}$ , functions are well-approximated by their tangent lines: $f (x) \approx T_{f} (x) = f (x_{0}) + f^{'} (x_{0}) (x - x_{0}) = f^{'} (x_{0}) (x - x_{0})$ and $g (x) \approx T_{g} (x) = g (x_{0}) + g^{'} (x_{0}) (x - x_{0}) = g^{'} (x_{0}) (x - x_{0})$ .

So, the ratio $\frac{f (x)}{g (x)}$ near $x_{0}$ should be approximately: $\frac{f (x)}{g (x)} \approx \frac{f^{'} (x_{0}) (x - x_{0})}{g^{'} (x_{0}) (x - x_{0})} = \frac{f^{'} (x_{0})}{g^{'} (x_{0})}$ (assuming $g^{'} (x_{0}) \neq 0$ and $x \neq x_{0}$ ).

This suggests that the limit of the ratio of the functions might be the ratio of their derivatives. L’Hôpital’s Rule formalizes this.
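The tangent-line heuristic is easy to watch numerically. In this sketch the pair $f (x) = \sin (3 x)$ , $g (x) = e^{2 x} - 1$ at $x_{0} = 0$ is an illustrative assumption (not an example from the lecture); both vanish at $0$ , and the heuristic predicts the ratio $f^{'} (0) / g^{'} (0) = 3/2$ .

```python
import math


def f(x):
    return math.sin(3 * x)      # f(0) = 0, f'(0) = 3


def g(x):
    return math.expm1(2 * x)    # e^(2x) - 1 computed accurately; g(0) = 0, g'(0) = 2


def ratio(x):
    return f(x) / g(x)


tangent_prediction = 3 / 2      # f'(0) / g'(0)

for x in (1e-1, 1e-2, 1e-3, 1e-6):
    # The ratio drifts towards the prediction as x approaches 0.
    print(x, ratio(x), tangent_prediction)
```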

Examples: L’Hôpital’s Rule in Action

Logarithm vs. Power: For $a > 0$ , what is $\lim_{x \to \infty} \frac{\ln (x)}{x^{a}}$ ? As $x \to \infty$ , $\ln (x) \to \infty$ and $x^{a} \to \infty$ . This is an " $\frac{\infty}{\infty}$ " case. Let $f (x) = \ln (x)$ , so $f^{'} (x) = 1/ x$ . Let $g (x) = x^{a}$ , so $g^{'} (x) = a x^{a - 1}$ .

Now, look at the limit of the ratio of derivatives: $\lim_{x \to \infty} \frac{f^{'} (x)}{g^{'} (x)} = \lim_{x \to \infty} \frac{1/ x}{a x^{a - 1}} = \lim_{x \to \infty} \frac{1}{a x^{a}}$ . Since $a > 0$ , as $x \to \infty$ , $a x^{a} \to \infty$ . So, $\frac{1}{a x^{a}} \to 0$ . Because this limit (0) exists, L’Hôpital’s Rule tells us: $\lim_{x \to \infty} \frac{\ln (x)}{x^{a}} = 0$ .

The takeaway: The natural logarithm $\ln (x)$ grows much slower than any positive power of $x$ .

A $0 \cdot \infty$ form: What is $\lim_{x \to 0^{+}} x \ln (x)$ ? As $x \to 0^{+}$ , $x \to 0$ and $\ln (x) \to - \infty$ . This " $0 \cdot (- \infty)$ " form is indeterminate, so we rewrite it as a fraction before using L’Hôpital’s Rule: $x \ln (x) = \frac{\ln (x)}{1/ x}$ . As $x \to 0^{+}$ , $\ln (x) \to - \infty$ and $1/ x \to \infty$ , so now it’s a " $\frac{- \infty}{\infty}$ " form. Let $f (x) = \ln (x)$ , so $f^{'} (x) = 1/ x$ . Let $g (x) = 1/ x$ , so $g^{'} (x) = - 1/ x^{2}$ .

The limit of the ratio of derivatives: $\lim_{x \to 0^{+}} \frac{f^{'} (x)}{g^{'} (x)} = \lim_{x \to 0^{+}} \frac{1/ x}{- 1/ x^{2}} = \lim_{x \to 0^{+}} (- x) = 0$ . Since this limit exists, by L’Hôpital’s Rule: $\lim_{x \to 0^{+}} x \ln (x) = 0$ .

Clicker Question Revisited: $\lim_{x \to 0} \frac{1 - \cos (2 x)}{1 - \cos (x)}$ . As $x \to 0$ , numerator $\to 1 - \cos (0) = 0$ , and denominator $\to 1 - \cos (0) = 0$ . It’s a " $\frac{0}{0}$ " case.

First application of L’Hôpital: Derivative of numerator: $(1 - \cos (2 x))^{'} = - (- \sin (2 x) \cdot 2) = 2 \sin (2 x)$ . Derivative of denominator: $(1 - \cos (x))^{'} = - (- \sin (x)) = \sin (x)$ . So we consider $\lim_{x \to 0} \frac{2 \sin (2 x)}{\sin (x)}$ . This is still " $\frac{0}{0}$ ".

Second application of L’Hôpital: Derivative of new numerator: $(2 \sin (2 x))^{'} = 2 (\cos (2 x) \cdot 2) = 4 \cos (2 x)$ . Derivative of new denominator: $(\sin (x))^{'} = \cos (x)$ . Now consider the limit: $\lim_{x \to 0} \frac{4 \cos (2 x)}{\cos (x)} = \frac{4 \cos (0)}{\cos (0)} = \frac{4 \cdot 1}{1} = 4$ . Since this limit exists, the limit after the first step is also 4, and therefore so is our original limit.

Alternative for the Clicker (without a second L’Hôpital): After the first L’Hôpital step, we had $\lim_{x \to 0} \frac{2 \sin (2 x)}{\sin (x)}$ . Using the double angle formula $\sin (2 x) = 2 \sin (x) \cos (x)$ : $\lim_{x \to 0} \frac{2 (2 \sin (x) \cos (x))}{\sin (x)} = \lim_{x \to 0} 4 \cos (x) = 4 \cos (0) = 4 \cdot 1 = 4$ . The cancellation is valid because the limit only involves points $x \neq 0$ close to $0$ , where $\sin (x) \neq 0$ .
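The three limits above can be spot-checked by plugging in points close to the limit point. This is only a numerical sketch: the exponent $a = 0.5$ and the evaluation points are arbitrary choices, and evaluating at a few points illustrates a limit without proving it.

```python
import math


def log_over_power(x, a=0.5):
    """ln(x) / x^a, expected to tend to 0 as x -> infinity for a > 0."""
    return math.log(x) / x**a


def x_log_x(x):
    """x * ln(x), expected to tend to 0 as x -> 0+."""
    return x * math.log(x)


def clicker_ratio(x):
    """(1 - cos 2x) / (1 - cos x), expected to tend to 4 as x -> 0."""
    # Avoid extremely small x here: 1 - cos(x) then rounds to 0 in floating point.
    return (1 - math.cos(2 * x)) / (1 - math.cos(x))


print(log_over_power(1e12))   # close to 0
print(x_log_x(1e-12))         # close to 0 (slightly negative)
print(clicker_ratio(1e-3))    # close to 4
```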

A Word of Caution: When L’Hôpital’s Rule Fails or Misleads

Consider the limit $\lim_{x \to \infty} \frac{2 x + \sin (x)}{2 x + \cos (x)}$ . This is an " $\frac{\infty}{\infty}$ " form.

If we try L’Hôpital’s Rule: $f^{'} (x) = 2 + \cos (x)$ and $g^{'} (x) = 2 - \sin (x)$ .

The limit $\lim_{x \to \infty} \frac{f^{'} (x)}{g^{'} (x)} = \lim_{x \to \infty} \frac{2 + \cos (x)}{2 - \sin (x)}$ does not exist because $\cos (x)$ and $\sin (x)$ keep oscillating: the ratio equals $\frac{3}{2}$ at $x = 2 π k$ and $\frac{1}{2}$ at $x = π + 2 π k$ for every integer $k$ .

Crucial Point: If $lim \frac{f ^{'} ( x )}{g ^{'} ( x )}$ does not exist, L’Hôpital’s Rule tells us nothing about the original limit $lim \frac{f ( x )}{g ( x )}$ .

In this specific case, we can find the original limit directly: $\lim_{x \to \infty} \frac{2 x + \sin (x)}{2 x + \cos (x)} = \lim_{x \to \infty} \frac{x (2 + \frac{\sin (x)}{x})}{x (2 + \frac{\cos (x)}{x})} = \lim_{x \to \infty} \frac{2 + \frac{\sin (x)}{x}}{2 + \frac{\cos (x)}{x}}$ . Since $∣ \sin (x) ∣ \leq 1$ and $∣ \cos (x) ∣ \leq 1$ , both $\frac{\sin (x)}{x}$ and $\frac{\cos (x)}{x}$ are bounded in absolute value by $\frac{1}{x}$ , so $\lim_{x \to \infty} \frac{\sin (x)}{x} = 0$ and $\lim_{x \to \infty} \frac{\cos (x)}{x} = 0$ . So the limit is $\frac{2 + 0}{2 + 0} = 1$ .

L’Hôpital’s Rule was not helpful here because its condition (existence of the limit of derivative ratios) was not met.

Rule of Thumb: Only apply L’Hôpital’s Rule if the limit $lim \frac{f ^{'} ( x )}{g ^{'} ( x )}$ actually exists (or is $\pm \infty$ ). Otherwise, you might be led astray or conclude nothing.
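Here is a short numerical sketch of the cautionary example: the original ratio settles near $1$ , while the ratio of derivatives keeps returning to two different values no matter how far out we go. The specific evaluation points are arbitrary choices.

```python
import math


def original_ratio(x):
    return (2 * x + math.sin(x)) / (2 * x + math.cos(x))


def derivative_ratio(x):
    return (2 + math.cos(x)) / (2 - math.sin(x))


k = 1000
at_even = derivative_ratio(2 * math.pi * k)           # cos = 1,  sin = 0: ratio 3/2
at_odd = derivative_ratio(2 * math.pi * k + math.pi)  # cos = -1, sin = 0: ratio 1/2

print(original_ratio(1e8))   # close to 1
print(at_even, at_odd)       # two different values, so f'/g' has no limit
```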

Convex Functions

For the remainder of this chapter, we’ll assume $I \subseteq R$ is an interval containing more than one point.

Definition: Convexity

A function $f : I \to R$ is said to be:

Convex on $I$ if for any two points $x, y \in I$ and any $λ \in [0, 1]$ (which parameterizes the segment between $x$ and $y$ ): $f (λ x + (1 - λ) y) \leq λ f (x) + (1 - λ) f (y)$
Strictly convex on $I$ if for any two distinct points $x, y \in I$ ( $x \neq y$ ) and any $λ \in (0, 1)$ (so the point lies strictly between $x$ and $y$ ): $f (λ x + (1 - λ) y) < λ f (x) + (1 - λ) f (y)$

Geometric Intuition: The point $λ x + (1 - λ) y$ is a point on the line segment connecting $x$ and $y$ on the x-axis. The value $f (λ x + (1 - λ) y)$ is the function’s value at this point.

The expression $λ f (x) + (1 - λ) f (y)$ represents the y-value on the straight line segment (the “secant line”) connecting the points $(x, f (x))$ and $(y, f (y))$ on the graph of $f$ .

So, convexity means that between any two points of its graph, the graph of $f$ lies on or below the secant line connecting them.

A Classic Example: $f (x) = ∣ x ∣$ defined on $R$ .

This function is convex. By the triangle inequality, for any $x, y \in R$ and $λ \in [0, 1]$ : $∣ λ x + (1 - λ) y ∣ \leq ∣ λ x ∣ + ∣ (1 - λ) y ∣$ .

Since $λ \geq 0$ and $1 - λ \geq 0$ , this becomes $λ ∣ x ∣ + (1 - λ) ∣ y ∣$ . So, $f (λ x + (1 - λ) y) \leq λ f (x) + (1 - λ) f (y)$ .

However, $∣ x ∣$ is not strictly convex. The strict inequality does hold for some pairs: with $x = - 1, y = 1, λ = 1/2$ we get $f (0) = 0$ and $\frac{1}{2} f (- 1) + \frac{1}{2} f (1) = \frac{1}{2} (1) + \frac{1}{2} (1) = 1$ , so $0 < 1$ . But with $x = 1, y = 2, λ = 1/2$ we get $f (1.5) = 1.5$ and $\frac{1}{2} f (1) + \frac{1}{2} f (2) = 0.5 + 1 = 1.5$ , so equality holds for two distinct points. In general, if $x$ and $y$ have the same sign, the graph between them is a straight segment and coincides with the secant line.
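The definition translates directly into a "gap" that must be non-negative for a convex function. The sketch below (the helper name `convexity_gap` and the random sampling are assumptions of this example) tests $∣ x ∣$ on random inputs and reproduces the two cases discussed above.

```python
import random


def convexity_gap(f, x, y, lam):
    """lam*f(x) + (1-lam)*f(y) - f(lam*x + (1-lam)*y); non-negative when f is convex."""
    return lam * f(x) + (1 - lam) * f(y) - f(lam * x + (1 - lam) * y)


random.seed(0)
gaps = [
    convexity_gap(abs, random.uniform(-5, 5), random.uniform(-5, 5), random.random())
    for _ in range(10_000)
]
smallest_gap = min(gaps)  # never meaningfully below 0 (up to rounding)

gap_across_kink = convexity_gap(abs, -1.0, 1.0, 0.5)      # 1 - 0 = 1: strict inequality
gap_on_linear_piece = convexity_gap(abs, 1.0, 2.0, 0.5)   # 1.5 - 1.5 = 0: equality

print(smallest_gap, gap_across_kink, gap_on_linear_piece)
```

A zero gap for two distinct points with $λ \in (0, 1)$ is exactly what rules out strict convexity.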

Remark: Jensen’s Inequality (Generalized Convexity)

If $f : I \to R$ is convex, then for any $n \geq 1$ points $x_{1}, \dots, x_{n} \in I$ and any non-negative weights $λ_{1}, \dots, λ_{n}$ that sum to $1$ ( $\sum_{i = 1}^{n} λ_{i} = 1$ ), Jensen’s inequality holds: $f (\sum_{i = 1}^{n} λ_{i} x_{i}) \leq \sum_{i = 1}^{n} λ_{i} f (x_{i})$ . The definition we gave is just the $n = 2$ case.
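Jensen’s inequality can be tried out on random data. In this sketch the convex function is the exponential, and the points and weights are randomly generated; all of these are illustrative assumptions.

```python
import math
import random


def jensen_gap(f, points, weights):
    """sum(w_i * f(x_i)) - f(sum(w_i * x_i)); non-negative when f is convex."""
    mean_point = sum(w * x for w, x in zip(weights, points))
    mean_value = sum(w * f(x) for w, x in zip(weights, points))
    return mean_value - f(mean_point)


random.seed(1)
points = [random.uniform(-2, 2) for _ in range(6)]
raw = [random.random() for _ in range(6)]
weights = [r / sum(raw) for r in raw]   # non-negative weights that sum to 1

gap = jensen_gap(math.exp, points, weights)
print(sum(weights), gap)
```

With two points and weights $λ$ and $1 - λ$ , `jensen_gap` reduces to the gap in the definition of convexity.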

Lemma: Convexity and Slopes of Secant Lines

A function $f : I \to R$ is convex if and only if for any three points $x_{0} < x < x_{1}$ in $I$ , the slope of the secant line from $(x_{0}, f (x_{0}))$ to $(x, f (x))$ is less than or equal to the slope of the secant line from $(x, f (x))$ to $(x_{1}, f (x_{1}))$ : $\frac{f (x) - f (x_{0})}{x - x_{0}} \leq \frac{f (x_{1}) - f (x)}{x_{1} - x} \quad (*)$ . Likewise, $f$ is strictly convex if and only if the inequality in $(*)$ is strict ( $<$ ) for all such triples.

Why this matters: This condition means that as you move from left to right, the slopes of the secant lines are non-decreasing. This is a key characteristic.
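The slope condition $(*)$ is simple to evaluate for concrete triples. In the sketch below the helper names and the sample triples are assumptions of this example; the exponential passes the test, while sine on $[0, π]$ fails it, as expected for a function that is not convex there.

```python
import math


def secant_slope(f, a, b):
    return (f(b) - f(a)) / (b - a)


def satisfies_slope_condition(f, x0, x, x1):
    """Check the lemma's inequality (*) for one triple x0 < x < x1."""
    return secant_slope(f, x0, x) <= secant_slope(f, x, x1)


triples = [(-2.0, -1.0, 0.5), (-1.0, 0.0, 1.0), (0.0, 0.1, 3.0)]
exp_ok = all(satisfies_slope_condition(math.exp, *t) for t in triples)

# Left slope is 2/pi, right slope is -2/pi, so (*) is violated.
sin_ok = satisfies_slope_condition(math.sin, 0.0, math.pi / 2, math.pi)

print(exp_ok, sin_ok)
```

Passing on a handful of triples does not prove convexity, but a single failing triple does disprove it.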

Why are convex functions important? They appear everywhere, especially in optimization. If you’re trying to minimize a convex function, any local minimum you find is guaranteed to be a global minimum. This makes finding the “best” solution much easier, and many efficient algorithms rely on this property.

Quick Exercise: If $f : I \to R$ is convex, prove that any local minimum of $f$ is also a global minimum on $I$ .

How Can We Tell if a Function is Convex? (Using Derivatives)

If our function is differentiable, the derivative gives us a powerful way to check for convexity.

Theorem: Convexity and the First Derivative

Let $f : I \to R$ be a differentiable function. Then:

$f$ is convex on $I ⟺ f^{'}$ (its derivative) is monotonically increasing on $I$ .
$f$ is strictly convex on $I ⟺ f^{'}$ is strictly monotonically increasing on $I$ .

Proof Sketch (for $f^{'}$ increasing $⟹ f$ convex)

We want to use the slope condition from the previous Lemma. Take any $x_{0} < x < x_{1}$ in $I$ . We need to show $\frac{f ( x ) - f ( x _{0} )}{x - x _{0}} \leq \frac{f ( x _{1} ) - f ( x )}{x _{1} - x}$ .

By the Mean Value Theorem, there’s an $η \in (x_{0}, x)$ such that $\frac{f ( x ) - f ( x _{0} )}{x - x _{0}} = f^{'} (η)$ . And there’s a $ξ \in (x, x_{1})$ such that $\frac{f ( x _{1} ) - f ( x )}{x _{1} - x} = f^{'} (ξ)$ .

Since $x_{0} < η < x < ξ < x_{1}$ , we have $η < ξ$ . If $f^{'}$ is monotonically increasing, then $f^{'} (η) \leq f^{'} (ξ)$ . This directly gives us $\frac{f ( x ) - f ( x _{0} )}{x - x _{0}} \leq \frac{f ( x _{1} ) - f ( x )}{x _{1} - x}$ , which by the Lemma means $f$ is convex.

Corollary: Convexity and the Second Derivative

Now, if $f$ is twice differentiable on $I$ (meaning $f^{'}$ exists and is itself differentiable, giving $f^{''}$ ):

If $f^{''} (x) \geq 0$ for all $x \in I$ , then $f$ is convex on $I$ .
If $f^{''} (x) > 0$ for all $x \in I$ , then $f$ is strictly convex on $I$ .

Proof

If $f^{''} (x) \geq 0$ for all $x \in I$ , then the derivative of $f^{'}$ (which is $f^{''}$ ) is non-negative. By the consequences of the MVT (Corollary 4.2.5, part 3), this means $f^{'}$ is monotonically increasing.

Then, by the theorem we just proved (Theorem 4.2.16), if $f^{'}$ is monotonically increasing, $f$ is convex. The argument for strict convexity is similar.
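One thing worth noticing: the strict part of the corollary is a sufficient condition only. A standard example (not from the lecture) is $f (x) = x^{4}$ : here $f^{''} (0) = 0$ , so the corollary’s strict criterion does not apply, yet $f^{'} (x) = 4 x^{3}$ is strictly increasing, so the first-derivative theorem still gives strict convexity. A minimal sketch on a grid:

```python
def f(x):
    return x**4


def f_prime(x):
    return 4 * x**3


def f_second(x):
    return 12 * x**2


grid = [i / 10 for i in range(-20, 21)]   # sample points in [-2, 2]

# f' is strictly increasing along the grid, including across x = 0.
derivative_strictly_increasing = all(
    f_prime(a) < f_prime(b) for a, b in zip(grid, grid[1:])
)
second_derivative_at_zero = f_second(0.0)   # 0, so "f'' > 0 everywhere" fails


def convexity_gap(x, y, lam):
    return lam * f(x) + (1 - lam) * f(y) - f(lam * x + (1 - lam) * y)


gap_around_zero = convexity_gap(-0.5, 0.5, 0.5)   # 0.0625 - 0 > 0: strict inequality

print(derivative_strictly_increasing, second_derivative_at_zero, gap_around_zero)
```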

Example: Is $f (x) = - ln (x)$ convex on $(0, \infty)$ ?

Let’s check its second derivative: $f (x) = - \ln (x)$ , $f^{'} (x) = - \frac{1}{x}$ , $f^{''} (x) = - (- 1) x^{- 2} = \frac{1}{x^{2}}$ .

Since $x \in (0, \infty)$ , $x^{2}$ is always positive, so $f^{''} (x) = \frac{1}{x ^{2}} > 0$ for all $x \in (0, \infty)$ .

Therefore, $f (x) = - ln (x)$ is strictly convex on $(0, \infty)$ .
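As a numerical cross-check, a second-order finite difference of $- \ln (x)$ can be compared with $1/ x^{2}$ . The step size, sample points and tolerance in this sketch are arbitrary choices.

```python
import math


def f(x):
    return -math.log(x)


def second_difference(func, x, h=1e-4):
    """Symmetric finite-difference approximation of func''(x)."""
    return (func(x + h) - 2 * func(x) + func(x - h)) / (h * h)


samples = (0.1, 0.5, 1.0, 3.0, 10.0)
for x in samples:
    # Numerical f''(x) next to the closed form 1/x^2; both are positive.
    print(x, second_difference(f, x), 1 / x**2)

all_positive = all(second_difference(f, x) > 0 for x in samples)
```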

Continue here: 19 Higher Order Derivatives, Smooth Functions, Power Series and Taylor Approximation