Little-oh, the derivative, Taylor's formula

Derivatives and little-oh

Real-valued functions on the real line

Definition: A real-valued function \(\varphi\) defined for arbitrarily small values of \(h\) is \(o(h)\) for \(h \to 0\) iff

$$ \lim_{h \to 0} \frac{\varphi(h)}{h} = 0 $$

(Lang, 1997, p. 67-68)
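The definition can be illustrated numerically. The sketch below is only an illustration, not part of Lang's text; the helper names `ratio`, `square`, and `linear` are hypothetical. It tabulates \(\varphi(h)/h\) for \(\varphi(h) = h^2\), which is \(o(h)\), and for \(\varphi(h) = 3h\), which is not.

```python
def ratio(phi, h):
    """Return phi(h) / h, the quantity whose limit defines little-oh."""
    return phi(h) / h


def square(h):
    # phi(h) = h^2 is o(h): the ratio equals h, which tends to 0.
    return h * h


def linear(h):
    # phi(h) = 3h is not o(h): the ratio stays at 3.
    return 3.0 * h


# Step sizes shrinking toward 0 (illustrative choice).
steps = [10.0 ** (-k) for k in range(1, 7)]
square_ratios = [ratio(square, h) for h in steps]
linear_ratios = [ratio(linear, h) for h in steps]

for h, rs, rl in zip(steps, square_ratios, linear_ratios):
    print(f"h={h:.0e}  h^2/h={rs:.1e}  3h/h={rl:.1f}")
```

The first column of ratios shrinks with \(h\), while the second stays near 3, so only the first function satisfies the definition.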

Result: A real-valued function \(f\) is differentiable at \(x\) if and only if there exists some number \(L\) and a function \(\varphi\) which is \(o(h)\) for \(h \to 0\) such that for all \(h\) in some neighborhood of \(0\),

$$ f(x + h) = f(x) + Lh + \varphi(h) \qquad (1) $$

(Lang, 1997, p. 67-68)

Proof:

The following proof is from Lang (1997, p. 67-68).

\((\Longrightarrow)\)

Assume \(f\) is differentiable at \(x\). Let

$$ \varphi(h) = \begin{cases} f(x + h) - f(x) - f^\prime(x) h, & h \neq 0 \\ 0, & h = 0 \end{cases} $$

Observe that

$$ \lim_{h \to 0} \frac{\varphi(h)}{h} = \lim_{h \to 0} \left( \frac{f(x + h) - f(x)}{h} - f^\prime(x) \right) = 0 $$

so \(\varphi\) is \(o(h)\) as \(h \to 0\). For \(h \neq 0\), \(f(x+h) = f(x) + Lh + \varphi(h)\) where we have let \(L = f^\prime(x)\). For \(h = 0\), the left-hand side of \((1)\) is \(f(x)\) and the right-hand side is \(f(x) + \varphi(0) = f(x)\).

\((\Longleftarrow)\)

Assume there exist \(L \in \mathbb{R}\) and \(\varphi\) that is \(o(h)\) for \(h \to 0\) such that \(f(x + h) = f(x) + Lh + \varphi(h)\) for all \(h\) in some neighborhood of \(0\). For \(h \neq 0\), this implies

$$ \frac{f(x + h) - f(x)}{h} = L + \frac{\varphi(h)}{h} $$

The limit of the right-hand side as \(h \to 0\) exists and is \(L + 0 = L\), so the limit of the left-hand side exists and also must equal \(L\) since limits are unique. Taking the limit gives \(f^\prime(x) = L\), so \(f\) is differentiable at \(x\).
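As a numerical sanity check of the result (a sketch only, with an illustrative choice of function and point that is not from Lang), take \(f = \sin\) and \(x = 1\), so that \(L = f^\prime(x) = \cos(1)\). The hypothetical helper `remainder` computes \(\varphi(h) = f(x + h) - f(x) - Lh\), and the ratios \(\varphi(h)/h\) should shrink as \(h \to 0\).

```python
import math


def remainder(f, slope, x, h):
    """phi(h) = f(x + h) - f(x) - slope * h, the error of the linear approximation."""
    return f(x + h) - f(x) - slope * h


x = 1.0
L = math.cos(x)  # derivative of sin at x

steps = [10.0 ** (-k) for k in range(1, 6)]
ratios = [remainder(math.sin, L, x, h) / h for h in steps]

for h, r in zip(steps, ratios):
    print(f"h={h:.0e}  phi(h)/h={r:.3e}")
```

Each tenfold reduction in \(h\) reduces the ratio by roughly a factor of ten, which is consistent with \(\varphi\) being \(o(h)\).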

Note on proof of the chain rule for real-valued functions on the real line

The following discussion is from Lang (1997, p. 68):

Note that defining

$$ \psi(h) := \begin{cases} \frac{\varphi(h)}{h}, & h \neq 0 \\ 0, & h = 0 \end{cases} $$

we have that

$$ \psi(h) h = \begin{cases} \varphi(h), & h \neq 0 \\ 0, & h = 0 \end{cases} $$

If we simply define \(\varphi(0) = 0\), we then can just write \(\psi(h) h = \varphi(h)\).

Note that \(\lim_{h \to 0} \psi(h) = 0\).

This quantity is used in the proof of the chain rule in Lang (1997, p. 68).

Real-valued functions on \(\mathbb{R}^n\)

Definition: A real-valued function \(\varphi\) defined for all sufficiently small vectors \(h \in \mathbb{R}^n\), \(h \neq 0\) is said to be \(o(h)\) for \(h \to 0\) if and only if

$$ \lim_{h \to 0} \frac{\varphi(h)}{\norm{h}} = 0 $$

(Lang, 1997, p. 379)

Similar to above, if we define \(\varphi(0) = 0\), we have \(\psi(h) \norm{h} = \varphi(h)\) where \(\lim_{h \to 0} \psi(h) = 0\) (Lang, 1997, p. 379).

Similar to the above, for \(U \subseteq \mathbb{R}^n\), we say \(f: U \to \mathbb{R}\) is differentiable at \(x\) if there exist \(A \in \mathbb{R}^n\) and a function \(\varphi\) that is \(o(h)\) for \(h \to 0\) such that

$$ f(x + h) = f(x) + A \cdot h + \varphi(h) $$

or

$$ f(x + h) = f(x) + A \cdot h + o(h) \qquad (h \to 0) $$

\(A\) is the derivative of \(f\) (the gradient in this case) (Lang, 1997, p. 380).
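The same check can be sketched in \(\mathbb{R}^2\). The function \(f(x_1, x_2) = x_1^2 + 3 x_1 x_2\), the point \((1, 2)\), and the direction \((0.6, 0.8)\) are illustrative assumptions, and the helpers `dot` and `norm` are hypothetical names. With \(A = \nabla f(x)\), the ratio \(\varphi(h)/\norm{h}\) should tend to \(0\).

```python
import math


def f(x):
    return x[0] ** 2 + 3.0 * x[0] * x[1]


def grad_f(x):
    # Gradient computed by hand for this particular f.
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def norm(v):
    return math.sqrt(dot(v, v))


x = [1.0, 2.0]
A = grad_f(x)
direction = [0.6, 0.8]  # unit vector, so norm(h) equals the scale s

ratios = []
for k in range(1, 6):
    s = 10.0 ** (-k)
    h = [s * d for d in direction]
    x_plus_h = [xi + hi for xi, hi in zip(x, h)]
    phi = f(x_plus_h) - f(x) - dot(A, h)  # remainder after the linear term
    ratios.append(phi / norm(h))

print(A, ratios)
```

For this \(f\) the remainder is exactly \(h_1^2 + 3 h_1 h_2\), so along the chosen direction the ratio equals \(1.8\,\norm{h}\) up to rounding.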

Functions whose domain and codomain are normed vector spaces

Definition: Let \(U\) be open in \(E\) and let \(x \in U\). Let \(f: U \to F\) be a map. We say \(f\) is differentiable at \(x\) iff there exists a continuous linear map \(\lambda: E \to F\) and a map \(\psi\) defined for all sufficiently small \(h\) in \(E\), with values in \(F\), such that \(\lim_{h \to 0} \psi(h) = 0\) and such that

$$ f(x + h) = f(x) + \lambda(h) + \norm{h} \psi(h) \qquad (2) $$

(Lang, 1997, p. 463)

As observed in Lang (1997, p. 463), for \(h = 0\), assuming that \(\psi\) is defined at \(0\) and that \(\psi(0) = 0\) contradicts nothing we have said so far.

Similar to the above, defining a map \(\varphi\) by \(\varphi(h) = \norm{h} \psi(h)\) for all sufficiently small \(h \in E\), for which

$$ \lim_{h \to 0} \frac{\varphi(h)}{\norm{h}} = 0 $$

we could write \((2)\) as \(f(x + h) = f(x) + \lambda(h) + \varphi(h)\) or simply

$$ f(x + h) = f(x) + \lambda(h) + o(h) \qquad (h \to 0) $$

(Lang, 1997, p. 463-464)

It can be shown (Lang, 1997, p. 463-464) that if the continuous linear map \(\lambda\) exists satisfying \((2)\), then it is uniquely determined by \(f\) and \(x\). This map is called the derivative of \(f\) at \(x\) and is denoted by \(f^\prime(x)\) or \(Df(x)\).
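As a worked illustration (an example chosen here, not taken from Lang), assume \(E\) is a real inner-product space whose norm satisfies \(\norm{x}^2 = \inner{x}{x}\), let \(F = \mathbb{R}\), and let \(f(x) = \norm{x}^2\). Expanding the inner product gives

$$ f(x + h) = \norm{x}^2 + 2 \inner{x}{h} + \norm{h}^2 = f(x) + \lambda(h) + \norm{h} \psi(h) $$

with \(\lambda(h) = 2 \inner{x}{h}\) and \(\psi(h) = \norm{h}\). The map \(\lambda\) is linear, and it is continuous because \(|\lambda(h)| \leq 2 \norm{x} \norm{h}\) by the Cauchy-Schwarz inequality. Since \(\lim_{h \to 0} \psi(h) = 0\), the definition is satisfied and \(Df(x)h = 2 \inner{x}{h}\).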

Taylor formula results

Let \(L(E, F)\) be the space of continuous linear maps from \(E\) into \(F\). It is a vector space (Lang, 1997, p. 456).

Definition: If \(f\) is differentiable at every point \(x\) of an open set \(U\) of \(E\), then we say \(f\) is differentiable on \(U\) and in that case, the derivative is a map from \(U\) to \(L(E, F)\) (Lang, 1997, p. 465).

Recall that the second derivative, if it exists, is a function from \(U\) into \(L(E, L(E, F))\) (Lang, 1997, p. 477). Similarly, the \(k\)th derivative, \(D^k\), defined by \(D^k f(x) = D(D^{k-1} f)(x)\) is a function from \(U\) into \(L(E, L(E, \ldots, L(E, F) \ldots))\) (Lang, 1997, p. 487-488).

Definition: Let \(E\), \(F\) be normed vector spaces, and let \(U \subseteq E\). For \(f: U \to F\), we say that \(f\) is of class \(C^p\) iff \(D^k f(x)\) exists for each \(x \in U\) and \(D^k f: U \to L^k(E, F)\) is continuous for each \(k = 0, \ldots, p\) (Lang, 1997, p. 487).

Theorem (Taylor's formula): Let \(U\) be open in \(E\) and let \(f: U \to F\) be of class \(C^p\). Let \(x \in U\) and \(y \in E\) such that the segment \(x + ty\), \(0 \leq t \leq 1\), is contained in \(U\). Denote by \(y^{(k)}\) the \(k\)-tuple \((y, y, \ldots, y)\). Then

$$ f(x + y) = f(x) + \frac{Df(x)y}{1!} + \cdots + \frac{D^{p-1}f(x)y^{(p-1)}}{(p-1)!} + R_p $$

where

$$ R_p = \int_0^1 \frac{(1-t)^{p-1}}{(p-1)!} D^p f(x + ty) y^{(p)} dt $$

(Lang, 1997, p. 490).
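The integral form of the remainder can be checked numerically in the simplest setting. The sketch below assumes \(E = F = \mathbb{R}\), so that \(D^k f(x) y^{(k)}\) reduces to \(f^{(k)}(x) y^k\), and uses \(f = \exp\), \(x = 0\), \(y = 1\), \(p = 3\) as illustrative choices. The hypothetical helper `simpson` is a plain composite Simpson rule, used here only to approximate \(R_p\).

```python
import math


def simpson(g, a, b, n=1000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    step = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * step)
    return total * step / 3.0


p = 3
x, y = 0.0, 1.0

# Every derivative of exp is exp, so D^k f(x) y^(k) = exp(x) * y**k.
taylor_sum = sum(math.exp(x) * y ** k / math.factorial(k) for k in range(p))


def integrand(t):
    weight = (1.0 - t) ** (p - 1) / math.factorial(p - 1)
    return weight * math.exp(x + t * y) * y ** p


R_p = simpson(integrand, 0.0, 1.0)

print(taylor_sum + R_p, math.exp(x + y))
```

The two printed values should agree up to the quadrature error, which reflects that the polynomial part plus \(R_p\) reproduces \(f(x + y)\) exactly rather than approximately.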

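Under the same assumptions, with \(f\) real-valued (\(F = \mathbb{R}\)) and the segment between \(x\) and \(y\) contained in \(U\), there exists \(t \in (0, 1)\) such that, letting \(z = x + t(y - x)\),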

$$ \begin{align} f(y) & = f(x) + \frac{Df(x)(y-x)}{1!} + \cdots + \frac{D^{p-1}f(x)(y-x)^{(p-1)}}{(p-1)!} \notag\\ & \quad\quad + \frac{1}{p!} D^p f(z) (y-x)^{(p)} \notag \end{align} $$

(Güler, 2010, p. 15)
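As a one-dimensional illustration (an example chosen here, not taken from Güler), take \(E = F = \mathbb{R}\), \(f(s) = s^3\), \(p = 2\), \(x = 0\), and \(y = 1\). Then \(f(x) = 0\), \(Df(x)(y - x) = 0\), and \(D^2 f(z)(y - x)^{(2)} = 6z\), so the formula reads

$$ 1 = 0 + 0 + \frac{1}{2!} \cdot 6z $$

which gives \(z = 1/3\), a point strictly between \(x\) and \(y\).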

Making use of the above two forms of Taylor's theorem, we have the following specific cases:

For \(U \subseteq \mathbb{R}^n\), \(f: U \to \mathbb{R}\), if \(f\) is differentiable, for all \(x, y \in U\) there exists \(z_1\) on the line segment between \(x\) and \(y\) such that

\(f(y) = f(x) + \inner{\nabla f(z_1)}{y - x}\)

and also

\(f(y) = f(x) + \inner{\nabla f(x)}{y - x} + \varphi(y - x)\)

where \(\varphi(y - x)\) is \(o(y - x)\) as \(y \to x\).

If \(f\) has continuous 2nd-order partial derivatives, for all \(x, y \in U\) there exists \(z_2\) on the line segment between \(x\) and \(y\) such that

\(f(y) = f(x) + \inner{\nabla f(x)}{y - x} + \frac{1}{2} (y - x)^T H f(z_2) (y - x)\),

and

\(f(y) = f(x) + \inner{\nabla f(x)}{y - x} + \frac{1}{2} (y - x)^T H f(x) (y - x) + \varphi(y - x)\)

where \(\lim_{y \to x} (\varphi(y - x) / \norm{y - x}^2) = 0\).

(Güler, 2010, p. 16)
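The second-order expansion can be checked the same way. The sketch below assumes the illustrative function \(f(x_1, x_2) = x_1^3 + x_1 x_2^2\) at \(x = (1, 2)\), with gradient and Hessian computed by hand; the helpers `dot`, `quad_form`, and `remainder` are hypothetical names. The ratio of the remainder to \(\norm{y - x}^2\) should tend to \(0\).

```python
def f(x):
    return x[0] ** 3 + x[0] * x[1] ** 2


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def quad_form(H, h):
    """Return h^T H h for a matrix H given as a list of rows."""
    return dot(h, [dot(row, h) for row in H])


x = [1.0, 2.0]
grad = [3.0 * x[0] ** 2 + x[1] ** 2, 2.0 * x[0] * x[1]]        # (7, 4)
hess = [[6.0 * x[0], 2.0 * x[1]], [2.0 * x[1], 2.0 * x[0]]]    # [[6, 4], [4, 2]]
direction = [0.6, 0.8]  # unit vector, so the norm of h equals s


def remainder(s):
    h = [s * d for d in direction]
    y = [xi + hi for xi, hi in zip(x, h)]
    return f(y) - f(x) - dot(grad, h) - 0.5 * quad_form(hess, h)


# Only moderately small steps are used: for very small s the remainder
# falls below floating-point rounding error and the ratio becomes noise.
scales = [10.0 ** (-k) for k in range(1, 4)]
ratios = [remainder(s) / s ** 2 for s in scales]

print(ratios)
```

For this \(f\) the remainder is exactly \(h_1^3 + h_1 h_2^2\), so along the chosen direction the ratio equals \(0.6\,\norm{h}\) up to rounding.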

One convention is to describe the \(\varphi\) in the two expressions above as follows: \(\varphi\) is \(o(\norm{y - x})\) as \(y \to x\) in the first-order expansion, and \(\varphi\) is \(o(\norm{y - x}^2)\) as \(y \to x\) in the second-order expansion. Note that this requires an extra step of interpretation: for example, the second statement really means \(\lim_{y \to x} \varphi(y - x) / \norm{y - x}^2 = 0\), as opposed to \(\lim_{y \to x} \varphi(\norm{y - x}^2) / \norm{y - x}^2 = 0\), which is what we would have if we applied our usual definition of little-oh literally. Thus care must be taken when using this convention.

Little-oh of a sequence

Recall the fundamental result: for \(f: E \to F\), with \(a \in E\), \(L \in F\),

$$ \left(\forall (t_n) \quad \left[\lim_{n \to \infty} t_n = a \implies \lim_{n \to \infty} f(t_n) = L\right]\right) \iff \left(\lim_{t \to a} f(t) = L\right) $$

For a sequence \((x_n)\) and function \(\varphi\) we say \(\varphi(x_n)\) is \(o(x_n)\) as \(n \to \infty\) iff

$$ \lim_{n \to \infty} \frac{\varphi(x_n)}{\norm{x_n}} = 0 $$

Thus for a \(\varphi\) that is \(o(h)\) as \(h \to 0\) and a sequence \((x_n)\) with \(x_n \neq 0\) and \(x_n \to 0\) as \(n \to \infty\), applying the result above to \(h \mapsto \varphi(h) / \norm{h}\) shows that \(\varphi(x_n)\) is \(o(x_n)\) as \(n \to \infty\).
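For instance (an illustrative choice, not from the cited texts), with \(\varphi(h) = h^2\) on \(\mathbb{R}\) and \(x_n = 1/n\),

$$ \lim_{n \to \infty} \frac{\varphi(x_n)}{\norm{x_n}} = \lim_{n \to \infty} \frac{1/n^2}{1/n} = \lim_{n \to \infty} \frac{1}{n} = 0 $$

so \(\varphi(x_n)\) is \(o(x_n)\) as \(n \to \infty\).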

References

Güler, Osman. (2010). Foundations of optimization. Springer Science+Business Media, Inc.

Lang, Serge. (1997). Undergraduate analysis (Second ed.). Springer Science+Business Media, Inc.

How to cite this article

Wayman, Eric Alan. (2025). Little-oh, the derivative, Taylor's formula. Eric Alan Wayman's technical notes. https://ericwayman.net/notes/little-oh-deriv-taylor/

@misc{wayman2025little-oh-deriv-taylor,
  title={Little-oh, the derivative, Taylor's formula},
  author={Wayman, Eric Alan},
  journal={Eric Alan Wayman's technical notes},
  url={https://ericwayman.net/notes/little-oh-deriv-taylor/},
  year={2025}
}

© 2025-2026 Eric Alan Wayman. All rights reserved.