L-smoothness and strong convexity
Prerequisites
Convexity
Definition: Let \(U \subseteq \mathbb{R}^n\). The epigraph of a function \(f: U \to \mathbb{R}\) is a subset of \(\mathbb{R}^{n+1}\) defined as
\[\text{epi}(f) = \{(x, t) \in U \times \mathbb{R} : f(x) \leq t\}.\]
Theorem: \(f\) is a convex function iff its epigraph is a convex set.
Proof: see Ahmadi (2025b).
Notation: Let \(A \succeq 0\) denote \(A\) is positive semidefinite, and \(A \succ 0\) denote \(A\) is positive definite.
Theorem: Let \(f: \mathbb{R}^n \to \mathbb{R}\) be twice-differentiable over an open domain. Then the following are equivalent:
1. \(f\) is convex
2. For all \(x, y \in \text{dom}(f)\), \(f(y) \geq f(x) + \inner{\nabla f(x)}{y - x}\)
3. For all \(x \in \text{dom}(f)\), \(\nabla^2 f(x) \succeq 0\)
(Recall that \(\inner{\nabla f(x)}{y - x}\) is the directional derivative of \(f\) at \(x\) in the direction of \(y - x\).)
(2) means the first-order Taylor expansion of \(f\) at \(x\) lies beneath the function at every point in its domain. (3) means that the function has nonnegative curvature everywhere.
Proof: see Ahmadi (2025c).
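As a quick numerical illustration of condition (2) (this example is mine, not from the cited notes), the Python sketch below checks the first-order inequality for a convex quadratic \(f(x) = \frac{1}{2} x^\top Q x\) with \(Q \succeq 0\) at randomly sampled pairs of points.

```python
import numpy as np

rng = np.random.default_rng(0)

# Convex quadratic f(x) = 0.5 * x^T Q x with Q = A^T A positive semidefinite.
A = rng.standard_normal((3, 3))
Q = A.T @ A

f = lambda x: 0.5 * x @ Q @ x
grad = lambda x: Q @ x

# Check f(y) >= f(x) + <grad f(x), y - x> at random pairs (x, y).
for _ in range(1000):
    x, y = rng.standard_normal(3), rng.standard_normal(3)
    assert f(y) >= f(x) + grad(x) @ (y - x) - 1e-9  # tolerance for rounding
print("first-order convexity inequality held at all sampled pairs")
```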
First and second order conditions
(necessary first-order condition) For a continuously differentiable \(f\), if \(w^\ast\) is a local minimum of \(f\), then \(\nabla f(w^\ast) = 0\).
Recall that there are critical points \(w\) (i.e. points \(w\) for which \(\nabla f(w) = 0\)) that are not local minima or local maxima (for example, \(w = 0\) for \(f(w) = w^3\)).
(second-order necessary condition) For \(f \in C^2\), if \(w^\ast\) is a local minimum of \(f\), then \(\nabla f(w^\ast) = 0\) and \(\nabla^2 f(w^\ast) \succeq 0\).
(second-order sufficient condition) For \(f \in C^2\), if \(\nabla f(w^\ast) = 0\) and \(\nabla^2 f(w^\ast) \succ 0\) then \(w^\ast\) is a local minimum of \(f\).
(Ahmadi, 2025a; Royer, 2022)
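As a small sanity check of the sufficient condition (my own example, not from the cited notes): for \(f(w) = \norm{Aw - b}^2\), the solution of the normal equations is a critical point, and the Hessian \(2 A^\top A\) is positive definite when \(A\) has full column rank, so that point is a local minimum.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))  # full column rank with probability 1
b = rng.standard_normal(5)

# f(w) = ||A w - b||^2, with gradient 2 A^T (A w - b) and constant Hessian 2 A^T A.
grad = lambda w: 2 * A.T @ (A @ w - b)
hess = 2 * A.T @ A

w_star = np.linalg.solve(A.T @ A, A.T @ b)  # critical point (normal equations)

print("gradient norm at w*:", np.linalg.norm(grad(w_star)))            # ~ 0
print("smallest Hessian eigenvalue:", np.linalg.eigvalsh(hess).min())  # > 0
```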
Types of functions
Definition: \(f: \mb{R}^n \to \mb{R}^m\) is \(L\)-Lipschitz continuous with Lipschitz constant \(L > 0\) iff for all \(x, y \in \mb{R}^n\),
\[\norm{f(x) - f(y)} \leq L \norm{x - y}.\]
(Royer, 2024, p. 8)
Note that if \(f\) is \(L\)-Lipschitz continuous, then \(f\) is continuous.
Definition: \(f: \mb{R}^d \to \mb{R}\) is continuously differentiable iff its first-order derivative exists and is continuous. We denote the set of continuously differentiable functions by \(\mc{C}^{1}(\mb{R}^d)\) (Royer, 2024, p. 9).
Definition: \(f: \mb{R}^d \to \mb{R}\) is twice continuously differentiable iff \(f \in \mc{C}^{1}(\mb{R}^d)\) and the second-order derivative of \(f\) exists and is continuous. We denote the set of twice continuously differentiable functions by \(\mc{C}^{2}(\mb{R}^d)\) (Royer, 2024, p. 9).
Definition: Given \(L > 0\), the set \(\mc{C}_{L}^{1,1}(\mb{R}^d)\) represents the set of all functions \(f\) in \(\mc{C}^{1}(\mb{R}^d)\) such that \(\nabla f\) is \(L\)-Lipschitz continuous (Royer, 2024, p. 9).
Definition: Given \(L > 0\), the set \(\mc{C}_{L}^{2,2}(\mb{R}^d)\) represents the set of all functions \(f\) in \(\mc{C}^{2}(\mb{R}^d)\) such that \(\nabla^2 f\) is \(L\)-Lipschitz continuous (Royer, 2024, p. 9).
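To make the class \(\mc{C}_{L}^{1,1}(\mb{R}^d)\) concrete (an example of my own choosing): for \(f(x) = \frac{1}{2} x^\top Q x\) with \(Q\) symmetric, \(\nabla f(x) = Qx\) is Lipschitz with constant \(L = \max_i \abs{\lambda_i(Q)}\), and the sampled ratios below stay at or below that value.

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
Q = (B + B.T) / 2  # symmetric (possibly indefinite)

grad = lambda x: Q @ x
L = np.abs(np.linalg.eigvalsh(Q)).max()  # Lipschitz constant of the gradient

ratios = []
for _ in range(1000):
    x, y = rng.standard_normal(4), rng.standard_normal(4)
    ratios.append(np.linalg.norm(grad(x) - grad(y)) / np.linalg.norm(x - y))

print("largest observed ratio:", max(ratios))
print("L =", L)  # every ratio is <= L
```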
Strong convexity
Definition: \(f: \mb{R}^n \to \mb{R}\) is strongly convex iff there exists \(\alpha > 0\) such that \(g\) defined by \(g(x) = f(x) - \alpha \norm{x}^2\) is convex (Ahmadi, 2025c).
Result: Strong convexity implies convexity (Ahmadi, 2025c).
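A minimal numerical check of this definition, on an example of my own choosing: for \(f(x) = \frac{1}{2} x^\top Q x\) with \(Q \succ 0\), taking \(\alpha = \lambda_{\min}(Q) / 2\) makes \(g(x) = f(x) - \alpha \norm{x}^2\) convex, since its (constant) Hessian \(Q - 2\alpha I\) is positive semidefinite.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
Q = A.T @ A + 0.1 * np.eye(4)  # positive definite

lam_min = np.linalg.eigvalsh(Q).min()
alpha = lam_min / 2  # any 0 < alpha <= lam_min / 2 works

# g(x) = 0.5 * x^T Q x - alpha * ||x||^2 has constant Hessian Q - 2 * alpha * I.
hess_g = Q - 2 * alpha * np.eye(4)
print("alpha =", alpha)
print("smallest eigenvalue of the Hessian of g:", np.linalg.eigvalsh(hess_g).min())  # >= 0
```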
\(L\)-Smoothness
Definition: \(f: \mathbb{R}^n \to \mathbb{R}\) is \(L\)-smooth with respect to a norm \(\norm{\cdot}\) if for all \(x, y \in \text{int dom}(f)\)
\[\norm{\nabla f(y) - \nabla f(x)}_\ast \leq L \norm{y - x},\]
where \(\norm{\cdot}_\ast\) is the dual norm of \(\norm{\cdot}\) (Fawzi, 2023). In what follows, we assume \(\norm{\cdot}\) is the Euclidean norm, so \(\norm{\cdot}_\ast = \norm{\cdot}\).
Using this restricted definition, observe that \(f\) is \(L\)-smooth \(\iff f \in \mc{C}_{L}^{1,1}(\mb{R}^d)\).
Lemmas for \(L\)-smooth functions
Descent lemma
Descent lemma: If \(f\) is \(L\)-smooth, then for any \(x \in \text{int dom}(f)\) and \(y \in \text{dom}(f)\),
\[f(y) \leq f(x) + \inner{\nabla f(x)}{y - x} + \frac{L}{2} \norm{y - x}^2.\]
(Fawzi, 2023)
Proof:
This proof is directly from Fawzi (2023) with minor clarifications added.
Let \(h = y - x\) and \(\phi(t) = f(x + th) - (f(x) + t \inner{\nabla f(x)}{h})\). Since \(f\) is differentiable, \(\phi\) is differentiable.
Using the remainder version of Taylor's theorem (equivalently, applying the fundamental theorem of calculus to \(\phi\), whose derivative is \(\phi^\prime(t) = \inner{\nabla f(x + th) - \nabla f(x)}{h}\)),
\[f(y) - f(x) - \inner{\nabla f(x)}{y - x} = \phi(1) - \phi(0) = \int_0^1 \inner{\nabla f(x + th) - \nabla f(x)}{h} \, dt \leq \int_0^1 \norm{\nabla f(x + th) - \nabla f(x)} \norm{h} \, dt \leq \int_0^1 L t \norm{h}^2 \, dt = \frac{L}{2} \norm{y - x}^2,\]
i.e. \(f(y) \leq f(x) + \inner{\nabla f(x)}{y - x} + \frac{L}{2} \norm{y - x}^2\),
which is what we wanted to prove.
An application of the Lemma (this is my filling in of the parts not shown of Remark 1 from Fawzi, 2023):
Let \(y = x - \frac{1}{L} \nabla f(x)\). Then
\[f(y) \leq f(x) + \inner{\nabla f(x)}{-\tfrac{1}{L} \nabla f(x)} + \frac{L}{2} \norm{\tfrac{1}{L} \nabla f(x)}^2 = f(x) - \frac{1}{L} \norm{\nabla f(x)}^2 + \frac{1}{2L} \norm{\nabla f(x)}^2 = f(x) - \frac{1}{2L} \norm{\nabla f(x)}^2.\]
This shows that when the gradient method (i.e. iterating with \(x_{k+1} = x_k - t_k \nabla f(x_k)\)) is applied to an \(L\)-smooth function with step size \(t_k = 1/L\), each step decreases the function value by at least \(\frac{1}{2L} \norm{\nabla f(x_k)}^2\) (Fawzi, 2023).
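The sketch below (my own illustration, using a least-squares objective of my choosing) runs the gradient method with step size \(1/L\) and checks the per-step guarantee \(f(x_{k+1}) \leq f(x_k) - \frac{1}{2L} \norm{\nabla f(x_k)}^2\) derived above.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 4))
b = rng.standard_normal(6)

# f(x) = 0.5 * ||A x - b||^2 is L-smooth with L = largest eigenvalue of A^T A.
f = lambda x: 0.5 * np.linalg.norm(A @ x - b) ** 2
grad = lambda x: A.T @ (A @ x - b)
L = np.linalg.eigvalsh(A.T @ A).max()

x = rng.standard_normal(4)
for _ in range(50):
    g = grad(x)
    x_next = x - g / L  # gradient step with t_k = 1 / L
    # guaranteed decrease from the descent lemma: at least ||g||^2 / (2L)
    assert f(x_next) <= f(x) - np.linalg.norm(g) ** 2 / (2 * L) + 1e-9
    x = x_next

print("final objective value:", f(x))
```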
Proof that the dual norm of the Euclidean norm is the Euclidean norm itself
By definition, \(\norm{h}_\ast = \sup_{\norm{x} \leq 1} \inner{h}{x}\). By the Cauchy–Schwarz inequality, \(\inner{h}{x} \leq \norm{h} \norm{x} \leq \norm{h}\) for \(\norm{x} \leq 1\). Note the supremum is attained at the unit vector in the direction of \(h\) (i.e. at \(x = h / \norm{h}\) for \(h \neq 0\)), so \(\norm{h}_\ast = \norm{h}\).
Another Lemma
Lemma: Assume \(f: U \to \mathbb{R}\) with \(U\) open and \(f\) twice continuously differentiable on \(U\). Then \(f\) is \(L\)-smooth if and only if
\[\inner{\nabla^2 f(x) u}{v} \leq L \norm{u} \norm{v} \quad \text{for all } x \in U \text{ and all } u, v \in \mathbb{R}^n.\]
(Fawzi, 2023, Proposition 3.1)
Proof:
The following proof is from Fawzi (2023) with me filling in the gaps.
\((\Longleftarrow)\)
Let \(x, y \in \text{dom}(f)\). We use a theorem from Lang (1997, p. 475):
Theorem: Let \(U\) be open in \(E\) and let \(x \in U\). Let \(y \in E\). Let \(f: U \to F\) be a \(C^1\) map. Assume that the line segment \(x + ty\) with \(0 \leq t \leq 1\) is contained in \(U\). Then
\[f(x + y) - f(x) = \int_0^1 f^\prime(x + ty) \, y \, dt.\]
We apply this theorem to the map \(\nabla f: U \to \mathbb{R}^n\) (which is \(C^1\) because \(f\) is twice continuously differentiable), with \(h := y - x\) in place of Lang's \(y\); for real-valued \(f\), the derivative of \(\nabla f\) at a point \(z\) is \(\nabla^2 f(z)\), so
\[\nabla f(x + h) - \nabla f(x) = \int_0^1 \nabla^2 f(x + th) \, h \, dt.\]
Taking norms, and using the assumed condition with \(u = h\) and \(v = \nabla^2 f(x + th) h\) (which gives \(\norm{\nabla^2 f(x + th) h} \leq L \norm{h}\)),
\[\norm{\nabla f(y) - \nabla f(x)} \leq \int_0^1 \norm{\nabla^2 f(x + th) \, h} \, dt \leq \int_0^1 L \norm{h} \, dt = L \norm{y - x},\]
so \(f\) is \(L\)-smooth.
\((\Longrightarrow)\)
Assume \(f\) is \(L\)-smooth. Let \(x \in U\) and \(u, v \in \mathbb{R}^n\). Define \(\psi(t) := \inner{\nabla f(x + tu) - \nabla f(x)}{v}\). Note that by the Cauchy–Schwarz inequality,
\[\psi(t) \leq \norm{\nabla f(x + tu) - \nabla f(x)} \norm{v}.\]
Therefore, since \(f\) is \(L\)-smooth,
\[\psi(t) \leq L \norm{tu} \norm{v} = L t \norm{u} \norm{v} \quad \text{for } t > 0,\]
so
\[\frac{\psi(t)}{t} \leq L \norm{u} \norm{v} \quad \text{for } t > 0.\]
Also since \(\psi(t) = \inner{\nabla f(x + tu) - \nabla f(x)}{v}\), we have \(\psi(0) = 0\), and therefore
\[\psi^\prime(0) = \lim_{t \to 0^+} \frac{\psi(t) - \psi(0)}{t} = \lim_{t \to 0^+} \frac{\psi(t)}{t} \leq L \norm{u} \norm{v}.\]
Write \(\beta(t) = x + tu = z\) and \(\alpha(z) = \nabla f(z)\). Then by the chain rule,
\[\frac{d}{dt} \, \alpha(\beta(t)) = \alpha^\prime(\beta(t)) \, \beta^\prime(t) = \nabla^2 f(x + tu) \, u,\]
so \(\psi^\prime(t) = \inner{\nabla^2 f(x + tu) u}{v}\) and thus \(\psi^\prime(0) = \inner{\nabla^2 f(x) u}{v}\). Therefore \(\inner{\nabla^2 f(x) u}{v} \leq L \norm{u} \norm{v}\).
\(\square\)
Eigenvalues of the Hessian of an \(L\)-smooth function
Result: If the condition of the preceding Lemma holds, i.e. \(\inner{\nabla^2 f(x) u}{v} \leq L \norm{u} \norm{v}\) for all \(u, v \in \mathbb{R}^n\), then all eigenvalues of \(\nabla^2 f(x)\) are in \([-L, L]\).
(Fawzi 2023, Remark 3)
Proof (my work):
Assume the \(L\)-smooth Lemma condition holds. Let \(u\) be an eigenvector of \(\nabla^2 f(x)\) and \(\lambda\) be the corresponding eigenvalue. Then for any vector \(v\),
\(\inner{\nabla^2 f(x) u}{v} = \inner{\lambda u}{v} = \lambda \inner{u}{v} = \lambda \norm{u} \norm{v} \cos(\theta) \leq L \norm{u} \norm{v}\), where \(\theta\) is the angle between \(u\) and \(v\). For \(v \neq 0\), dividing by \(\norm{u} \norm{v} > 0\) gives \(\lambda \cos(\theta) \leq L\).
Case 1: \(\lambda > 0\). Taking \(v = u\) gives \(\cos(\theta) = 1\), so \(\lambda \cos(\theta) = \lambda \leq L\),
and thus \(\lambda \leq L\).
Also, certainly \(L \geq \lambda \cos(\theta) \geq \lambda (-1) = -\lambda\), so \(\lambda \geq -L\).
Case 2: \(\lambda < 0\). Again, taking \(v = -u\) gives \(\cos(\theta) = -1\), so \(-\lambda = \lambda \cos(\theta) \leq L\), i.e. \(\lambda \geq -L\).
Also, \(L \geq \lambda \cos(\theta) \geq \lambda (1) = \lambda\), so \(\lambda \leq L\).
So in both cases (and trivially when \(\lambda = 0\)), \(\lambda \in [-L, L]\).
\(\square\)
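A numerical illustration of this result on a nonconvex example of my own choosing: \(f(x) = \cos(x_1) + \cos(x_2)\) has a \(1\)-Lipschitz gradient, and its Hessian \(\mathrm{diag}(-\cos x_1, -\cos x_2)\) has all eigenvalues in \([-1, 1]\) at every sampled point.

```python
import numpy as np

rng = np.random.default_rng(5)

# f(x) = cos(x_1) + cos(x_2) is L-smooth with L = 1, but not convex.
hess = lambda x: np.diag(-np.cos(x))
L = 1.0

for _ in range(1000):
    x = rng.uniform(-10, 10, size=2)
    eigs = np.linalg.eigvalsh(hess(x))
    assert np.all(eigs >= -L - 1e-12) and np.all(eigs <= L + 1e-12)

print("all sampled Hessian eigenvalues lie in [-1, 1]")
```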
\(m\)-strong convexity
Definition
Let \(U \subseteq \mathbb{R}^n\). We say that \(f: U \to \mathbb{R}\) is \(m\)-strongly convex (with respect to the norm \(\norm{\cdot}\)) iff for any \(x, y \in \text{dom}(f)\) and \(t \in [0, 1]\),
\[f((1 - t) x + t y) \leq (1 - t) f(x) + t f(y) - \frac{m}{2} t (1 - t) \norm{x - y}^2.\]
(Fawzi, 2023, Section 3.3)
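A quick numerical check of this definition (my own example): \(f(x) = \frac{1}{2} x^\top Q x\) with \(Q \succ 0\) is \(m\)-strongly convex with \(m = \lambda_{\min}(Q)\), and the defining inequality holds at randomly sampled \(x\), \(y\), \(t\).

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((3, 3))
Q = A.T @ A + 0.5 * np.eye(3)  # positive definite

f = lambda x: 0.5 * x @ Q @ x
m = np.linalg.eigvalsh(Q).min()  # strong convexity parameter

for _ in range(1000):
    x, y = rng.standard_normal(3), rng.standard_normal(3)
    t = rng.uniform()
    lhs = f((1 - t) * x + t * y)
    rhs = (1 - t) * f(x) + t * f(y) - 0.5 * m * t * (1 - t) * np.linalg.norm(x - y) ** 2
    assert lhs <= rhs + 1e-9

print("m-strong convexity inequality held at all sampled triples")
```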
Lemma 1
Lemma: \(f\) is \(m\)-strongly convex and differentiable iff for all \(x, y \in \text{dom}(f)\),
\[f(y) \geq f(x) + \inner{\nabla f(x)}{y - x} + \frac{m}{2} \norm{y - x}^2.\]
(Fawzi, 2023, Section 3.3)
Proof:
This was assigned as an Exercise in Fawzi (2023, Section 3.3). The following is my answer.
\((\Longrightarrow)\)
Assume \(f\) is \(m\)-strongly convex and differentiable. Let \(x, y \in \text{dom}(f)\) and write \(h = y - x\). Since \(f\) is differentiable, using the definition of differentiability,
\[f(x + h^\prime) = f(x) + \inner{\nabla f(x)}{h^\prime} + \varphi(h^\prime) \quad \text{for any increment } h^\prime,\]
where \(\varphi(h)\) is \(o(h)\) as \(h \to 0\).
[Claim: If \(\varphi(h)\) is \(o(h)\) as \(h \to 0\), then \(\varphi(th)\) is \(o(t)\) as \(t \to 0\) (namely, that \(\lim_{t \to 0} (\varphi(th) / \abs{t}) = 0\)).
Proof: \(\varphi(h)\) is \(o(h)\) as \(h \to 0\) is equivalent to
\[\lim_{h \to 0} \frac{\abs{\varphi(h)}}{\norm{h}} = 0.\]
Let \(\varepsilon > 0\). We want to find a \(\delta > 0\) such that \(\abs{t} < \delta \implies \abs{\varphi(th)} < \abs{t} \cdot \varepsilon\). Note that by the limit above there is an \(\alpha > 0\) such that \(\norm{u} < \alpha \implies \abs{\varphi(u)} < \frac{\varepsilon}{\norm{h}} \norm{u}\), and that \(\norm{th} = \abs{t} \norm{h}\).
Thus letting \(\delta = \alpha / \norm{h}\) we have \(\abs{t} < \delta \implies \abs{\varphi(th)} < \frac{\varepsilon}{\norm{h}} \abs{t} \norm{h} = \abs{t} \cdot \varepsilon\), so \(\lim_{t \to 0} (\varphi(th) / \abs{t}) = 0\), i.e. \(\varphi(th)\) is \(o(t)\) as \(t \to 0\). \(\square\)]
Thus, applying this with increment \(th\) (where \(h = y - x\)), we have
\[f(x + th) = f(x) + t \inner{\nabla f(x)}{y - x} + \varphi(th),\]
where \(\varphi(th)\) is \(o(t)\) as \(t \to 0\) by the Claim.
Therefore, since \(f(x + th) = f((1 - t) x + t y) \leq (1 - t) f(x) + t f(y) - \frac{m}{2} t (1 - t) \norm{x - y}^2\) by strong convexity,
\[f(x) + t \inner{\nabla f(x)}{y - x} + \varphi(th) \leq (1 - t) f(x) + t f(y) - \frac{m}{2} t (1 - t) \norm{x - y}^2,\]
which implies
\[\inner{\nabla f(x)}{y - x} + \frac{\varphi(th)}{t} \leq f(y) - f(x) - \frac{m}{2} (1 - t) \norm{x - y}^2\]
(we have divided by \(t \neq 0\), taking \(t \in (0, 1)\)). Now taking the limit of both sides as \(t \to 0^+\),
\[f(y) \geq f(x) + \inner{\nabla f(x)}{y - x} + \frac{m}{2} \norm{y - x}^2,\]
which is the desired expression.
\((\Longleftarrow)\)
Assume for all \(x, y \in \text{dom}(f)\),
\[f(y) \geq f(x) + \inner{\nabla f(x)}{y - x} + \frac{m}{2} \norm{y - x}^2.\]
Using the same technique as in the proof of the convexity hyperplane condition (Ahmadi, 2025c) (applying the assumed inequality with base point \(z\) to both \(x\) and \(y\), multiplying by \(t\) and \(1 - t\) respectively, and adding), we have for \(z = t x + (1 - t) y\),
\[f(z) \leq t f(x) + (1 - t) f(y) - \frac{m}{2} t (1 - t) \norm{x - y}^2,\]
the desired condition.
\(\square\)
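The sketch below (again a quadratic example of my own choosing) checks the gradient lower bound of Lemma 1 at random pairs of points.

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((4, 4))
Q = A.T @ A + np.eye(4)  # positive definite

f = lambda x: 0.5 * x @ Q @ x
grad = lambda x: Q @ x
m = np.linalg.eigvalsh(Q).min()

# Check f(y) >= f(x) + <grad f(x), y - x> + (m / 2) * ||y - x||^2.
for _ in range(1000):
    x, y = rng.standard_normal(4), rng.standard_normal(4)
    lower = f(x) + grad(x) @ (y - x) + 0.5 * m * np.linalg.norm(y - x) ** 2
    assert f(y) >= lower - 1e-9

print("Lemma 1 lower bound held at all sampled pairs")
```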
Lemma 2
Lemma: If \(f\) is twice continuously differentiable on its domain (assumed open), then \(f\) is \(m\)-strongly convex if and only if \(\inner{\nabla^2 f(x) h}{h} \geq m \norm{h}^2\) for all \(x \in \text{dom}(f)\) and all \(h \in \mathbb{R}^n\).
(Fawzi, 2023, Section 3.3)
Proof:
This was assigned as an Exercise in Fawzi (2023, Section 3.3). The following is my answer.
\((\Longrightarrow)\)
Assume \(f\) is \(m\)-strongly convex. Fix \(x \in \text{dom}(f)\) and \(h \in \mathbb{R}^n\), and define
\[\phi(t) = f(x + th) - f(x) - t \inner{\nabla f(x)}{h} - \frac{m}{2} t^2 \norm{h}^2.\]
Observe that \(\phi(t) \geq 0 \iff f(x + th) \geq f(x) + t \inner{\nabla f(x)}{h} + \frac{m}{2} \norm{th}^2\),
so (taking \(y\) as \(x + th\) in Lemma 1) strong convexity of \(f\) implies \(\forall t \quad \phi(t) \geq 0\).
Observe \(\phi(0) = 0\), so \(t = 0\) is a global minimum of \(\phi\). Therefore, by the second-order necessary condition, \(\phi^{\prime\prime}(0) \geq 0\). Also,
\[\phi^{\prime\prime}(t) = \inner{h}{\nabla^2 f(x + th) h} - m \norm{h}^2,\]
so \(\phi^{\prime\prime}(0) \geq 0\) is \(\inner{h}{\nabla^2 f(x) h} \geq m \norm{h}^2\).
\((\Longleftarrow)\)
Assume \(\forall\, x \in \text{dom}(f) \enspace \forall\, h \in \mathbb{R}^n \quad \inner{h}{\nabla^2 f(x) h} \geq m \norm{h}^2\).
Letting \(\phi(t)\) be defined as above (for arbitrary fixed \(x \in \text{dom}(f)\) and \(h \in \mathbb{R}^n\)), we have \(\forall \, t \quad \phi^{\prime\prime}(t) \geq 0\). Therefore \(\phi^\prime\) is nondecreasing.
Since \(\phi(0) = \phi^\prime(0) = 0\) and \(\phi^\prime\) is nondecreasing, \(\phi^\prime(t) \geq 0\) for \(t \geq 0\), and hence \(\phi(1) \geq 0\), i.e.
\[f(x + h) \geq f(x) + \inner{\nabla f(x)}{h} + \frac{m}{2} \norm{h}^2.\]
Since \(x\) and \(h\) were arbitrary, Lemma 1 gives that \(f\) is \(m\)-strongly convex. \(\square\)
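To see the Hessian condition of Lemma 2 on a non-quadratic example (a function of my own choosing): \(f(x) = \frac{1}{2} \norm{x}^2 + \log(1 + e^{x_1})\) is \(1\)-strongly convex, since its Hessian is the identity plus a positive semidefinite diagonal term.

```python
import numpy as np

rng = np.random.default_rng(8)

# f(x) = 0.5 * ||x||^2 + log(1 + exp(x_1)); its Hessian is I + diag(s * (1 - s), 0, 0),
# where s = sigmoid(x_1), so <h, Hess f(x) h> >= ||h||^2 (i.e. m = 1).
def hess(x):
    s = 1.0 / (1.0 + np.exp(-x[0]))
    H = np.eye(3)
    H[0, 0] += s * (1.0 - s)
    return H

m = 1.0
for _ in range(1000):
    x = rng.standard_normal(3)
    h = rng.standard_normal(3)
    assert h @ hess(x) @ h >= m * np.linalg.norm(h) ** 2 - 1e-9

print("quadratic-form lower bound held at all sampled points and directions")
```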
Lemma 3
Lemma: A twice continuously differentiable convex function \(f\) is \(L\)-smooth on \(U\) iff for all \(x \in U\), \(\nabla^2 f(x) \preceq LI\) (i.e. \(LI - \nabla^2 f(x)\) is positive semidefinite), i.e. all the eigenvalues of \(\nabla^2 f(x)\) are \(\leq L\). Similarly, a twice continuously differentiable function \(f\) is \(m\)-strongly convex on \(U\) iff for all \(x \in U\), \(\nabla^2 f(x) \succeq mI\), i.e. all the eigenvalues of \(\nabla^2 f(x)\) are \(\geq m\).
(Fawzi, 2023, Remark 4)
Proof:
This was not proven in Fawzi, 2023. The following is my proof.
Let \(x \in U\). By the Lemma above (using convexity of \(f\) for the reverse direction), \(f\) is \(L\)-smooth iff \(\inner{h}{\nabla^2 f(x) h} \leq L \norm{h}^2\) for all \(h\), i.e. iff \(\nabla^2 f(x) \preceq LI\); and by Lemma 2, \(f\) is \(m\)-strongly convex iff \(\nabla^2 f(x) \succeq mI\). It remains to relate these matrix inequalities to the eigenvalues of \(\nabla^2 f(x)\).
\(\nabla^2 f(x) \succeq mI \iff \nabla^2 f(x) - mI \succeq 0\)
\(\iff \forall\, h \in \mathbb{R}^n \setminus \{0\} \quad \inner{h}{(\nabla^2 f(x) - mI) h} \geq 0\)
Let \(u\) be an eigenvector of \(\nabla^2 f(x)\) with \(\lambda\) its corresponding eigenvalue. Then we have
\[0 \leq \inner{u}{(\nabla^2 f(x) - mI) u} = (\lambda - m) \norm{u}^2,\]
so \(\lambda \geq m\). (Conversely, if every eigenvalue \(\lambda_i \geq m\), then writing \(h\) in an orthonormal eigenbasis of the symmetric matrix \(\nabla^2 f(x)\) gives \(\inner{h}{(\nabla^2 f(x) - mI) h} = \sum_i (\lambda_i - m) h_i^2 \geq 0\), so \(\nabla^2 f(x) \succeq mI\).)
Similarly, \(\nabla^2 f(x) \preceq LI\) iff \(\forall\, h \in \mathbb{R}^n \quad \inner{h}{(LI - \nabla^2 f(x)) h} \geq 0\); taking \(h = u\) gives \((L - \lambda) \norm{u}^2 \geq 0\),
so \(\lambda \leq L\), and the converse direction follows from the eigenbasis argument above. \(\square\)
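A numerical illustration of Lemma 3 (my own example): for \(f(x) = \frac{1}{2} x^\top Q x\) with \(Q \succ 0\), the constant Hessian \(Q\) satisfies \(mI \preceq Q \preceq LI\) with \(m = \lambda_{\min}(Q)\) and \(L = \lambda_{\max}(Q)\), which is equivalent to the quadratic-form sandwich checked below.

```python
import numpy as np

rng = np.random.default_rng(9)
A = rng.standard_normal((5, 5))
Q = A.T @ A + 0.2 * np.eye(5)  # positive definite; f(x) = 0.5 * x^T Q x

eigs = np.linalg.eigvalsh(Q)   # the Hessian of f is the constant matrix Q
m, L = eigs.min(), eigs.max()

# m I <= Q <= L I is equivalent to m ||h||^2 <= <Q h, h> <= L ||h||^2 for all h.
for _ in range(1000):
    h = rng.standard_normal(5)
    quad = h @ Q @ h
    n2 = np.linalg.norm(h) ** 2
    assert m * n2 - 1e-9 <= quad <= L * n2 + 1e-9

print("m =", m, " L =", L, "; quadratic-form sandwich held at all samples")
```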
References
Ahmadi, Amir Ali. (2025a). Convex and conic optimization, Lecture 3 [Lecture notes].
Ahmadi, Amir Ali. (2025b). Convex and conic optimization, Lecture 4 [Lecture notes].
Ahmadi, Amir Ali. (2025c). Convex and conic optimization, Lecture 7 [Lecture notes].
Fawzi, Hamza. (2023). Topics in convex optimization, Lecture 3: Smoothness and strong convexity [Lecture notes].
Lang, Serge. (1997). Undergraduate analysis (Second ed.). Springer Science+Business Media, Inc.
Royer, Clément W. (2022). Lecture notes on advanced gradient descent [Lecture notes].
How to cite this article
Wayman, Eric Alan. (2025). L-smoothness and strong convexity. Eric Alan Wayman's technical notes. https://ericwayman.net/notes/L-smoothness-strong-convexity/
@misc{wayman2025L-smoothness-strong-convexity,
title={L-smoothness and strong convexity},
author={Wayman, Eric Alan},
journal={Eric Alan Wayman's technical notes},
url={https://ericwayman.net/notes/L-smoothness-strong-convexity/},
year={2025}
}