AA/EE/ME 578 Review

# Convex Sets

## Examples

### Subspaces

$$S \subseteq \mathbf{R}^{n}$$ is a subspace if $\displaylines{x, y \in S, \quad \lambda, \mu \in \mathbf{R} \quad \Longrightarrow \lambda x+\mu y \in S }$ Geometrically: $$x, y \in S \Rightarrow$$ plane through $$0, x, y \subseteq S$$.

### Affine Sets

$$S \subseteq \mathbf{R}^n$$ is affine if $\displaylines{x, y \in S, \quad \lambda, \mu \in \mathbf{R}, \quad \lambda+\mu=1 \Longrightarrow \lambda x+\mu y \in S }$ Geometrically: $$x, y \in S \Rightarrow$$ line through $$x, y \subseteq S$$.

### Convex Sets

$$S \subseteq \mathbf{R}^n$$ is a convex set if $\displaylines{x, y \in S, \quad \lambda, \mu \geq 0, \quad \lambda+\mu=1 \Longrightarrow \lambda x+\mu y \in S }$ Geometrically: $$x, y \in S \Rightarrow$$ segment $$[x, y] \subseteq S$$.

### Convex Cone

$$S \subseteq \mathbf{R}^n$$ is a cone if $\displaylines{x \in S, \quad \lambda \geq 0 \Longrightarrow \lambda x \in S }$ $$S \subseteq \mathbf{R}^n$$ is a convex cone if $\displaylines{x, y \in S, \quad \lambda, \mu \geq 0 \Longrightarrow \lambda x+\mu y \in S }$ Geometrically: $$x, y \in S \Rightarrow$$ 'pie slice' between $$x, y \subseteq S$$.

### Combinations and Hulls

$$y=\theta_{1} x_{1}+\cdots+\theta_{k} x_{k}$$ is a

• linear combination of $$x_1, \dots, x_k$$
• affine combination if $$\sum \theta_i = 1$$
• convex combination if $$\sum \theta_i = 1,\ \theta_i \ge 0$$
• conic combination if $$\theta_i \ge 0$$.

(linear, ...) hull of $$S$$ is the set of all (linear, ...) combinations from $$S$$. $\displaylines{\operatorname{conv}(S)=\bigcap\{G \mid S \subseteq G, \ G \text { convex }\} }$

convex hull $$\operatorname{conv}(S)$$ is the set of all convex combinations of points in $$S$$.

### Hyperplanes and Halfspaces

• Hyperplane: $$\left\{x \mid a^{T} x=b\right\}(a \neq 0)$$
• Halfspace: $$\left\{x \mid a^{T} x \leq b\right\}(a \neq 0)$$

$$a$$ is the normal vector. Hyperplanes are affine and convex; halfspaces are convex.

### Euclidean Balls and Ellipsoids

Euclidean ball with center $$x_c$$ and radius $$r$$: $\displaylines{B\left(x_{c}, r\right)=\left\{x \mid \left\|x-x_{c}\right\|_{2} \leq r\right\}=\left\{x_{c}+r u \mid \|u\|_{2} \leq 1\right\} }$ Ellipsoid: $\displaylines{\left\{x \mid \left(x-x_{c}\right)^{T} P^{-1}\left(x-x_{c}\right) \leq 1\right\} }$ with $$P \in \mathbf{S}_{++}^{n}$$. Other representation: $$\left\{x_{c}+A u \mid \|u\|_{2} \leq 1\right\}$$ with $$A$$ square and nonsingular.

### Norm Balls and Norm Cones

Norm: a function $$\|\cdot\|$$ that satisfies

• $$\|x\| \geq 0 ; \ \|x\|=0$$ if and only if $$x=0$$
• $$\|t x\|=|t|\|x\|$$ for $$t \in \mathbf{R}$$
• $$\|x+y\| \leq\|x\|+\|y\|$$

Norm ball with center $$x_c$$ and radius $$r$$: $$\left\{x \mid \left\|x-x_{c}\right\| \leq r\right\}$$

Norm cone: $$\{(x, t) \mid \|x\| \leq t\}$$

### Polyhedron

Solution set of finitely many linear inequalities and equalities $\displaylines{A x \preceq b, \qquad C x=d }$ ($$A \in \mathbf{R}^{m \times n}, \ C \in \mathbf{R}^{p \times n}, \ \preceq$$ is component-wise inequality)

Polyhedron is intersection of finite number of halfspaces and hyperplanes.

### Positive Semidefinite Cone

• $$\mathbf{S}_{+}^{n}=\left\{X \in \mathbf{S}^{n} \mid X \succeq 0\right\}$$: PSD matrices. A convex cone.
• $$\mathbf{S}_{++}^{n}=\left\{X \in \mathbf{S}^{n} \mid X\succ 0\right\}$$: PD matrices.

## Operations That Preserve Convexity

To show $$C$$ is convex set

1. Definition $\displaylines{x_{1}, x_{2} \in C, \quad 0 \leq \theta \leq 1 \quad \Longrightarrow \quad \theta x_{1}+(1-\theta) x_{2} \in C }$

2. Show that $$C$$ is obtained from simple convex sets by operations that preserve convexity

• intersection
• affine function
• perspective function
• linear-fractional functions

### Intersection

The intersection of (any number of, even infinite) convex sets is convex.

### Affine Function

Suppose $$f : \mathbf{R}^{n} \rightarrow \mathbf{R}^{m}$$ is affine ($$f(x)=A x+b$$ with $$A \in \mathbf{R}^{m \times n}, \ b \in \mathbf{R}^{m}$$)

• The image of a convex set under $$f$$ is convex $\displaylines{S \subseteq \mathbf{R}^{n} \text { convex } \Longrightarrow f(S)=\{f(x) \mid x \in S\} }$

• The inverse image $$f^{-1}(C)$$ of a convex set under $$f$$ is convex $\displaylines{C \subseteq \mathbf{R}^{m} \text{ convex } \quad \Longrightarrow \quad f^{-1}(C)=\left\{x \in \mathbf{R}^{n} \mid f(x) \in C\right\} \text{ convex } }$

### Perspective and Linear-Fractional Function

Perspective function $$P : \mathbf{R}^{n+1} \rightarrow \mathbf{R}^{n}$$ $\displaylines{P(x, t)=x / t, \quad \operatorname{dom} P=\{(x, t) \mid t>0\} }$ images and inverse images of convex sets under perspective are convex.

Linear-fractional function $$f : \mathbf{R}^{n} \rightarrow \mathbf{R}^{m}$$ $\displaylines{f(x)=\frac{A x+b}{c^{T} x+d}, \qquad \text { dom } f=\left\{x \mid c^{T} x+d>0\right\} }$ images and inverse images of convex sets under linear-fractional functions are convex.

## Generalized Inequalities

A convex cone $$K \subseteq \mathbf{R}^{n}$$ is a proper cone if

• $$K$$ is closed (contains its boundary)
• $$K$$ is solid (has nonempty interior)
• $$K$$ is pointed (contains no line)

Generalized inequality defined by a proper cone $$K$$: $\displaylines{x \preceq_{K} y \ \Longleftrightarrow \ y-x \in K, \qquad x\preceq_{K} y \ \Longleftrightarrow \ y-x \in \operatorname{int} K }$

## Hyperplane Theorem

### Separating Hyperplane

If $$C$$ and $$D$$ are disjoint convex sets, then there exists $$a \ne 0, \ b$$ such that $\displaylines{a^{T} x \leq b \ \text { for }\ x \in C, \qquad a^{T} x \geq b \ \text { for } \ x \in D }$ the hyperplane $$\left\{x \mid a^{T} x=b\right\}$$ separates $$C$$ and $$D$$. Strict separation requires additional assumptions (e.g. $$C$$ is closed, $$D$$ is a singleton).

### Supporting Hyperplane

Supporting hyperplane to set $$C$$ at boundary point $$x_0$$: $\displaylines{\left\{x \mid a^{T} x=a^{T} x_{0}\right\} }$ where $$a \ne 0$$ and $$a^T x \le a^T x_0$$ for all $$x \in C$$.

Supporting hyperplane theorem: If $$C$$ is convex, then there exists a supporting hyperplane at every boundary point of $$C$$.

## Dual Cones

Dual cone of a cone $$K$$: $$K^*=\left\{y \mid y^{T} x \geq 0 \text { for all } x \in K\right\}$$

Examples:

• $$K=\mathbf{R}_{+}^{n} : K^{*}=\mathbf{R}_{+}^{n}$$
• $$K=\mathbf{S}_{+}^{n} : K^{*}=\mathbf{S}_{+}^{n}$$
• $$K=\left\{(x, t) \mid\|x\|_{2} \leq t\right\} : K^{*}=\left\{(x, t) \mid\|x\|_{2} \leq t\right\}$$
• $$K=\left\{(x, t) \mid\|x\|_{1} \leq t\right\} : K^{*}=\left\{(x, t) \mid\|x\|_{\infty} \leq t\right\}$$

# Convex Functions

$$f : \mathbf{R}^{n} \rightarrow \mathbf{R}$$ is convex if $$\operatorname{dom} f$$ is a convex set and $\displaylines{f(\theta x+(1-\theta) y) \leq \theta f(x)+(1-\theta) f(y) }$ for all $$x, y \in \operatorname{dom} f, \ 0\le \theta \le 1$$

## Examples

### Convex

• affine: $$a^T x + b$$
• exponential: $$e^{ax}$$, for any $$a \in \mathbf{R}$$
• powers: $$x^\alpha$$ on $$\mathbf{R}_{++}$$, for $$\alpha \ge 1$$ or $$a \le 0$$
• powers of absolute value: $$|x|^p$$ on $$\mathbf{R}$$, for $$p \ge 1$$
• negative entropy: $$x\log x$$ on $$\mathbf{R}_{++}$$
• norms: $$\|x\|_{p}=\left(\sum_{i=1}^{n}\left|x_{i}\right|^{p}\right)^{1 / p}$$ for $$p \ge 1$$
• affine on matrices: $$f(X)=\operatorname{tr}\left(A^{T} X\right)+b$$
• spectral norm: $$f(X) = \|X\|_2 = \sigma_{\max}(X)$$
• quadratic: $$f(x)=(1 / 2) x^{T} P x+q^{T} x+r$$ with $$P \in \mathbf{S}^{n}$$
• least-squares: $$f(x) = \|Ax - b\|_2^2$$
• quadratic-over-linear: $$f(x, y) = x^2/y$$ with $$y > 0$$
• log-sum-exp: $$f(x)=\log \sum_{k=1}^{n} \exp x_{k}$$

### Concave

• $$f$$ in concave if $$-f$$ is convex
• affine
• powers: $$x^\alpha$$ on $$\mathbf{R}_{++}$$, for $$0\le \alpha \le 1$$
• logarithm: $$\log x$$ on $$\mathbf{R}_{++}$$
• $$\log\det X$$ on $$\mathbf{S}_{++}^n$$
• geometric mean: $$f(x)=\left(\prod_{k=1}^{n} x_{k}\right)^{1 / n}$$ on $$\mathbf{R}_{++}^n$$

## Properties

### Restriction of a Convex Function to a Line

$$f : \mathbf{R}^{n} \rightarrow \mathbf{R}$$ is convex iff the function $$g : \mathbf{R}^{n} \rightarrow \mathbf{R}$$ $\displaylines{g(t)=f(x+t v), \qquad \operatorname{dom} g=\{t \mid x+t v \in \operatorname{dom} f\} }$

is convex (in $$t$$) for any $$x \in \operatorname{dom} f, \ v\in \mathbf{R}^n$$

### First-Order Convexity Condition

Differentiable $$f$$ with convex domain is convex iff $\displaylines{f(y) \geq f(x)+\nabla f(x)^{T}(y-x) \quad \text { for all } x, y \in \operatorname{dom} f }$

### Second-Order Convexity Condition

Twice differentiable $$f$$ with convex domain is convex iff $\displaylines{\nabla^{2} f(x) \succeq 0 \quad \text { for all } x \in \operatorname{dom} f }$

### Epigraph and Sublevel Set

$$\alpha$$-sublevel set of $$f : \mathbf{R}^{n} \rightarrow \mathbf{R}$$: $\displaylines{C_{\alpha}=\{x \in \operatorname{dom} f \mid f(x) \leq \alpha\} }$ sublevel sets of convex functions are convex (converse is false)

Epigraph of $$f : \mathbf{R}^{n} \rightarrow \mathbf{R}$$: $\displaylines{\operatorname{epi}f=\left\{(x, t) \in \mathbf{R}^{n+1} \mid x \in \operatorname{dom} f, \ f(x) \leq t\right\} }$ $$f$$ is convex iff $$\operatorname{epi}f$$ is a convex set.

### Jensen's Inequality

If $$f$$ is convex, then for $$0 \le \theta \le 1$$ $\displaylines{f(\theta x+(1-\theta) y) \leq \theta f(x)+(1-\theta) f(y) }$ can be extended to $\displaylines{f(\mathbf{E}\ z) \leq \mathbf{E}\ f(z) }$ for any random variable $$z$$

## Operations That Preserve Convexity

To show $$f$$ is convex function:

1. Verify definition (often simplified by restricting to a line)
2. For twice differentiable functions, show $$\nabla^2 f(x) \succeq 0$$
3. Show that $$f$$ is obtained from simple convex functions by operations that preserve convexity
• Nonnegative multiple: $$\alpha f$$ is convex if $$f$$ is convex, $$\alpha \ge 0$$

• Sum: $$f_1 + f_2$$ is convex if $$f_1, f_2$$ convex

• Composition with affine function: $$f(Ax + b)$$ is convex if $$f$$ is convex

• Pointwise maximum: if $$f_1, \dots, f_m$$ are convex, then $$f(x)=\max \left\{f_{1}(x), \ldots, f_{m}(x)\right\}$$ is convex

• Pointwise supremum: if $$f(x, y)$$ is convex in $$x$$ for each $$y\in \mathcal{A}$$, then $$g(x)=\sup _{y \in \mathcal{A}} f(x, y)$$ is convex.

• Composition of $$g : \mathbf{R}^{n} \rightarrow \mathbf{R}$$ and $$h : \mathbf{R} \rightarrow \mathbf{R}$$: $$f(x) = h(g(x))$$ is convex if
• $$g$$ convex, $$h$$ convex, $$\tilde{h}$$ nondecreasing
• $$g$$ concave, $$h$$ convex, $$\tilde{h}$$ nonincreasing
• Composition of $$g : \mathbf{R}^{n} \rightarrow \mathbf{R}^k$$ and $$h : \mathbf{R}^k \rightarrow \mathbf{R}$$: $$f(x) = h(g(x)) = h(g_1(x), \dots, g_k(x))$$ is convex if
• $$g_i$$ convex, $$h$$ convex, $$\tilde{h}$$ nondecreasing in each argument
• $$g_i$$ concave, $$h$$ convex, $$\tilde{h}$$ nonincreasing in each argument
• Minimization: if $$f(x, y)$$ is convex in $$(x, y)$$ and $$C$$ is convex set, then $$g(x)=\inf _{y \in C} f(x, y)$$ is convex

• Perspective if a function $$f : \mathbf{R}^{n} \rightarrow \mathbf{R}$$ is the function $$g : \mathbf{R}^{n} \times \mathbf{R} \rightarrow \mathbf{R}$$ $\displaylines{ g(x, t)=t f(x / t), \qquad \operatorname{dom} g=\{(x, t) \mid x / t \in \operatorname{dom} f, t>0\} }$ $$g$$ is convex if $$f$$ is convex

## Conjugate Function

The conjugate of a function $$f$$: $$f^{*}(y)=\sup _{x \in \operatorname{dom} f}\left(y^{T} x-f(x)\right)$$ is always convex.

## Quasiconvex Function

$$f : \mathbf{R}^{n} \rightarrow \mathbf{R}$$ is quasiconvex if $$\operatorname{dom} f$$ is convex and the sublevel sets $\displaylines{S_{\alpha}=\{x \in \operatorname{dom} f \mid f(x) \leq \alpha\} }$ are convex for all $$\alpha$$

$$f$$ is quasiconcave if $$-f$$ is quasiconvex.

• Modified Jensen inequality: $$0 \leq \theta \leq 1 \ \Longrightarrow \ f(\theta x+(1-\theta) y) \leq \max \{f(x), f(y)\}$$
• First-order condition: $$f(y) \leq f(x) \quad \Longrightarrow \quad \nabla f(x)^{T}(y-x) \leq 0$$
• Sums of quasiconvex functions are not necessarily quasiconvex

## Log-Concave and Log-Convex Function

A positive function $$f$$ is log-concave if $$\log f$$ is concave: $\displaylines{f(\theta x+(1-\theta) y) \geq f(x)^{\theta} f(y)^{1-\theta} \quad \text { for }\ 0 \leq \theta \leq 1 }$ Many common probability densities are log-concave, e.g., normal distribution.

• Second-order condition: $$f(x) \nabla^{2} f(x) \preceq \nabla f(x) \nabla f(x)^{T}$$
• Product of log-concave functions is log-concave
• Sum of log-concave functions is not always log-concave
• Integration: if $$f : \mathbf{R}^{n} \times \mathbf{R}^{m} \rightarrow \mathbf{R}$$ is log-concave, then $$\displaystyle g(x)=\int f(x, y) dy$$ is log-concave

# Convex Optimization Problems

## Optimization Problem

$\displaylines{\begin{array}{cl} \operatorname{minimize} & {f_{0}(x)} \\ \text { subject to } & {f_{i}(x) \leq 0, \quad i=1, \ldots, m} \\ & {h_{i}(x)=0, \quad i=1, \ldots, p}\end{array} }$

• $$x \in \mathbf{R}^n$$ is the optimization variable
• $$f_0: \mathbf{R}^n \rightarrow \mathbf{R}$$ is the objective or cost function
• $$f_0: \mathbf{R}^n \rightarrow \mathbf{R}, i = 1, \dots, m$$ are the inequality constraint functions
• $$h_i: \mathbf{R}^n \rightarrow \mathbf{R}$$ are the equality constraint functions

Optimal value $$p^{\star}=\inf \left\{f_{0}(x) \mid f_{i}(x) \leq 0,\ i=1, \ldots, m,\ h_{i}(x)=0, \ i=1, \ldots, p\right\}$$

• $$p^\star = \infty$$ if problem is infeasible
• $$p^\star = -\infty$$ if problem is unbounded below

## Feasibility Problem

$\displaylines{\begin{array}{cl}{\operatorname{minimize}} & {0} \\ {\text { subject to }} & {f_{i}(x) \leq 0, \quad i=1, \ldots, m} \\ {} & {h_{i}(x)=0, \quad i=1, \ldots, p}\end{array} }$

• $$p^\star = 0$$ if constraints are feasible; any feasible $$x$$ is optimal
• $$p^\star = \infty$$ if constraints are infeasible

## Convex Optimization

$\displaylines{\begin{array}{ll}{\operatorname{minimize}} & {f_{0}(x)} \\ {\text { subject to }} & {f_{i}(x) \leq 0, \quad i=1, \ldots, m} \\ {} & Ax = b\end{array} }$

• $$f_0, \dots, f_m$$ are convex; equality constraints are affine
• Feasible set of a convex optimization problem is convex
• Any locally optimal point of a convex problem is globally optimal

### Optimality Criterion

$$x$$ is optimal iff it is feasible and $$\nabla f_{0}(x)^{T}(y-x) \geq 0$$ for all feasible $$y$$. If nonzero, $$\nabla f_0(x)$$ defines a supporting hyperplane to feasible set $$X$$ at $$x$$.

• unconstrained problem: $\displaylines{x \in \operatorname{dom} f_{0}, \quad \nabla f_{0}(x)=0 }$

• equality constrained problem: $\displaylines{\operatorname{minimize}\ f_{0}(x) \quad \operatorname{subject to} \ A x=b }$ x is optimal iff there exists a $$\nu$$ such that $\displaylines{x \in \operatorname{dom} f_{0}, \quad A x=b, \quad \nabla f_{0}(x)+A^{T} \nu=0 }$

• minimization over nonnegative orthant: $\displaylines{\operatorname{minimize}\ f_{0}(x) \quad \operatorname{subject to} \ x \succeq 0 }$ x is optimal iff $\displaylines{x \in \operatorname{dom} f_{0}, \qquad x \succeq 0, \qquad\left\{\begin{array}{ll}{\nabla f_{0}(x)_{i} \geq 0} & {x_{i}=0} \\ {\nabla f_{0}(x)_{i}=0} & {x_{i}>0}\end{array}\right. }$

### Equivalent Convex Problems

• eliminating equality constrains $\displaylines{\begin{array}{ll}{\operatorname{minimize}\ (\text {over } z)} & {f_{0}\left(F z+x_{0}\right)} \\ {\text {subject to }} & {f_{i}\left(F z+x_{0}\right) \leq 0, \quad i=1, \ldots, m}\end{array} }$ where $$F$$ and $$x_0$$ such that $$A x=b \ \Longleftrightarrow \ x=F z+x_{0}$$ for some $$z$$

• introducing slack variables for linear inequalities $\displaylines{\begin{array}{ll}{\operatorname{minimize}} & {f_{0}(x)} \\ {\text { subject to }} & {a_{i}^{T} x \leq b_{i}, \quad i=1, \ldots, m}\end{array} }$ is equivalent to $\displaylines{\begin{array}{ll}{\operatorname{minimize}\ (\text{over } x, s)} & {f_{0}(x)} \\ {\text {subject to }} & {a_{i}^{T} x+s_{i}=b_{i}, \quad i=1, \ldots, m} \\ {} & {s_{i} \geq 0, \quad i=1, \ldots m}\end{array} }$

• epigraph form: standard form convex problem is equivalent to $\displaylines{\begin{array}{ll}{\operatorname{minimize}\ (\text {over } x, t)} & {t} \\ {\text {subject to }} & {f_{0}(x)-t \leq 0} \\ {} & {f_{i}(x) \leq 0, \quad i=1, \ldots, m} \\ {} & {A x=b}\end{array} }$

## Quasiconvex Optimization

$\displaylines{\begin{array}{cl}{\operatorname{minimize}} & {f_{0}(x)} \\ {\text { subject to }} & {f_{i}(x) \leq 0, \quad i=1, \ldots, m} \\ {} & {A x=b}\end{array} }$

with $$f_0: \mathbf{R}^n \rightarrow \mathbf{R}$$ quasiconvex, $$f_1, \dots, f_m$$ convex. Can have locally optimal points that are not globally optimal

If $$f_0$$ is quasiconvex, there exists a family of functions $$\phi_t$$ such that:

• $$\phi_t(x)$$ is convex in $$x$$ for fixed $$t$$
• $$t$$-sublevel set of $$f_0$$ is $$0$$-sublevel set of $$\phi_t$$

For a fixed $$t$$, the quasiconvex optimization problem can be transferred to a convex feasibility problem in $$x$$. Bisection method can be used to find the optimal $$t$$.

## Linear Optimization

### Linear Programming

$\displaylines{\begin{array}{cl}{\text { minimize }} & {c^{T} x+d} \\ {\text { subject to }} & {G x \preceq h} \\ {} & {A x=b}\end{array} }$

• feasible set is a polyhedron.

### Linear-Fractional Programming

$\displaylines{\begin{array}{cl}{\text { minimize }} & {f_{0}(x)} \\ {\text { subject to }} & {G x \preceq h} \\ {} & {A x=b}\end{array} }$

where $\displaylines{f_{0}(x)=\frac{c^{T} x+d}{e^{T} x+f}, \qquad \operatorname{dom} f_{0}(x)=\left\{x \mid e^{T} x+f>0\right\} }$

• a quasiconvex optimization problem

• equivalent to the LP $\displaylines{\begin{array}{cl}{\text { minimize }} & {c^{T} y+d z} \\ {\text { subject to }} & {G y \preceq h z} \\ {} & {A y=b z} \\ {} & {e^{T} y+f z=1} \\ {} & {z \geq 0}\end{array} }$

$\displaylines{\begin{array}{cl}{\operatorname{minimize}} & {(1 / 2) x^{T} P x+q^{T} x+r} \\ {\text { subject to }} & {G x \preceq h} \\ {} & {A x=b}\end{array} }$

• $$P \in \mathbf{S}_+^n$$, so objective is convex quadratic
• minimize a convex quadratic function over a polyhedron

$\displaylines{\begin{array}{ll}{\operatorname{minimize}} & {(1 / 2) x^{T} P_{0} x+q_{0}^{T} x+r_{0}} \\ {\text { subject to }} & {(1 / 2) x^{T} P_{i} x+q_{i}^{T} x+r_{i} \leq 0, \quad i=1, \ldots, m} \\ {} & {A x=b}\end{array} }$

• $$P_i \in \mathbf{S}_+^n$$, objective and constraints are convex quadratic
• if $$P_1, \dots, P_m \in \mathbf{S}_{++}^n$$, feasible region is intersection of $$m$$ ellipsoids and an affine set

### Second-Order Cone Programming

$\displaylines{\begin{array}{cl}{\operatorname{minimize}} & {f^{T} x} \\ {\text { subject to }} & {\left\|A_{i} x+b_{i}\right\|_{2} \leq c_{i}^{T} x+d_{i}, \quad i=1, \ldots, m} \\ {} & {F x=g}\end{array} }$

• $$A_{i} \in \mathbf{R}^{n_{i} \times n}, F \in \mathbf{R}^{p \times n}$$
• inequalities are called second-order cone constraints
• for $$n_i = 0$$, reduces to an LP; if $$c_i = 0$$, reduces to a QCQP

## Geometric Programming

• Monomial function $\displaylines{f(x)=c x_{1}^{a_{1}} x_{2}^{a_{2}} \cdots x_{n}^{a_{n}}, \qquad \text { dom } f=\mathbf{R}_{++}^{n} }$ with $$c>0,\ \alpha_i \in \mathbf{R}$$

• Posynomial function: sum of monomials $\displaylines{f(x)=\sum_{k=1}^{K} c_{k} x_{1}^{a_{1 k}} x_{2}^{a_{2 k}} \cdots x_{n}^{a_{n k}}, \qquad \operatorname{dom} f=\mathbf{R}_{++}^{n} }$

• Geometric program $\displaylines{\begin{array}{ll}{\text { minimize }} & {f_{0}(x)} \\ {\text { subject to }} & {f_{i}(x) \leq 1, \quad i=1, \ldots, m} \\ {} & {h_{i}(x)=1, \quad i=1, \ldots, p}\end{array} }$ with $$f_i$$ posynomial, $$h_i$$ monomial

Geometric program in convex form: $\displaylines{\begin{array}{cl}{\operatorname{minimize}} & {\log \left(\sum_{k=1}^{K} \exp \left(a_{0 k}^{T} y+b_{0 k}\right)\right)} \\ {\text { subject to }} & {\log \left(\sum_{k=1}^{K} \exp \left(a_{i k}^{T} y+b_{i k}\right)\right) \leq 0, \quad i=1, \ldots, m} \\ {} & {G y+d=0}\end{array} }$

## Generalized Inequality Constraints

$\displaylines{\begin{array}{ll}{\operatorname{minimize}} & {f_{0}(x)} \\ {\text { subject to }} & {f_{i}(x) \preceq_{K_{i}} 0, \quad i=1, \ldots, m} \\ {} & {A x=b}\end{array} }$

• $$f_0: \mathbf{R}^n \rightarrow \mathbf{R}$$ convex; $$f_i: \mathbf{R}^n \rightarrow \mathbf{R}^{k_i}$$ $$K_i$$-convex w.r.t. proper cone $$K_i$$
• same properties as standard convex problem (convex feasible set, local optimum is global, etc.)

Conic form problem: $\displaylines{\begin{array}{cl} \operatorname{minimize} & c^{T} x \\ \text{subject to} & F x+g \preceq_{K} 0 \\ & A x=b \end{array} }$ extends LP ($$K = \mathbf{R}_+^m$$) to non-polyhedral cones

Semidefinite programming: $\displaylines{\begin{array}{cl} {\operatorname{minimize}} & {c^{T} x} \\ {\text { subject to }} & {x_{1} F_{1}+x_{2} F_{2}+\cdots+x_{n} F_{n}+G \preceq 0} \\ {} & {A x=b} \end{array} }$ with $$F_i, G \in \mathbf{S}^k$$

### Minimum and Minimal Elements

$$\preceq_K$$ is not in general a linear ordering: we can have $$x \npreceq_K y$$ and $$y \npreceq_K x$$

• $$x \in S$$ is the minimum element of $$S$$ w.r.t. $$\preceq_K$$ if $$y \in S \ \Longrightarrow \ x \preceq_K y$$
• $$x \in S$$ is the minimal element of $$S$$ w.r.t. $$\preceq_K$$ if $$y \in S, \ y \preceq_{K} x \ \Longrightarrow \ y=x$$

## Vector Optimization

General vector optimization problem: $\displaylines{\begin{array}{ll} \text{minimize (w.r.t. } K ) & f_{0}(x) \\ \text{subject to} & f_{i}(x) \leq 0, \quad i=1, \ldots, m \\ & h_{i}(x) \leq 0, \quad i=1, \ldots, p \end{array} }$ vector objective $$f_{0} : \mathbf{R}^{n} \rightarrow \mathbf{R}^{q}$$, minimized w.r.t. proper cone $$K \in \mathbf{R}^{q}$$

Convex vector optimization problem: $\displaylines{\begin{array}{ll} \text{minimize (w.r.t. } K ) & f_{0}(x) \\ \text{subject to} & f_{i}(x) \leq 0, \quad i=1, \ldots, m \\ & A x=b \end{array} }$

with $$f_0$$ $$K$$-convex, $$f_1,\dots, f_m$$ convex

### Optimal and Pareto Optimal Points

Set of achievable objective values $$\mathcal{O}=\left\{f_{0}(x) \mid x\ \text { feasible}\right\}$$

• feasible $$x$$ is optimal if $$f_0(x)$$ is a minimum value of $$\mathcal{O}$$
• feasible $$x$$ is Pareto optimal if $$f_0(x)$$ is a minimal value of $$\mathcal{O}$$

# Duality

## Lagrange Dual Function

From the standard form optimization problem we define the Lagrangian $$L : \mathbf{R}^{n} \times \mathbf{R}^{m} \times \mathbf{R}^{p} \rightarrow \mathbf{R}$$ $\displaylines{L(x, \lambda, \nu)=f_{0}(x)+\sum_{i=1}^{m} \lambda_{i} f_{i}(x)+\sum_{i=1}^{p} \nu_{i} h_{i}(x) }$

• weighted sum of objective and constraint functions
• $$\lambda_i$$ is Lagrange multiplier associated with $$f_i(x) \le 0$$
• $$\nu_i$$ is Lagrange multiplier associated with $$h_i(x) = 0$$

Lagrange dual function $$g : \mathbf{R}^{m} \times \mathbf{R}^{p} \rightarrow \mathbf{R}$$ \displaylines{\begin{aligned} g(\lambda, \nu) &=\inf_{x \in \mathcal{D}} L(x, \lambda, \nu) \\ &=\inf_{x \in \mathcal{D}}\left(f_{0}(x)+\sum_{i=1}^{m} \lambda_{i} f_{i}(x)+\sum_{i=1}^{p} \nu_{i} h_{i}(x)\right) \end{aligned} } $$g$$ is concave.

lower bound property: if $$\lambda \succeq 0$$, then $$g(\lambda, \nu) \leq p^{\star}$$

## Lagrange Dual Problem

$\displaylines{\begin{array}{ll}{\text { maximize }} & {g(\lambda, \nu)} \\ {\text { subject to }} & {\lambda \succeq 0}\end{array} }$

• find best lower bound on $$p^\star$$, obtained from Lagrange dual function
• a convex optimization problem; optimal value denoted $$d^\star$$
• $$\lambda, \nu$$ are dual feasible if $$\lambda \succeq 0, (\lambda, \nu) \in \operatorname{dom} g$$
• often simplified by making implicit constraint $$(\lambda, \nu) \in \operatorname{dom} g$$ explicit

## Optimality Conditions

• Weak duality $$d^\star \le p^\star$$ always holds, can be expressed as $\displaylines{\sup_{\lambda \succeq 0} \inf_{x} L(x, \lambda) \leq \inf_x \sup_{\lambda \succeq 0} L(x, \lambda) }$

• Strong duality $$d^\star = p^\star$$ usually holds for convex problems. It means that $$x^\star$$ and $$\lambda^\star$$ from a saddle-point for the Lagrangian.

### Slater's Constraint Qualification

Strong duality holds for a convex problem if it is strictly feasible, i.e., $\displaylines{\exists x \in \operatorname{int} \mathcal{D} : \qquad f_{i}(x)<0, \quad i=1, \ldots, m, \qquad A x=b }$

### KKT Conditions

If strong duality holds and $$x^\star, \lambda^\star, \nu^\star$$ are optimal, then they must satisfy:

1. primal constraints: $$f_{i}(x^\star) \leq 0,\ h_{i}(x^\star)=0$$

2. dual constraints: $$\lambda^\star \succeq 0$$

3. complementary slackness: $$\lambda_i^\star f_i(x^\star) = 0$$

4. gradient of Lagrangian with respect to $$x$$ vanishes: $\displaylines{\nabla f_{0}(x^\star)+\sum_{i=1}^{m} \lambda_{i}^\star \nabla f_{i}(x^\star)+\sum_{i=1}^{p} \nu_{i}^\star \nabla h_{i}(x^\star)=0 }$

If $$\tilde{x}, \tilde{\lambda}, \tilde{\nu}$$ satisfy KKT for a convex problem, then they are optimal

# Applications

## Geometric Problems

### Minimum Volume Ellipsoid Around a Set

Minimum volume ellipsoid $$\mathcal{E}$$ of a set $$C$$.

• parametrize $$\mathcal{E}$$ as $$\mathcal{E}=\left\{v \mid\|A v+b\|_{2} \leq 1\right\}$$, assume $$A \in \mathbf{S}_{++}^n$$

• $$\operatorname{vol} \mathcal{E}$$ is proportional to $$\det A^{-1}$$, can compute $$\mathcal{E}$$ by solving $\displaylines{\begin{array}{ll}{\operatorname{minimize}}\ (\text {over } A, b) & {\log \operatorname{det} A^{-1}} \\ {\text {subject to }} & {\sup _{v \in C}\|A v+b\|_{2} \leq 1}\end{array} }$

### Maximum Volume Inscribed Ellipsoid

Maximum volume ellipsoid $$\mathcal{E}$$ inside a convex set $$C \subseteq \mathbf{R}^n$$

• parametrize $$\mathcal{E}$$ as $$\mathcal{E}=\left\{B u+d \mid\|u\|_{2} \leq 1\right\}$$, assume $$B \in \mathbf{S}_{++}^n$$

• $$\operatorname{vol} \mathcal{E}$$ is proportional to $$\det B$$, can compute $$\mathcal{E}$$ by solving $\displaylines{\begin{array}{ll}{\text { maximize }} & {\log \operatorname{det} B} \\ {\text { subject to }} & {\sup _{\|u\|_{2} \leq 1} I_{C}(B u+d) \leq 0}\end{array} }$ where $$I_C(x) = 0$$ for $$x\in C$$ and $$I_C(x) = \infty$$ for $$x\notin C$$

### Linear Discrimination

Separate two sets of points $$\{x_1, \dots, x_N\},\ \{y_1, \dots, y_M\}$$ by a hyperplane: $\displaylines{a^{T} x_{i}+b>0, \ i=1, \ldots, N, \qquad a^{T} y_{i}+b<0, \ i=1, \ldots, M }$ homogeneous in $$a, b$$, hence equivalent to $\displaylines{a^{T} x_{i}+b \geq 1, \quad i=1, \ldots, N, \qquad a^{T} y_{i}+b \leq-1, \quad i=1, \ldots, M }$ To separate two sets of points by maximum margin $\displaylines{\begin{array}{cl}{\operatorname{minimize}} & {(1 / 2)\|a\|_{2}} \\ {\text { subject to }} & {a^{T} x_{i}+b \geq 1, \quad i=1, \ldots, N} \\ {} & {a^{T} y_{i}+b \leq-1, \quad i=1, \ldots, M}\end{array} }$

### Support Vector Classifier

$\displaylines{\begin{array}{cl}{\text { minimize }} & {\|a\|_{2}+\gamma\left(\mathbf{1}^{T} u+\mathbf{1}^{T} v\right)} \\ {\text { subject to }} & {a^{T} x_{i}+b \geq 1-u_{i}, \quad i=1, \ldots, N} \\ {} & {a^{T} y_{i}+b \leq-1+v_{i}, \quad i=1, \ldots, M} \\ {} & {u \succeq 0, \quad v \succeq 0}\end{array} }$

produces point on trade-off curve between inverse of margin $$2/\|a\|_2$$ and classification error, measured by total slack $$\mathbf{1}^{T} u+\mathbf{1}^{T} v$$

## Data Fitting

### Norm Approximation

$\displaylines{\operatorname{minimize} \|A x-b\| }$

where $$A \in \mathbf{R}^{m \times n}$$ with $$m\ge n$$

Linear measurement model: $$y = Ax + v$$, $$y$$ are measurements, $$x$$ is unknown, $$v$$ is measurement error. Given $$y=b$$, best guess of $$x$$ is $$x^\star$$

### Least-Norm Problems

$\displaylines{\begin{array}{ll}{\text { minimize }} & {\|x\|} \\ {\text { subject to }} & {A x=b}\end{array} }$

where $$A \in \mathbf{R}^{m \times n}$$ with $$m\le n$$

### Scalarized Problem

$\displaylines{\operatorname{minimize} \|A x-b\|+\gamma\|x\| }$

## Statistical Estimation

### Maximum Likelihood Estimation

$\displaylines{\operatorname{maximize}\ (\text{over } x ) \quad \log p_{x}(y) }$ With linear measurement model with IID noise: $$y_{i}=a_{i}^{T} x+v_{i}, \ i=1, \ldots, m$$, the estimation problem becomes $\displaylines{\operatorname{maximize}\ l(x) = \sum_{i=1}^{m} \log p\left(y_{i}-a_{i}^{T} x\right) }$ where $$y$$ is observed value, $$p$$ is the PDF of the measurement noise $$v$$

• Gaussian noise $$\mathcal{N}\left(0, \sigma^{2}\right) : p(z)=\left(2 \pi \sigma^{2}\right)^{-1 / 2} e^{-z^{2} /\left(2 \sigma^{2}\right)}$$ $\displaylines{l(x)=-\frac{m}{2} \log \left(2 \pi \sigma^{2}\right)-\frac{1}{2 \sigma^{2}} \sum_{i=1}^{m}\left(a_{i}^{T} x-y_{i}\right)^{2} }$

• Laplacian noise $$p(z)=(1 /(2 a)) e^{-|z| / a}$$ $\displaylines{l(x)=-m \log (2 a)-\frac{1}{a} \sum_{i=1}^{m}\left|a_{i}^{T} x-y_{i}\right| }$

• Uniform noise on $$[-a, a]$$ $\displaylines{l(x)=\left\{\begin{array}{ll}{-m \log (2 a)} & {\left|a_{i}^{T} x-y_{i}\right| \leq a, \quad i=1, \ldots, m} \\ {-\infty} & {\text { otherwise }}\end{array}\right. }$

### Logistic Regression

Random variable $$y\in \{0,1\}$$ with distribution $\displaylines{p=\operatorname{prob}(y=1)=\frac{\exp \left(a^{T} u+b\right)}{1+\exp \left(a^{T} u+b\right)} }$ log-likelihood function (for $$y_1 = \cdots = y_k = 1, \ y_{k+1} = \cdots = y_m = 0$$): \displaylines{\begin{aligned} l(a, b) &=\log \left(\prod_{i=1}^{k} \frac{\exp \left(a^{T} u_{i}+b\right)}{1+\exp \left(a^{T} u_{i}+b\right)} \prod_{i=k+1}^{m} \frac{1}{1+\exp \left(a^{T} u_{i}+b\right)}\right) \\ &=\sum_{i=1}^{k}\left(a^{T} u_{i}+b\right)-\sum_{i=1}^{m} \log \left(1+\exp \left(a^{T} u_{i}+b\right)\right) \end{aligned} } concave in $$a, b$$ WeChat Pay Alipay Venmo