Jenson’s Inequality.

琴生不等式(Jenson’s Inequality)由丹麦数学家Johan Jensen于$1906$年证明。该不等式描述了凸函数中的不等式关系,有着广泛的应用。

1. 两点式

设函数$f(x)$是区间$I$内的凸(convex)函数 (即$f’‘(x)>0$),则对$\forall x_1,x_2 \in I$及$\lambda \in (0,1)$,都有:

\[\lambda f(x_1) + (1-\lambda)f(x_2) \geq f(\lambda x_1 + (1-\lambda)x_2)\]

⚪ 证明:Taylor展开

根据Taylor展开

\[f(x) = f(x_0) + f'(x_0)(x-x_0) + \frac{f''(\xi)}{2}(x-x_0)^2\]

由于$f’‘(x)>0$,因此有:

\[f(x) \geq f(x_0) + f'(x_0)(x-x_0)\]

取$x_0=\lambda x_1 + (1-\lambda)x_2$,

\[f(x) \geq f(\lambda x_1 + (1-\lambda)x_2) + f'(\lambda x_1 + (1-\lambda)x_2)(x-\lambda x_1 - (1-\lambda)x_2)\]

分别令$x=x_1,x_2$得:

\[\begin{aligned} f(x_1) &\geq f(\lambda x_1 + (1-\lambda)x_2) + f'(\lambda x_1 + (1-\lambda)x_2)(1-\lambda)(x_1 - x_2) \\ f(x_2) &\geq f(\lambda x_1 + (1-\lambda)x_2) + f'(\lambda x_1 + (1-\lambda)x_2)\lambda(x_2 - x_1) \end{aligned}\]

取上述两式的凸组合则有:

\[\lambda f(x_1) + (1-\lambda)f(x_2) \geq f(\lambda x_1 + (1-\lambda)x_2)\]

2. 一般式

设函数$f(x)$是区间$I$内的凸(convex)函数 (即$f’‘(x)>0$),正实数$\lambda_1,\lambda_2,\cdots,\lambda_n$满足$\sum_{i=1}^{n}\lambda_i=1$,则对$\forall x_1,x_2, \cdots,x_n \in I$,都有:

\[\sum_{i=1}^{n}\lambda_i f(x_i) \geq f(\sum_{i=1}^{n}\lambda_i x_i)\]

⚪ 证明:数学归纳法

当$n=2$时显然成立(两点式)。假设对$n-1$的情况成立,下面证明对$n$的情况也成立。

设:

\[\mu = \lambda_2+\lambda_3+\cdots +\lambda_n > 0\]

则有:

\[\frac{\lambda_2}{\mu}+\frac{\lambda_3}{\mu}+\cdots +\frac{\lambda_n}{\mu} = 1\]

因此根据$n-1$情况下的琴生不等式,对$\forall x_2, \cdots,x_n \in I$有:

\[\frac{\lambda_2}{\mu}f(x_2)+\frac{\lambda_3}{\mu}f(x_3)+\cdots +\frac{\lambda_n}{\mu}f(x_n) \geq f(\frac{\lambda_2}{\mu}x_2+\frac{\lambda_3}{\mu}x_3+\cdots +\frac{\lambda_n}{\mu}x_n)\]

设:

\[\lambda_1 + \mu = 1\]

则对$x_1$及$\frac{\lambda_2}{\mu}x_2+\frac{\lambda_3}{\mu}x_3+\cdots +\frac{\lambda_n}{\mu}x_n$,根据$n=2$情况下的琴生不等式有:

\[\begin{aligned} \lambda_1f(x_1)+\mu f(\frac{\lambda_2}{\mu}x_2+\frac{\lambda_3}{\mu}x_3+\cdots +\frac{\lambda_n}{\mu}x_n) & \geq f(\lambda_1x_1+\mu (\frac{\lambda_2}{\mu}x_2+\frac{\lambda_3}{\mu}x_3+\cdots +\frac{\lambda_n}{\mu}x_n)) \\ & = f(\lambda_1x_1+\lambda_2x_2+\cdots \lambda_nx_n) \end{aligned}\]

结合上述两个不等式有:

\[\lambda_1f(x_1)+\mu (\frac{\lambda_2}{\mu}f(x_2)+\frac{\lambda_3}{\mu}f(x_3)+\cdots +\frac{\lambda_n}{\mu}f(x_n)) \geq f(\lambda_1x_1+\lambda_2x_2+\cdots \lambda_nx_n)\]

整理可得:

\[\lambda_1f(x_1)+\lambda_2f(x_2)+\cdots +\lambda_nf(x_n) \geq f(\lambda_1x_1+\lambda_2x_2+\cdots \lambda_nx_n)\]

3. 概率式

设函数$f(x)$是凸函数,$x$是随机变量,则有:

\[\Bbb{E}[f(x)] \geq f(\Bbb{E}[x])\]

⚪ 证明

构造函数$f(x)$在点$\Bbb{E}[x]$处的切线$l(x)=ax+b$,则有:

\[f(\Bbb{E}[x]) = l(\Bbb{E}[x])=a\Bbb{E}[x]+b\]

由于$f(x)$是凸函数,则有:

\[f(x) \geq l(x)\]

对上式取期望:

\[\Bbb{E}[f(x)] \geq \Bbb{E}[l(x)] = \Bbb{E}[ax+b] = a\Bbb{E}[x]+b = l(\Bbb{E}[x]) = f(\Bbb{E}[x])\]