Jenson’s Inequality.
琴生不等式(Jenson’s Inequality)由丹麦数学家Johan Jensen于$1906$年证明。该不等式描述了凸函数中的不等式关系,有着广泛的应用。
1. 两点式
设函数$f(x)$是区间$I$内的凸(convex)函数 (即$f’‘(x)>0$),则对$\forall x_1,x_2 \in I$及$\lambda \in (0,1)$,都有:
\[\lambda f(x_1) + (1-\lambda)f(x_2) \geq f(\lambda x_1 + (1-\lambda)x_2)\]⚪ 证明:Taylor展开
根据Taylor展开:
\[f(x) = f(x_0) + f'(x_0)(x-x_0) + \frac{f''(\xi)}{2}(x-x_0)^2\]由于$f’‘(x)>0$,因此有:
\[f(x) \geq f(x_0) + f'(x_0)(x-x_0)\]取$x_0=\lambda x_1 + (1-\lambda)x_2$,
\[f(x) \geq f(\lambda x_1 + (1-\lambda)x_2) + f'(\lambda x_1 + (1-\lambda)x_2)(x-\lambda x_1 - (1-\lambda)x_2)\]分别令$x=x_1,x_2$得:
\[\begin{aligned} f(x_1) &\geq f(\lambda x_1 + (1-\lambda)x_2) + f'(\lambda x_1 + (1-\lambda)x_2)(1-\lambda)(x_1 - x_2) \\ f(x_2) &\geq f(\lambda x_1 + (1-\lambda)x_2) + f'(\lambda x_1 + (1-\lambda)x_2)\lambda(x_2 - x_1) \end{aligned}\]取上述两式的凸组合则有:
\[\lambda f(x_1) + (1-\lambda)f(x_2) \geq f(\lambda x_1 + (1-\lambda)x_2)\]2. 一般式
设函数$f(x)$是区间$I$内的凸(convex)函数 (即$f’‘(x)>0$),正实数$\lambda_1,\lambda_2,\cdots,\lambda_n$满足$\sum_{i=1}^{n}\lambda_i=1$,则对$\forall x_1,x_2, \cdots,x_n \in I$,都有:
\[\sum_{i=1}^{n}\lambda_i f(x_i) \geq f(\sum_{i=1}^{n}\lambda_i x_i)\]⚪ 证明:数学归纳法
当$n=2$时显然成立(两点式)。假设对$n-1$的情况成立,下面证明对$n$的情况也成立。
设:
\[\mu = \lambda_2+\lambda_3+\cdots +\lambda_n > 0\]则有:
\[\frac{\lambda_2}{\mu}+\frac{\lambda_3}{\mu}+\cdots +\frac{\lambda_n}{\mu} = 1\]因此根据$n-1$情况下的琴生不等式,对$\forall x_2, \cdots,x_n \in I$有:
\[\frac{\lambda_2}{\mu}f(x_2)+\frac{\lambda_3}{\mu}f(x_3)+\cdots +\frac{\lambda_n}{\mu}f(x_n) \geq f(\frac{\lambda_2}{\mu}x_2+\frac{\lambda_3}{\mu}x_3+\cdots +\frac{\lambda_n}{\mu}x_n)\]设:
\[\lambda_1 + \mu = 1\]则对$x_1$及$\frac{\lambda_2}{\mu}x_2+\frac{\lambda_3}{\mu}x_3+\cdots +\frac{\lambda_n}{\mu}x_n$,根据$n=2$情况下的琴生不等式有:
\[\begin{aligned} \lambda_1f(x_1)+\mu f(\frac{\lambda_2}{\mu}x_2+\frac{\lambda_3}{\mu}x_3+\cdots +\frac{\lambda_n}{\mu}x_n) & \geq f(\lambda_1x_1+\mu (\frac{\lambda_2}{\mu}x_2+\frac{\lambda_3}{\mu}x_3+\cdots +\frac{\lambda_n}{\mu}x_n)) \\ & = f(\lambda_1x_1+\lambda_2x_2+\cdots \lambda_nx_n) \end{aligned}\]结合上述两个不等式有:
\[\lambda_1f(x_1)+\mu (\frac{\lambda_2}{\mu}f(x_2)+\frac{\lambda_3}{\mu}f(x_3)+\cdots +\frac{\lambda_n}{\mu}f(x_n)) \geq f(\lambda_1x_1+\lambda_2x_2+\cdots \lambda_nx_n)\]整理可得:
\[\lambda_1f(x_1)+\lambda_2f(x_2)+\cdots +\lambda_nf(x_n) \geq f(\lambda_1x_1+\lambda_2x_2+\cdots \lambda_nx_n)\]3. 概率式
设函数$f(x)$是凸函数,$x$是随机变量,则有:
\[\Bbb{E}[f(x)] \geq f(\Bbb{E}[x])\]⚪ 证明
构造函数$f(x)$在点$\Bbb{E}[x]$处的切线$l(x)=ax+b$,则有:
\[f(\Bbb{E}[x]) = l(\Bbb{E}[x])=a\Bbb{E}[x]+b\]由于$f(x)$是凸函数,则有:
\[f(x) \geq l(x)\]对上式取期望:
\[\Bbb{E}[f(x)] \geq \Bbb{E}[l(x)] = \Bbb{E}[ax+b] = a\Bbb{E}[x]+b = l(\Bbb{E}[x]) = f(\Bbb{E}[x])\]