Fisher GAN:使用Fisher差异构造生成对抗网络.

本文作者通过积分概率度量(integral probability metrics, IPM)构造了真实数据分布$P_{data}(x)$和生成数据分布$P_G(x)$之间的Fisher差异,并进一步设计了Fisher GAN

1. Fisher差异

积分概率度量寻找满足某种限制条件的函数集合\(\mathcal{F}\)中的连续函数$f(\cdot)$,使得该函数能够提供足够多的关于矩的信息;然后寻找一个最优的\(f(x)\in \mathcal{F}\)使得两个概率分布$p(x)$和$q(x)$之间的差异最大,该最大差异即为两个分布之间的距离:

\[d_{\mathcal{F}}(p(x),q(x)) = \mathop{\sup}_{f(x)\in \mathcal{F}} \Bbb{E}_{x \text{~} p(x)}[f(x)]-\Bbb{E}_{x \text{~} q(x)}[f(x)]\]

Fisher差异(Fisher discrepancy)是一种受线性判别分析启发的IPM距离。

线性判别分析是指把样本集$x$进行投影$w^Tx$,从而让相同类别的样本投影尽可能接近(即类内方差$w^T(\Sigma_0+\Sigma_1)w$尽可能小)、不同类别的样本投影尽可能远离(即类间距离$w^T(\mu_0-\mu_1)(\mu_0-\mu_1)^Tw$尽可能大)。

类似地,Fisher差异不仅最大化两种概率分布的均值差异,还约束了概率分布的二阶矩。定义函数空间\(\mathcal{F}\)为如下形式:

\[\begin{aligned} \mathcal{F} = \{ f(x) : \frac{1}{2}(\Bbb{E}_{x \text{~} p(x)}[f^2(x)]+\Bbb{E}_{x \text{~} q(x)}[f^2(x)])=1 \} \end{aligned}\]

则对应的IPM距离为:

\[\begin{aligned} d_{\mathcal{F}}(p(x),q(x)) &= \mathop{\sup}_{f \in \mathcal{F}} \Bbb{E}_{x \text{~} p(x)}[f(x)]-\Bbb{E}_{x \text{~} q(x)}[f(x)] \\ &= \mathop{\sup}_{f,\frac{1}{2}(\Bbb{E}_{x \text{~} p(x)}[f^2(x)]+\Bbb{E}_{x \text{~} q(x)}[f^2(x)])=1} \Bbb{E}_{x \text{~} p(x)}[f(x)]-\Bbb{E}_{x \text{~} q(x)}[f(x)] \\ &= \mathop{\sup}_{f} \frac{\Bbb{E}_{x \text{~} p(x)}[f(x)]-\Bbb{E}_{x \text{~} q(x)}[f(x)]}{\sqrt{\frac{1}{2}(\Bbb{E}_{x \text{~} p(x)}[f^2(x)]+\Bbb{E}_{x \text{~} q(x)}[f^2(x)])}} \end{aligned}\]

2. Fisher GAN

Fisher GAN使用Fisher差异衡量真实数据分布$P_{data}(x)$和生成数据分布$P_G(x)$之间的距离,目标函数如下:

\[\begin{aligned} \mathop{ \min}_{G} \mathop{ \max}_{D} \frac{\Bbb{E}_{x \text{~} P_{data}(x)}[D(x)]-\Bbb{E}_{x \text{~} P_G(x)}[D(x)]}{\sqrt{\frac{1}{2}(\Bbb{E}_{x \text{~} P_{data}(x)}[D^2(x)]+\Bbb{E}_{x \text{~} P_G(x)}[D^2(x)])}} \end{aligned}\]