Understanding disentangling in β-VAE

Explains β-VAE's ability to learn disentangled representations through the information bottleneck.

β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework

β-VAE: learning disentangled representations in the latent space of a variational autoencoder.

Attentional Feature Fusion

AFF: attention-based fusion of feature channels.

Memory-Efficient Adaptive Optimization

SM3: a memory-efficient adaptive optimization algorithm.

Averaging Weights Leads to Wider Optima and Better Generalization

SWA: finding wider minima through stochastic weight averaging.

Decoupled Weight Decay Regularization

AdamW: decoupling weight decay regularization from the gradient-based update.
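
To make the "decoupling" in the last entry concrete, here is a minimal scalar sketch of one Adam step with the decay applied in two ways. The function name `adam_step`, the `decoupled` flag, and the default hyperparameters are illustrative assumptions, not taken from the paper or from any library. Scaling the decay by the learning rate follows common implementations and is also an assumption here.

```python
import math

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=0.0, decoupled=True):
    """One scalar Adam update (hypothetical helper); returns (new_w, new_m, new_v)."""
    if not decoupled:
        # L2 regularization: the decay term is folded into the gradient,
        # so it is rescaled by the adaptive denominator below.
        grad = grad + weight_decay * w
    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction for the zero-initialized averages (t starts at 1).
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    w_new = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    if decoupled:
        # Decoupled decay: shrink the weight directly, outside the adaptive scaling.
        w_new = w_new - lr * weight_decay * w
    return w_new, m, v

# With a zero loss gradient, only the decay term moves the weight.
w_decoupled, _, _ = adam_step(1.0, 0.0, 0.0, 0.0, t=1, weight_decay=0.1, decoupled=True)
w_l2, _, _ = adam_step(1.0, 0.0, 0.0, 0.0, t=1, weight_decay=0.1, decoupled=False)
```

In this toy case the decoupled step shrinks the weight in proportion to `weight_decay`, while the L2 variant moves it by roughly `lr` whatever the decay strength is, because the adaptive normalization cancels the magnitude of the decay term.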