Adam: A Method for Stochastic Optimization

Adam: adaptive moment estimation.
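
Adam maintains exponential moving averages of the gradient and its square, with bias correction for their zero initialization. A minimal NumPy sketch of one update step; the function name, hyperparameter defaults, and the toy quadratic are illustrative, not from the paper:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    # First moment: moving average of gradients.
    m = beta1 * m + (1 - beta1) * grad
    # Second moment: moving average of squared gradients.
    v = beta2 * v + (1 - beta2) * grad**2
    # Bias correction compensates for m and v starting at zero.
    m_hat = m / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy problem (illustrative): minimize f(x) = x^2 starting from x = 5.
theta = np.array([5.0])
m, v = np.zeros_like(theta), np.zeros_like(theta)
for t in range(1, 3001):
    grad = 2 * theta
    theta, m, v = adam_step(theta, grad, m, v, t)
```

Note the step size is effectively bounded by `lr` regardless of gradient scale, which is what makes Adam robust to poorly scaled objectives.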

On the importance of initialization and momentum in deep learning

Nesterov momentum: a momentum-based gradient update rule.
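
The distinguishing feature of Nesterov momentum is that the gradient is evaluated at the look-ahead point `theta + mu * vel` rather than at the current iterate. A small NumPy sketch, with a toy quadratic as an illustrative assumption:

```python
import numpy as np

def nesterov_step(theta, vel, grad_fn, lr=0.05, mu=0.9):
    # Gradient at the look-ahead point theta + mu * vel is what
    # distinguishes Nesterov momentum from classical momentum.
    vel = mu * vel - lr * grad_fn(theta + mu * vel)
    return theta + vel, vel

# Toy problem (illustrative): minimize f(x) = x^2 starting from x = 5.
grad_fn = lambda x: 2 * x
theta = np.array([5.0])
vel = np.zeros_like(theta)
for _ in range(300):
    theta, vel = nesterov_step(theta, vel, grad_fn)
```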

A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm

RProp: a direct adaptive method for faster backpropagation learning.
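
RProp ignores gradient magnitudes entirely: each weight keeps its own step size, grown when the gradient sign is stable and shrunk after a sign flip. A simplified sign-based sketch in the spirit of the algorithm (an iRprop-style variant that skips the update after a sign flip); the constants and toy quadratic are illustrative:

```python
import numpy as np

def rprop_step(theta, grad, prev_grad, step,
               eta_plus=1.2, eta_minus=0.5, step_min=1e-6, step_max=1.0):
    agree = grad * prev_grad
    # Same gradient sign as last step: grow the per-weight step size.
    step = np.where(agree > 0, np.minimum(step * eta_plus, step_max), step)
    # Sign flip (overshot a minimum): shrink the step and skip this update.
    step = np.where(agree < 0, np.maximum(step * eta_minus, step_min), step)
    grad = np.where(agree < 0, 0.0, grad)
    # Only the sign of the gradient is used, never its magnitude.
    theta = theta - np.sign(grad) * step
    return theta, grad, step

# Toy problem (illustrative): minimize f(x) = x^2 starting from x = 5.
theta = np.array([5.0])
prev_grad = np.zeros_like(theta)
step = np.full_like(theta, 0.1)
for _ in range(200):
    grad = 2 * theta
    theta, prev_grad, step = rprop_step(theta, grad, prev_grad, step)
```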

ADADELTA: An Adaptive Learning Rate Method

Adadelta: an adaptive learning-rate method.
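
Adadelta removes the global learning rate by scaling each step with the ratio of the RMS of recent updates to the RMS of recent gradients. A minimal NumPy sketch under that reading of the method; variable names and the toy quadratic are illustrative:

```python
import numpy as np

def adadelta_step(theta, grad, avg_g2, avg_dx2, rho=0.95, eps=1e-6):
    # Running average of squared gradients (denominator).
    avg_g2 = rho * avg_g2 + (1 - rho) * grad**2
    # Step = RMS(past updates) / RMS(past gradients) * grad, so no
    # global learning rate is needed and units match the parameters.
    delta = -np.sqrt(avg_dx2 + eps) / np.sqrt(avg_g2 + eps) * grad
    # Running average of squared updates (numerator for the next step).
    avg_dx2 = rho * avg_dx2 + (1 - rho) * delta**2
    return theta + delta, avg_g2, avg_dx2

# Toy problem (illustrative): minimize f(x) = x^2 starting from x = 5.
theta = np.array([5.0])
avg_g2 = np.zeros_like(theta)
avg_dx2 = np.zeros_like(theta)
for _ in range(5000):
    grad = 2 * theta
    theta, avg_g2, avg_dx2 = adadelta_step(theta, grad, avg_g2, avg_dx2)
```

The `eps` term both avoids division by zero and bootstraps the first steps, when the update average is still zero.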

Don’t Decay the Learning Rate, Increase the Batch Size

Increasing the batch size as an alternative to decaying the learning rate.

InfoVAE: Balancing Learning and Inference in Variational Autoencoders

InfoVAE: balancing learning and inference in variational autoencoders.