Stochastic Gradient Methods with Layer-wise Adaptive Moments for Training of Deep Networks
NovoGrad: normalizes gradients using layer-wise adaptive second moments.
A copy-paste data augmentation method for instance segmentation.
LAMB: combines layer-wise adaptive learning rates with Adam.
Visualization methods of Convolutional Neural Networks.
LARS: layer-wise adaptive learning-rate scaling.
Lookahead: the fast weights are updated k times for every single update of the slow weights.
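
The Lookahead entry above can be illustrated with a minimal sketch. It assumes plain SGD as the inner (fast) optimizer and a toy quadratic objective; the names `lookahead_sgd`, `quadratic_grad`, `alpha`, and `outer_steps` are hypothetical and chosen only for this illustration, not taken from any library.

```python
def lookahead_sgd(grad_fn, w0, k=5, alpha=0.5, lr=0.1, outer_steps=20):
    """Sketch of Lookahead with plain SGD as the assumed inner optimizer."""
    slow = list(w0)
    for _ in range(outer_steps):
        # Fast weights start each round from the current slow weights.
        fast = list(slow)
        # Fast weights are updated k times.
        for _ in range(k):
            g = grad_fn(fast)
            fast = [w - lr * gi for w, gi in zip(fast, g)]
        # Slow weights are updated once: interpolate toward the fast weights.
        slow = [s + alpha * (f - s) for s, f in zip(slow, fast)]
    return slow


def quadratic_grad(w):
    # Gradient of the toy objective f(w) = sum(w_i ** 2) (an assumption).
    return [2.0 * wi for wi in w]


result = lookahead_sgd(quadratic_grad, [1.0, -2.0])
```

In this sketch `alpha` is the slow-weight step size: with `alpha=1.0` the slow weights simply copy the fast weights, and smaller values pull them only part of the way.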