Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation

A copy-paste data augmentation method for instance segmentation.

Large Batch Optimization for Deep Learning: Training BERT in 76 minutes

LAMB: combines layer-wise adaptive learning rates with Adam.

Visualization of Convolutional Neural Networks

Visualization methods of Convolutional Neural Networks.

Large Batch Training of Convolutional Networks

LARS: layer-wise adaptive rate scaling.
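
As a rough illustration of the layer-wise scaling idea, the sketch below computes one local learning rate per layer from the ratio of the weight norm to the gradient norm. It is a simplified, momentum-free version; the function name `lars_step` and the default values for `lr`, `trust`, and `weight_decay` are assumptions made for this example, not settings taken from the paper.

```python
import math

def lars_step(layers, grads, lr=0.1, trust=0.001, weight_decay=0.0):
    """One momentum-free LARS-style update (illustrative sketch).

    layers, grads: lists of flat weight / gradient lists, one per layer.
    """
    new_layers = []
    for w, g in zip(layers, grads):
        w_norm = math.sqrt(sum(x * x for x in w))
        g_norm = math.sqrt(sum(x * x for x in g))
        denom = g_norm + weight_decay * w_norm
        # Local rate: trust coefficient times ||w|| / (||g|| + wd * ||w||).
        # Fall back to 1.0 when either norm is zero (assumed convention).
        local_lr = trust * w_norm / denom if w_norm > 0 and denom > 0 else 1.0
        new_layers.append(
            [x - lr * local_lr * (gi + weight_decay * x) for x, gi in zip(w, g)]
        )
    return new_layers
```

With `weight_decay=0`, multiplying a layer's gradient by a constant leaves the update unchanged, because the local rate shrinks by the same factor. This is the property that keeps the step size proportional to the weight norm for every layer.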

Lookahead Optimizer: k steps forward, 1 step back

Lookahead: the fast weights are updated k times, then the slow weights are updated once.
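
The fast/slow weight scheme can be sketched as follows, using plain gradient descent as the inner optimizer. The function name `lookahead_sgd`, the gradient callback `grad_fn`, and the default values of `k`, `alpha`, `lr`, and `outer_steps` are assumptions chosen for illustration.

```python
def lookahead_sgd(grad_fn, w0, k=5, alpha=0.5, lr=0.1, outer_steps=50):
    """Lookahead wrapped around plain gradient descent (illustrative sketch).

    grad_fn: maps a list of weights to a list of gradients.
    """
    slow = list(w0)
    for _ in range(outer_steps):
        # Fast weights start from the current slow weights.
        fast = list(slow)
        # k steps forward with the inner optimizer.
        for _ in range(k):
            g = grad_fn(fast)
            fast = [w - lr * gi for w, gi in zip(fast, g)]
        # 1 step back: move the slow weights toward the fast weights.
        slow = [s + alpha * (f - s) for s, f in zip(slow, fast)]
    return slow


# Usage: minimize sum((w - 3)^2), whose gradient is 2 * (w - 3).
weights = lookahead_sgd(lambda w: [2.0 * (x - 3.0) for x in w], [0.0, 10.0])
```

Here `alpha` controls how far the slow weights move toward the fast weights after each inner loop. Setting `alpha=1` reduces the scheme to the inner optimizer alone.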

On the Variance of the Adaptive Learning Rate and Beyond

RAdam: rectifies the early-stage variance of the adaptive learning rate in Adam.