Generative Pretraining from Pixels

iGPT: generative pretraining on images at the pixel level.

Do We Need Zero Training Loss After Achieving Zero Training Error?

Flooding: keeps the training loss from collapsing to zero.
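The flooding trick from this paper replaces the loss L with |L - b| + b for a chosen flood level b, so gradient descent ascends whenever the loss dips below b. A minimal sketch (the function name and the example flood level 0.1 are illustrative, not from the paper):

```python
def flooding_loss(loss: float, b: float = 0.1) -> float:
    # Flooding: when loss > b this is just the loss (normal descent);
    # when loss < b the gradient flips sign, pushing the loss back up
    # toward the flood level b instead of toward zero.
    return abs(loss - b) + b
```

In a framework like PyTorch the same one-liner is applied to the scalar loss tensor before calling backward; the gradient magnitude is unchanged, only its sign flips below the flood level.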

REALM: Retrieval-Augmented Language Model Pre-Training

REALM: augmenting language-model pre-training with a learned retriever.

OneNet: Towards End-to-End One-Stage Object Detection

OneNet: an NMS-free, one-stage, end-to-end object detection method.

Implicit Gradient Regularization

Implicit gradient regularization.
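The paper's backward-error-analysis argument is that discrete gradient descent with step size $h$ does not follow the gradient flow of the loss $L$ itself, but (to leading order in $h$) of a modified loss carrying an implicit penalty on the gradient norm. A sketch of that modified loss, using the paper's notation as I recall it:

```latex
\tilde{L}(\theta) \;=\; L(\theta) \;+\; \frac{h}{4}\,\bigl\lVert \nabla_{\theta} L(\theta) \bigr\rVert^{2}
```

Larger step sizes thus implicitly bias training toward flatter regions where the gradient norm is small.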

Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour

The linear scaling rule and gradual warmup for large-batch distributed training.
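The paper's two ingredients compose naturally: scale the learning rate linearly with the minibatch size, and ramp up to that scaled rate over the first few epochs to avoid early instability. A minimal sketch (function names, and the 256-image reference batch from the paper's ImageNet setup, are illustrative):

```python
def scaled_lr(base_lr: float, base_batch: int, batch: int) -> float:
    # Linear scaling rule: multiply the lr by k when the
    # minibatch size grows by a factor of k.
    return base_lr * batch / base_batch

def warmup_lr(step: int, warmup_steps: int, target_lr: float) -> float:
    # Gradual warmup: ramp the lr linearly from near zero to the
    # target over the first warmup_steps, then hold it constant.
    if step < warmup_steps:
        return target_lr * (step + 1) / warmup_steps
    return target_lr
```

For example, going from a batch of 256 at lr 0.1 to a batch of 8192 gives a target lr of 3.2, reached gradually over the warmup period.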