Training data-efficient image transformers & distillation through attention

DeiT: training data-efficient vision Transformers through attention-based distillation.

Better plain ViT baselines for ImageNet-1k

Better training of vision Transformers on the ImageNet-1k dataset.

Vision Transformer (ViT)

Vision Transformer.

Scalable Diffusion Models with Transformers

Scalable diffusion models built with Transformers.

Position Prediction as an Effective Pretraining Strategy

Position prediction as an effective pretraining strategy.

Categorical Boosting (CatBoost)

CatBoost: unbiased boosting with categorical features.