ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks

VisualBERT: A Simple and Performant Baseline for Vision and Language

LXMERT: Learning Cross-Modality Encoder Representations from Transformers

Vision-Language Pretraining

Analyzing and Improving the Training Dynamics of Diffusion Models

(Heilongjiang Chapter) Harbin: Snow-Built Jade Towers in the Ice City, Harbin's Warmth on the Rippling Songhua River