Dynamic Task Prioritization for Multitask Learning

Dynamic task prioritization in multitask learning.

Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer

Building very large-scale neural networks with a sparsely gated mixture-of-experts system.

Region-based Non-local Operation for Video Classification

A region-based non-local network designed for video classification.

Exploring Self-attention for Image Recognition

Exploring self-attention mechanisms for image recognition.

Image Super-Resolution with Non-Local Sparse Attention

Achieving image super-resolution through non-local sparse attention.

Polarized Self-Attention: Towards High-quality Pixel-wise Regression

Polarized self-attention: aimed at high-quality pixel-level regression.