SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers

SegFormer: a simple and efficient Transformer model designed for semantic segmentation.

TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation

TransUNet: using Transformers to build strong encoders for medical image segmentation.

Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

Using Transformers to rethink semantic segmentation from a sequence-to-sequence perspective.

CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification

CrossViT: a cross-attention, multi-scale vision Transformer for image classification.

Do We Really Need Explicit Position Encodings for Vision Transformers?

Do vision Transformers really need explicit position encodings?

Visual Transformers: Token-based Image Representation and Processing for Computer Vision

VT: token-based image representation and processing.