Open-vocabulary Object Detection via Vision and Language Knowledge Distillation

Achieves open-vocabulary object detection through vision and language knowledge distillation.

Open-Vocabulary Object Detection Using Captions

Performs open-vocabulary object detection using image captions.

MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding

MDETR: modulated detection for end-to-end multimodal understanding.

Towards Open-Set Object Detection and Discovery

Works toward open-set object detection and discovery.

Grounded Language-Image Pre-training

Grounded language-image pre-training (GLIP).

Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection

Grounding DINO: combines DINO with GLIP for open-set object detection.