DETRs with Collaborative Hybrid Assignments Training

Link
Abstract

In this paper, we provide the observation that too few queries assigned as positive samples in DETR with one-to-one set matching leads to sparse supervision on the encoder's output, which considerably hurts the discriminative feature learning of the encoder, and vice versa for attention learning in the decoder. To alleviate this, we present a novel collaborative hybrid assignments training scheme, namely $\mathcal{C}$o-DETR, to learn more efficient and effective DETR-based detectors from versatile label assignment manners. This new training scheme can easily enhance the encoder's learning ability in end-to-end detectors by training multiple parallel auxiliary heads supervised by one-to-many label assignments such as ATSS and Faster RCNN. In addition, we construct extra customized positive queries by extracting the positive coordinates from these auxiliary heads to improve the training efficiency of positive samples in the decoder. In inference, these auxiliary heads are discarded and thus our method introduces no additional parameters and computational cost to the original detector while requiring no hand-crafted non-maximum suppression (NMS). We conduct extensive experiments to evaluate the effectiveness of the proposed approach on DETR variants, including DAB-DETR, Deformable-DETR, and DINO-Deformable-DETR. The state-of-the-art DINO-Deformable-DETR with Swin-L can be improved from 58.5% to 59.5% AP on COCO val. Surprisingly, incorporated with a ViT-L backbone, we achieve 66.0% AP on COCO test-dev and 67.9% AP on LVIS val, outperforming previous methods by clear margins with much smaller model sizes. Code is available at https://github.com/Sense-X/Co-DETR.

Synth

Problem:: One-to-one (O2O) matching degrades the feature quality of the encoder output / O2O matching degrades attention learning in the decoder

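Solution:: Train the encoder with auxiliary heads that use one-to-many (O2M) matching (ATSS, Faster RCNN, etc.) / Train the decoder with customized queries generated from the coordinates predicted by the auxiliary heads (approximating O2M with multiple O2O assignments)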

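Novelty:: Introduces a discriminability score to verify both the encoder/decoder degradation and the effect of the proposed method / Claims that, unlike similar work (Group-DETR, H-DETR), it applies the benefits of O2M better by using the heads that O2M detectors actually use / Shows that too many auxiliary heads actually hurt performance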

Note::
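
The two ideas in the Solution field can be illustrated with a toy sketch. It is not the Co-DETR implementation: the helper names (`iou`, `one_to_many_positives`, `customized_positive_queries`) and the fixed IoU threshold of 0.5 are assumptions made for illustration, loosely modeled on a Faster RCNN-style assigner. It only shows that an O2M rule yields several positives per ground-truth box, and that each positive can be reused as a decoder query whose target is already known, so no extra bipartite matching is needed for those queries.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def one_to_many_positives(
    candidates: List[Box], gts: List[Box], iou_thr: float = 0.5
) -> List[Tuple[Box, int]]:
    """Hypothetical O2M assigner: every candidate whose best IoU with a
    ground-truth box reaches iou_thr becomes a positive for that box."""
    positives = []
    for box in candidates:
        ious = [iou(box, gt) for gt in gts]
        best = max(range(len(gts)), key=lambda j: ious[j])
        if ious[best] >= iou_thr:
            positives.append((box, best))
    return positives


def customized_positive_queries(
    positives: List[Tuple[Box, int]], gts: List[Box]
) -> List[Dict]:
    """Reuse each auxiliary-head positive as a decoder query. The target is
    inherited from the O2M assignment, so these queries need no matching."""
    return [
        {"query_box": box, "target_index": j, "target_box": gts[j]}
        for box, j in positives
    ]


gts = [(0.0, 0.0, 10.0, 10.0)]
candidates = [
    (0.0, 0.0, 10.0, 10.0),    # exact overlap
    (1.0, 1.0, 10.0, 10.0),    # slightly shifted
    (0.0, 0.0, 9.0, 10.0),     # slightly narrower
    (20.0, 20.0, 30.0, 30.0),  # background
]
positives = one_to_many_positives(candidates, gts)
queries = customized_positive_queries(positives, gts)
print(len(positives), "positives for", len(gts), "ground-truth box")
```

With O2O matching the single ground-truth box above would supervise exactly one query; the O2M rule produces several positives for it, and each one becomes an additional positive query for the decoder during training only.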

Summary

Motivation

Method

file-20250324142017718.png

Collaborative Hybrid Assignments Training

Customized Positive Queries Generation

Method Validation