Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment

Link
Abstract

Detection transformer (DETR) relies on one-to-one assignment, assigning one ground-truth object to one prediction, for end-to-end detection without NMS post-processing. It is known that one-to-many assignment, assigning one ground-truth object to multiple predictions, succeeds in detection methods such as Faster R-CNN and FCOS. However, the naive one-to-many assignment does not work for DETR, and it remains challenging to apply one-to-many assignment to DETR training. In this paper, we introduce Group DETR, a simple yet efficient DETR training approach that introduces a group-wise way for one-to-many assignment. This approach involves using multiple groups of object queries, conducting one-to-one assignment within each group, and performing decoder self-attention separately. It resembles data augmentation with automatically-learned object query augmentation. It is also equivalent to simultaneously training parameter-sharing networks of the same architecture, introducing more supervision and thus improving DETR training. The inference process is the same as for a normally trained DETR: it needs only one group of queries and no architecture modification. Group DETR is versatile and applicable to various DETR variants. The experiments show that Group DETR significantly speeds up training convergence and improves the performance of various DETR-based models. Code will be available at \url{https://github.com/Atten4Vis/GroupDETR}.
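
A minimal sketch of the training idea described in the abstract, assuming a generic DETR-style decoder; `decoder`, `hungarian_match`, and `criterion` are hypothetical placeholders rather than the authors' implementation. The loop form below makes the "parameter-sharing networks" equivalence explicit: every group reuses the same decoder, matching is one-to-one within each group, and inference uses only one group.

```python
def group_detr_training_loss(decoder, query_groups, memory, targets,
                             hungarian_match, criterion):
    """Sketch: K groups of object queries share the same decoder.
    One-to-one (Hungarian) assignment is done inside each group, so every
    ground-truth object receives K positive predictions in total."""
    total_loss = 0.0
    for group_queries in query_groups:
        # Same (parameter-sharing) decoder for every group; self-attention
        # only sees queries of the current group, as in the paper.
        predictions = decoder(group_queries, memory)
        # Per-group one-to-one matching and the usual DETR losses.
        indices = hungarian_match(predictions, targets)
        total_loss = total_loss + criterion(predictions, targets, indices)
    return total_loss


def group_detr_inference(decoder, query_groups, memory):
    """Inference is unchanged from standard DETR: only one query group is
    used and no architectural modification is needed."""
    return decoder(query_groups[0], memory)
```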

Synth

Problem:: Slow training convergence of DETR models; the one-to-many assignment scheme is hard to apply to DETR

Solution:: Proposes Group DETR, a group-wise one-to-many assignment scheme

Novelty:: Separate self-attention per query group and training with a parameter-sharing decoder (see the mask sketch after these notes)

Note:: An easy, simple idea that is readily applicable to various methods. Need to verify whether the parameter-shared decoder scheme was first introduced here.
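
A minimal sketch of how the per-group self-attention in the Novelty item could be implemented with a block-diagonal mask in a single decoder pass (the function name and shapes are illustrative, not taken from the released code); it uses PyTorch's boolean `attn_mask` convention, where `True` blocks attention.

```python
import torch

def group_self_attention_mask(num_groups: int, queries_per_group: int) -> torch.Tensor:
    """Boolean mask of shape (K*N, K*N); True means 'may not attend', so each
    query only attends to queries belonging to its own group."""
    total = num_groups * queries_per_group
    group_id = torch.arange(total) // queries_per_group
    return group_id.unsqueeze(0) != group_id.unsqueeze(1)

# Example: 3 groups of 2 queries -> a 6x6 mask whose 2x2 diagonal blocks are False.
mask = group_self_attention_mask(num_groups=3, queries_per_group=2)
# The mask can be passed as `attn_mask` to the decoder's self-attention
# (e.g. torch.nn.MultiheadAttention), keeping the groups independent
# while the decoder parameters stay shared across groups.
```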

Summary

Motivation

Method

file-20250324004233016.png

(a) Group DETR, (b) group-wise one-to-many assignment, (c) naive one-to-many assignment

Why is it Effective?

Method Validation

Differences from DN-DETR