DEIM: DETR with Improved Matching for Fast Convergence

Link
Abstract

We introduce DEIM, an innovative and efficient training framework designed to accelerate convergence in real-time object detection with Transformer-based architectures (DETR). To mitigate the sparse supervision inherent in one-to-one (O2O) matching in DETR models, DEIM employs a Dense O2O matching strategy. This approach increases the number of positive samples per image by incorporating additional targets, using standard data augmentation techniques. While Dense O2O matching speeds up convergence, it also introduces numerous low-quality matches that could affect performance. To address this, we propose the Matchability-Aware Loss (MAL), a novel loss function that optimizes matches across various quality levels, enhancing the effectiveness of Dense O2O. Extensive experiments on the COCO dataset validate the efficacy of DEIM. When integrated with RT-DETR and D-FINE, it consistently boosts performance while reducing training time by 50%. Notably, paired with RT-DETRv2, DEIM achieves 53.2% AP in a single day of training on an NVIDIA 4090 GPU. Additionally, DEIM-trained real-time models outperform leading real-time object detectors, with DEIM-D-FINE-L and DEIM-D-FINE-X achieving 54.7% and 56.5% AP at 124 and 78 FPS on an NVIDIA T4 GPU, respectively, without the need for additional data. We believe DEIM sets a new baseline for advancements in real-time object detection. Our code and pre-trained models are available at https://github.com/ShihuaHuang95/DEIM.

Synth

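Problem:: Slow convergence and degraded performance in DETR, caused by the sparse supervision of one-to-one matching and by low-quality matches

Solution:: Dense O2O matching strategy / Matchability-Aware Loss (MAL)

Novelty:: A simple and effective approach that increases the number of targets using standard data augmentation techniques / a new loss function that adjusts the loss value according to match quality

Note:: An efficient method that cuts training time by 50% while improving performance, without additional decoders or architectural changes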

Summary

Motivation

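$$
\mathrm{VFL}(p, q, y) =
\begin{cases}
-q\,\bigl(q \log(p) + (1 - q)\log(1 - p)\bigr) & \text{if } q > 0 \\
-\alpha\, p^{\gamma} \log(1 - p) & \text{if } q = 0
\end{cases}
$$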
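  1. Low-Quality Matches:
    • For boxes with low IoU, the loss value is very small → predictions for low-quality boxes are not improved
  2. Handling of Negative Samples:
    • Boxes with no overlap at all (q=0) are always treated as negative samples → more negative samples → the number of queries is already small → fewer positive samples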
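The first point can be checked numerically. The following is a minimal sketch of the piecewise VFL definition above, written with the conventional leading minus signs so that the loss is non-negative. The function name `varifocal_loss` and the default values `alpha=0.75` and `gamma=2.0` are assumptions for illustration and are not taken from this note.

```python
import math


def varifocal_loss(p: float, q: float, alpha: float = 0.75, gamma: float = 2.0) -> float:
    """Per-sample VFL for predicted score p and IoU target q.

    alpha and gamma defaults are assumed values, not taken from the note.
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # keep log() finite
    if q > 0:
        # Positive branch: the whole term is scaled by q, so a small IoU
        # shrinks the loss regardless of how wrong the score is.
        return -q * (q * math.log(p) + (1.0 - q) * math.log(1.0 - p))
    # Negative branch (q = 0): focal-style down-weighting by p ** gamma.
    return -alpha * (p ** gamma) * math.log(1.0 - p)


# Same predicted score, different match quality.
low_quality = varifocal_loss(p=0.5, q=0.05)
high_quality = varifocal_loss(p=0.5, q=0.9)
print(f"low-IoU match:  {low_quality:.4f}")
print(f"high-IoU match: {high_quality:.4f}")
```

At `p = 0.5` both logarithms are equal, so the positive branch reduces to `q * log(2)`: the loss for the low-IoU match is smaller than the high-IoU one by exactly the ratio of the two IoU values.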

Method

Dense O2O

file-20250321004800589.png
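As a rough illustration of how augmentation can raise the number of targets per training image, the sketch below merges the boxes of four images into one 2x2 mosaic. This is a simplified stand-in: the note only says that standard data augmentation is used, so the choice of mosaic, the fixed grid layout (no random crop or jitter), and the helper name `mosaic_boxes` are all assumptions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def mosaic_boxes(per_image_boxes: List[List[Box]], tile: float) -> List[Box]:
    """Merge boxes from four tile x tile images into one tile x tile mosaic.

    Hypothetical helper: images are placed in a fixed 2x2 grid on a
    (2 * tile) canvas, which is then scaled back down by a factor of 2.
    """
    if len(per_image_boxes) != 4:
        raise ValueError("a 2x2 mosaic needs exactly four images")
    offsets = [(0.0, 0.0), (tile, 0.0), (0.0, tile), (tile, tile)]
    merged: List[Box] = []
    for boxes, (dx, dy) in zip(per_image_boxes, offsets):
        for x1, y1, x2, y2 in boxes:
            # Shift into the grid cell, then halve to undo the 2x canvas.
            merged.append(((x1 + dx) / 2, (y1 + dy) / 2, (x2 + dx) / 2, (y2 + dy) / 2))
    return merged


# Four images with one target each become one image with four targets.
single_target_images = [[(0.0, 0.0, 100.0, 100.0)] for _ in range(4)]
dense_targets = mosaic_boxes(single_target_images, tile=640.0)
print(len(dense_targets), "targets in the mosaic image")
```

Under one-to-one matching each target is assigned one positive query, so the mosaic image supplies four positives per step where each original image supplied one.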

Matchability Aware Loss

file-20250321010703980.png|850

A more direct visualization
file-20250321011102425.png|500
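The MAL formula itself appears only in the figures above and is not written out in this note. The sketch below therefore assumes one common reading of it: binary cross-entropy against the soft target `q ** gamma` for matched queries (`y = 1`), and a `p ** gamma`-weighted term for unmatched ones (`y = 0`). The function name `matchability_aware_loss` and the default `gamma=1.5` are likewise assumptions and should be checked against the paper and its released code.

```python
import math


def matchability_aware_loss(p: float, q: float, y: int, gamma: float = 1.5) -> float:
    """Per-sample MAL under the assumed form described in the lead-in.

    p: predicted score, q: IoU of the matched box, y: 1 if matched else 0.
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # keep log() finite
    if y == 1:
        # Soft target q ** gamma; no outer factor of q, so low-IoU
        # matches are not scaled toward zero.
        soft_target = q ** gamma
        return -(soft_target * math.log(p) + (1.0 - soft_target) * math.log(1.0 - p))
    # Unmatched query: confident false positives are penalized more.
    return -(p ** gamma) * math.log(1.0 - p)


# An over-confident score on a low-IoU match.
mal_value = matchability_aware_loss(p=0.9, q=0.05, y=1)
# VFL positive branch for the same (p, q), for comparison.
vfl_value = -0.05 * (0.05 * math.log(0.9) + 0.95 * math.log(0.1))
print(f"MAL: {mal_value:.4f}  VFL: {vfl_value:.4f}")
```

Under this assumed form, a confident prediction on a low-IoU match receives a much larger loss than under VFL, which addresses the first issue listed in the Motivation section.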

Method Validation

Key Insights