https://arxiv.org/abs/2409.04431 Theory, Analysis, and Best Practices for Sigmoid Self-Attention
Attention is a key part of the transformer architecture. It is a sequence-to-sequence mapping that transforms each sequence element into a weighted sum of values. The weights are typically obtained as the softmax of dot products between keys and queries.
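The weighting scheme described in the abstract can be sketched in a few lines of NumPy. This is an illustrative toy comparing softmax weights with an elementwise sigmoid alternative, not the paper's implementation; the scaling and bias details of the actual sigmoid attention are omitted:

```python
import numpy as np

def attention_weights(q, k, mode="softmax"):
    # Scaled dot products between queries and keys (toy, single-head).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    if mode == "softmax":
        # Standard attention: each row of weights is normalized to sum to 1.
        e = np.exp(scores - scores.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)
    # Sigmoid variant: elementwise, with no normalization across keys.
    return 1.0 / (1.0 + np.exp(-scores))

rng = np.random.default_rng(0)
q = rng.normal(size=(2, 4))   # 2 query positions, feature dim 4
k = rng.normal(size=(3, 4))   # 3 key positions, feature dim 4
w_soft = attention_weights(q, k, "softmax")
w_sig = attention_weights(q, k, "sigmoid")
```

The weighted sum of values (`w @ v`) then works identically for both modes; only how the weights are normalized differs.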
https://arxiv.org/abs/2409.08475 RT-DETRv3: Real-time End-to-End Object Detection with Hierarchical Dense Positive Supervision
RT-DETR is the first real-time end-to-end transformer-based object detector. Its efficiency comes from the framework design and the Hungarian matching. However, compared to dense supervision detectors like the YOLO series, the Hungarian matching provides sparser supervision.
https://arxiv.org/abs/2407.11699 Relation DETR: Exploring Explicit Position Relation Prior for Object Detection
This paper presents a general scheme for enhancing the convergence and performance of DETR (DEtection TRansformer). We investigate the slow convergence problem in transformers from a new perspective, suggesting that it arises from the self-attention.
https://arxiv.org/abs/2406.03459 LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection
In this paper, we present a light-weight detection transformer, LW-DETR, which outperforms YOLOs for real-time object detection. The architecture is a simple stack of a ViT encoder, a projector, and a shallow DETR decoder.
https://arxiv.org/abs/2405.14458 YOLOv10: Real-Time End-to-End Object Detection
Over the past years, YOLOs have emerged as the predominant paradigm in the field of real-time object detection owing to their effective balance between computational cost and detection performance.
https://arxiv.org/abs/2207.13085 Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment
Detection transformer (DETR) relies on one-to-one assignment, assigning one ground-truth object to one prediction, for end-to-end detection without NMS post-processing. One-to-many assignment, in contrast, assigns one ground-truth object to multiple predictions.
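The one-to-one assignment mentioned in the abstract can be illustrated with a brute-force matcher (a stand-in for the Hungarian algorithm, workable only on tiny inputs). The cost values below are made up; real detectors combine classification and box-regression costs:

```python
from itertools import permutations
import numpy as np

def one_to_one_assign(cost):
    # Pick the prediction permutation that minimizes total matching cost,
    # so every ground-truth object gets exactly one prediction.
    n_gt, n_pred = cost.shape
    best = min(permutations(range(n_pred), n_gt),
               key=lambda p: sum(cost[i, j] for i, j in enumerate(p)))
    return list(best)

# Rows: ground-truth objects; columns: predictions (costs are made up).
cost = np.array([[0.9, 0.1, 0.8],
                 [0.2, 0.7, 0.6]])
match = one_to_one_assign(cost)  # prediction index chosen for each GT
```

Group-wise one-to-many assignment, as in this paper, runs the same one-to-one matching independently inside each query group, so one ground-truth object collects several positive predictions across groups.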
https://arxiv.org/abs/2203.03605 DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection
We present DINO (DETR with Improved deNoising anchOr boxes), a state-of-the-art end-to-end object detector. DINO improves over previous DETR-like models in performance and efficiency.
https://arxiv.org/abs/2203.01305 DN-DETR: Accelerate DETR Training by Introducing Query DeNoising
We present in this paper a novel denoising training method to speed up DETR (DEtection TRansformer) training and offer a deepened understanding of the slow convergence issue of DETR-like methods. We show that the slow convergence results from the instability of bipartite graph matching.
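The query-denoising idea can be sketched as follows: jitter ground-truth boxes and feed them to the decoder as extra queries whose reconstruction target is the original box. This is a simplified illustration; the uniform noise scale and the (cx, cy, w, h) handling here are assumptions, not the paper's exact recipe:

```python
import numpy as np

def make_denoising_queries(gt_boxes, scale=0.05, rng=None):
    # gt_boxes: (N, 4) normalized (cx, cy, w, h) ground-truth boxes.
    # Returns jittered copies to serve as extra "denoising" decoder queries;
    # during training, their regression target is the original un-noised box.
    rng = rng or np.random.default_rng(0)
    noise = rng.uniform(-scale, scale, size=gt_boxes.shape)
    return np.clip(gt_boxes + noise, 0.0, 1.0)

gt = np.array([[0.50, 0.50, 0.20, 0.30],
               [0.25, 0.75, 0.10, 0.10]])
noised = make_denoising_queries(gt)
```

Because the noised queries bypass bipartite matching (each one already knows which ground truth it reconstructs), their training signal is stable from the first epoch, which is the stabilizing effect the abstract describes.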