A CS Student at Konkuk University
[CoIn] Paper Review | Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free (Qiu et al., 2025)