A Computer Science Student at Konkuk University
[CoIn] Paper Review | GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints (Ainslie et al., 2023)