A CS Student at Konkuk University
[CoIn] Paper Review | Mixture-of-Depths: Dynamically allocating compute in transformer-based language models (Raposo et al., 2024)