-
A Visual Guide to Attention Variants in Modern LLMs
From MHA and GQA to MLA, sparse attention, and hybrid architectures
-
From DeepSeek V3 to V3.2: Architecture, Sparse Attention, and RL Updates
Understanding How DeepSeek's Flagship Open-Weight Models Evolved