#open-weight-models
3 posts in total · Page 1 of 1
-
Recent Developments in LLM Architectures: KV Sharing, mHC, and Compressed Attention
From Gemma 4 to DeepSeek V4, How New Open-Weight LLMs Are Reducing Long-Context Costs
-
My Workflow for Understanding LLM Architectures
A learning-oriented workflow for understanding new open-weight model releases
-
From DeepSeek V3 to V3.2: Architecture, Sparse Attention, and RL Updates
Understanding How DeepSeek's Flagship Open-Weight Models Evolved