#transformer-architecture
-
A Visual Guide to Attention Variants in Modern LLMs
From MHA and GQA to MLA, sparse attention, and hybrid architectures
-
Beyond Standard LLMs - by Sebastian Raschka, PhD
Linear Attention Hybrids, Text Diffusion, Code World Models, and Small Recursive Transformers
-
From GPT-2 to gpt-oss: Analyzing the Architectural Advances
And How They Stack Up Against Qwen3