AI News · #large-language-models

Sebastian Raschka · 2026-03-22

A Visual Guide to Attention Variants in Modern LLMs

From MHA and GQA to MLA, sparse attention, and hybrid architectures

#large-language-models #machine-learning #transformer-architecture #sparse-attention #attention-mechanisms #group-query-attention

Sebastian Raschka · 2025-12-30

The State Of LLMs 2025: Progress, Problems, and Predictions

A 2025 review of large language models, from DeepSeek R1 and RLVR to inference-time scaling, benchmarks, architectures, and predictions for 2026.

#deepseek #large-language-models #ai-predictions #ai-benchmarks #inference-scaling #neural-architectures

Sebastian Raschka · 2025-12-30

LLM Research Papers: The 2025 List (July to December)

In June, I shared a bonus article containing my curated and bookmarked research paper lists with the paid subscribers who make this Substack possible.

#generative-ai #large-language-models #machine-learning #nlp #ai-research #research-papers

Sebastian Raschka · 2025-11-04

Beyond Standard LLMs - by Sebastian Raschka, PhD

Linear Attention Hybrids, Text Diffusion, Code World Models, and Small Recursive Transformers

#large-language-models #transformer-architecture #linear-attention #text-diffusion #code-models #recursive-transformers

Sebastian Raschka · 2025-06-17

Understanding and Coding the KV Cache in LLMs from Scratch

KV caches are one of the most critical techniques for efficient inference in LLMs in production.

#large-language-models #kv-cache #llm-inference #efficient-inference #transformer-optimization #neural-network-caching

Sebastian Raschka · 2025-05-10

Coding LLMs from the Ground Up: A Complete Course

Why build LLMs from scratch? It's probably the best and most efficient way to learn how LLMs really work. Plus, many readers have told me they had a lot of fun doing it.

#generative-ai #large-language-models #machine-learning #deep-learning #llm-development #ai-fundamentals

Sebastian Raschka · 2025-03-29

First Look at Reasoning From Scratch: Chapter 1

Welcome to the next stage of large language models (LLMs): reasoning. LLMs have transformed how we process and generate text, but their success has been largely driven by statistic…

#large-language-models #natural-language-processing #reasoning #pattern-recognition #ai-advances #multi-step-reasoning