-
[AINews] NVIDIA Cosmos 3, Nemotron 3 Ultra, and RTX Spark
Jensen scores a huge win.
-
Beyond LLMs: Why Scalable Enterprise AI Adoption Depends on Agent Logic
-
MiniMax M3: Open-weight model with a million-token context challenges proprietary leaders
-
Watch 9 Google videos of Gemini Omni and Gemini 3.5 Flash
-
[AINews] Anthropic raises $965B Series H, releases Opus 4.8 and Dynamic Workflows/ultracode
Total Anthropic victory!
-
Shipping a Trillion Parameters With a Hub Bucket: Delta Weight Sync in TRL
-
Gemini 3.5 Flash: more expensive, but Google plan to use it for everything
Today at Google I/O, Google released Gemini 3.5 Flash (https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/). This one skipped the …
-
The last six months in LLMs in five minutes
I put together these annotated slides from my five minute lightning talk at PyCon US 2026, using the latest ite… (https://tools.simonwillison.net/annotated-presentations)
-
Introducing Gemini Omni
-
Databricks brings GPT-5.5 to enterprise agent workflows | OpenAI
Databricks uses GPT-5.5 for enterprise agent workflows after the model set a new state of the art on the OfficeQA Pro benchmark.
-
vLLM V0 to V1: Correctness Before Corrections in RL
-
Granite 4.1 LLMs: How They’re Built
-
DeepSeek-V4: a million-token context that agents can actually use
-
Gemma 4: Byte for byte, the most capable open models
Gemma 4: Our most intelligent open models to date, purpose-built for advanced reasoning and agentic workflows.
-
Welcome Gemma 4: Frontier multimodal intelligence on device
-
A Visual Guide to Attention Variants in Modern LLMs
From MHA and GQA to MLA, sparse attention, and hybrid architectures
-
ImportAI 449: LLMs training other LLMs; 72B distributed training run; computer vision is harder than generative text
Will AI cause a political interregnum
-
Gemini 3.1 Pro: A smarter model for your most complex tasks
3.1 Pro is designed for tasks where a simple answer isn’t enough.
-
The State Of LLMs 2025: Progress, Problems, and Predictions
A 2025 review of large language models, from DeepSeek R1 and RLVR to inference-time scaling, benchmarks, architectures, and predictions for 2026.
-
LLM Research Papers: The 2025 List (July to December)
In June, I shared a bonus article with my curated and bookmarked research paper lists to the paid subscribers who make this Substack possible.
-
Beyond Standard LLMs - by Sebastian Raschka, PhD
Linear Attention Hybrids, Text Diffusion, Code World Models, and Small Recursive Transformers
-
Understanding and Coding the KV Cache in LLMs from Scratch
KV caches are one of the most critical techniques for efficient inference in LLMs in production.
-
Coding LLMs from the Ground Up: A Complete Course
Why build LLMs from scratch? It's probably the best and most efficient way to learn how LLMs really work. Plus, many readers have told me they had a lot of fun doing it.
-
First Look at Reasoning From Scratch: Chapter 1
Welcome to the next stage of large language models (LLMs): reasoning. LLMs have transformed how we process and generate text, but their success has been largely driven by statistic…
-
NVIDIA GTC 2025 - Building LLM-Powered Applications
Chip Huyen and I share what we've learned, best practices, and insights at NVIDIA GTC 2025.
-
Netflix PRS 2024 - Applying LLMs to Recommendation Experiences
Challenges and lessons from deploying LLM experiences: evals, scalability, guardrails.
-
Building Effective AI Agents \ Anthropic
-
Raising the bar on SWE-bench Verified with Claude 3.5 Sonnet (Jan 06, 2025)