-
vLLM V0 to V1: Correctness Before Corrections in RL
-
5 gardening tips you can try right in Search
-
Live blog: Code w/ Claude 2026
I'm at Anthropic's Code w/ Claude event today. Here's my live blog of the morning keynote sessions.
-
Vibe coding and agentic engineering are getting closer than I'd like
I recently talked with Joseph Ruscio about AI coding tools for Heavybit's High Leverage podcast: https://www.heavybit.com/library/podcasts/high-leverage/ep-9-the-ai-cod…
-
AlphaEvolve: Gemini-powered coding agent scaling impact across fields – Google DeepMind
Explore how AlphaEvolve's Gemini-powered algorithms are driving impact across business, infrastructure, and science.
-
Agent Observability: The Need for Feedback to Power Learning
https://www.langchain.com/blog/agent-observability-needs-feedback-to-power-learning
-
Adding Benchmaxxer Repellant to the Open ASR Leaderboard
-
Google is partnering with XPRIZE and Range Media Partners on the $3.5 million Future Vision film competition.
-
Open SWE: An Open-Source Framework for Internal Coding Agents
Built on Deep Agents and LangGraph, Open SWE provides the core architectural components for internal coding agents.
-
Open Models have crossed a threshold
Open models like GLM-5 and MiniMax M2.7 now match closed frontier models on core agent tasks — file operations, tool use, and instruction following — at a fraction of the cost and …
-
Import AI 455: AI systems are about to start building themselves.
The first step towards recursive self improvement
-
How to Work and Compound with AI
Context as infra, taste as config, verification for autonomy, scale via delegation, closing the loop.
-
Previewing Interrupt 2026: Agents at Enterprise Scale
This year, we're doing it again. Interrupt 2026 is May 13–14 at The Midway in San Francisco, and the lineup, the format, and the scale have all leveled up.
-
Improving Memory Retrieval: How New Computer achieved 50% higher recall with LangSmith
New Computer used LangSmith to improve their memory retrieval system, achieving 50% higher recall by tracking regressions in comparison view and adjusting conversation prompts acco…
-
What is an AI agent?
Introducing a new series of musings on AI agents, called "In the Loop".
-
Aligning LLM-as-a-Judge with Human Preferences
Deep dive into self-improving evaluators in LangSmith, motivated by the rise of LLM-as-a-Judge evaluators plus research on few-shot learning and aligning human preferences.
-
Regression Testing with LangSmith
Evaluate and iterate on LLM applications with confidence using LangSmith's regression testing. Compare experiments, track performance, and identify changes.
-
Announcing LangSmith is now a transactable offering in the Azure Marketplace
LangSmith is now available in Azure Marketplace. Deploy the DevOps platform for LLM apps in your Azure VPC with full data control and MACC credit support.
-
Iterating Towards LLM Reliability with Evaluation Driven Development
Dosu uses evaluation driven development and LangSmith to build reliable LLM products at scale, monitor production performance, and iterate with confidence.
-
Test Run Comparisons
Compare LLM test runs side-by-side with LangSmith's Test Run Comparisons. Manually inspect data, filter results, and gain insights faster.
-
Agentic Engineering: How Swarms of AI Agents Are Redefining Software Engineering
Multi-agent systems that mirror real engineering teams — not just code faster — can cut debug time by 93% and compress cross-team delivery. Here's the architecture built on LangGra…
-
Agent Observability: How to Monitor and Evaluate LLM Agents in Production
Production monitoring for LLM agents requires new observability tools. Learn how to trace, evaluate, and improve AI agents at scale.
-
AI co-clinician: researching the path toward AI-augmented care – Google DeepMind
Researching the path to AI-augmented care and development of an AI co-clinician.
-
LLM 0.32a0 is a major backwards-compatible refactor
I just released LLM 0.32a0 (https://llm.datasette.io/en/latest/changelog.html#a0-2026-04-28), an alpha release of my LLM (https://llm.datasette.io/) Pyth…
-
Granite 4.1 LLMs: How They’re Built
-
DeepInfra on Hugging Face Inference Providers 🔥
-
Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents
-
Tracking the history of the now-deceased OpenAI Microsoft AGI clause
For many years, Microsoft and OpenAI's relationship has included a weird clause saying that, should AGI be achieved, Microsoft's commercial IP rights to OpenAI's technology woul…
-
Google DeepMind and Korea Partner to Accelerate Scientific Discovery – Google DeepMind
Google DeepMind and Korea partner to accelerate scientific breakthroughs using frontier AI models
-
How to build scalable web apps with OpenAI's Privacy Filter