-
Open Models have crossed a threshold
Open models like GLM-5 and MiniMax M2.7 now match closed frontier models on core agent tasks — file operations, tool use, and instruction following — at a fraction of the cost and …
-
Import AI 455: AI systems are about to start building themselves.
The first step towards recursive self improvement
-
How to Work and Compound with AI
Context as infra, taste as config, verification for autonomy, scale via delegation, closing the loop.
-
Previewing Interrupt 2026: Agents at Enterprise Scale
This year, we're doing it again. Interrupt 2026 is May 13–14 at The Midway in San Francisco, and the lineup, the format, and the scale have all leveled up.
-
Improving Memory Retrieval: How New Computer achieved 50% higher recall with LangSmith
New Computer used LangSmith to improve their memory retrieval system, achieving 50% higher recall by tracking regressions in comparison view and adjusting conversation prompts acco…
-
What is an AI agent?
Introducing a new series of musings on AI agents, called "In the Loop".
-
Aligning LLM-as-a-Judge with Human Preferences
Deep dive into self-improving evaluators in LangSmith, motivated by the rise of LLM-as-a-Judge evaluators plus research on few-shot learning and aligning human preferences.
-
Regression Testing with LangSmith
Evaluate and iterate on LLM applications with confidence using LangSmith's regression testing. Compare experiments, track performance, and identify changes.
-
Announcing LangSmith is now a transactable offering in the Azure Marketplace
LangSmith is now available in Azure Marketplace. Deploy the DevOps platform for LLM apps in your Azure VPC with full data control and MACC credit support.
-
Iterating Towards LLM Reliability with Evaluation Driven Development
Dosu uses evaluation driven development and LangSmith to build reliable LLM products at scale, monitor production performance, and iterate with confidence.
-
Test Run Comparisons
Compare LLM test runs side-by-side with LangSmith's Test Run Comparisons. Manually inspect data, filter results, and gain insights faster.
-
Agentic Engineering: How Swarms of AI Agents Are Redefining Software Engineering
Multi-agent systems that mirror real engineering teams — not just code faster — can cut debug time by 93% and compress cross-team delivery. Here's the architecture built on LangGra…
-
Agent Observability: How to Monitor and Evaluate LLM Agents in Production
Production monitoring for LLM agents requires new observability tools. Learn how to trace, evaluate, and improve AI agents at scale.
-
AI co-clinician: researching the path toward AI-augmented care - Google DeepMind
Researching the path to AI-augmented care and development of an AI co-clinician.
-
#4: There are no AI-native enterprises - by Ksenia Se
Why the enterprise AI problem is not a technology problem. In the series: The Org Age of AI
-
LLM 0.32a0 is a major backwards-compatible refactor
I just released LLM 0.32a0 (https://llm.datasette.io/en/latest/changelog.html#a0-2026-04-28), an alpha release of my LLM (https://llm.datasette.io/) Pyth…
-
Granite 4.1 LLMs: How They’re Built
-
DeepInfra on Hugging Face Inference Providers 🔥
-
Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents
-
Tracking the history of the now-deceased OpenAI Microsoft AGI clause
For many years, Microsoft and OpenAI's relationship has included a weird clause saying that, should AGI be achieved, Microsoft's commercial IP rights to OpenAI's technology woul…
-
Google DeepMind and Korea Partner to Accelerate Scientific Discovery - Google DeepMind
Google DeepMind and Korea partner to accelerate scientific breakthroughs using frontier AI models
-
How to build scalable web apps with OpenAI's Privacy Filter
-
DeepSeek V4 - almost on the frontier, a fraction of the price
Chinese AI lab DeepSeek's last model release was V3.2 (and V3.2 Speciale) last December (https://simonwillison.net/2025/Dec/1/deepseek-v32/). They just dropped the f…
-
DeepSeek-V4: a million-token context that agents can actually use
-
Extract PDF text in your browser with LiteParse for the web
LlamaIndex have a most excellent open source project called LiteParse (https://github.com/run-llama/liteparse), which provides a Node.js CLI tool for extracting text…
-
A pelican for GPT-5.5 via the semi-official Codex backdoor API
GPT-5.5 is out (https://openai.com/index/introducing-gpt-5-5/). It's available in OpenAI Codex and is rolling out to paid ChatGPT subscribers. I've had some preview …
-
How to Use Transformers.js in a Chrome Extension
-
AI 101: How Token Taxonomy Affects Your Bill - by Ksenia Se
From reasoning tokens to vision patches – your guide to the species that now shape AI cost, speed, and capability
-
Decoupled DiLoCo: Resilient, Distributed AI Training at Scale - Google DeepMind
-
Is Claude Code going to cost $100/month? Probably not - it's all very confusing
Anthropic today quietly (as in silently, no announcement anywhere at all) updated their claude.com/pricing page (but not their …