LangChain Blog · 8시간 전 · 원문 보기

커스텀 에이전트 하네스를 만드는 방법

How to Build a Custom Agent Harness

오픈 소스

LangChain

에이전트 아키텍처

Deep Agents

커스텀 에이전트 하네스를 구축하는 방법

Sydney Runkle

2026년 6월 3일

분

블로그로 돌아가기

에이전트 생성

핵심 요약

하네스는 모델을 실제 세계에 연결하는 스캐폴딩입니다.
하네스가 주어진 작업에 얼마나 잘 맞는지에 따라 에이전트의 유용성이 결정됩니다.
LangChain의 create_agent는 특정 작업에 맞는 커스텀 하네스를 구축하는 가장 쉬운 방법입니다.

‍

유용한 에이전트를 구축하는 것은 주로 사용자화에 관한 것입니다: 에이전트를 주어진 작업에 적합한 올바른 컨텍스트, 데이터 및 환경에 연결하는 것입니다.

본질적으로 에이전트는 작업을 완료하고 결과를 반환할 때까지 루프에서 도구를 호출하는 모델입니다:

에이전트를 다음과 같이 정의할 수도 있습니다:

agent = model + harness

하네스는 모델을 실제 세계에 연결하는 모델 주위의 스캐폴딩입니다.

이 게시물의 나머지 부분은 다음을 가정합니다:

에이전트는 모델에 제공되는 컨텍스트만큼만 좋습니다
하네스의 역할은 모든 단계에서 모델에 컨텍스트를 제공하는 것입니다

따라서 유용한 에이전트를 구축하려면 주어진 작업에 대한 올바른 컨텍스트를 모델에 전달하는 데 뛰어난 하네스가 필요합니다.

기본 하네스

create_agent는 하네스를 구축하기 위한 LangChain의 기본 요소입니다. 모델, 도구 및 시스템 프롬프트를 전달하면 작동하는 에이전트를 갖게 됩니다:

from langchain.agents import create_agent

agent = create_agent(
    model="anthropic:claude-sonnet-4-6",
    tools=tools,
    system_prompt="you are a helpful assistant..."
)

Deep Agents 및 Claude Agent SDK와 같은 하네스는 의견이 있는 미들웨어(아래에서 설명) 스택: 메모리, 컨텍스트 관리, 샌드박싱 등으로 사전 조립됩니다. 이들은 프로덕션 준비가 된 에이전트에 빠르게 도달하도록 설계되었으며 대부분의 경우에 잘 작동합니다. 그러나 많은 에이전트는 이러한 하네스가 지원하는 것보다 더 세밀한 사용자화가 필요합니다: 커스텀 프롬프팅, 비즈니스 로직, 가드레일 등입니다.

create_agent는 다른 접근 방식을 취합니다: 그것은 의도적으로 최소화되었습니다. 우리의 철학은 Pi, 매우 구성 가능한 코딩 에이전트 하네스와 유사합니다. create_agent는 단지 핵심 에이전트 루프를 구현하며, 미들웨어를 사용자화의 기본 요소로 노출합니다.

미들웨어: 하네스를 사용자화하는 방법

미들웨어는 각 단계에서 에이전트 루프에 연결됩니다: 모델 호출 전후, 도구 호출 전후, 에이전트 시작 및 종료 시점에서입니다. 각 부분은 하나의 관심사를 처리하고 다른 부분과 자유롭게 구성됩니다:

미들웨어를 통해 자주 함께 작동하는 몇 가지 레버를 통해 에이전트에 기능을 추가할 수 있습니다:

결정적 로직. 비즈니스 로직, 정책 시행, 동적 에이전트 제어 — 루프의 특정 지점에서 실행되어야 하는 모든 것입니다. 여기에는 에이전트 자체에 대한 런타임 제어가 포함됩니다: 작업 복잡도에 따라 모델 교체, 프롬프트 조정, 에이전트의 메시지 히스토리 업데이트(예: 압축 중)입니다. 프롬프트에 들어가거나 들어가지 말아야 하는 모든 것을 위한 올바른 위치입니다.

도구. 도구를 에이전트에 직접 등록하는 대신, 미들웨어는 전체 생명주기(설정, 종료, 등록)를 처리할 수 있으며 에이전트에 깔끔한 도구 세트를 제공합니다. 이것은 도구가 종속성을 가지고 있거나 초기화가 필요하거나 실행 종료 시 깔끔하게 정리되어야 할 때 중요합니다. 또한 도구 구성을 이를 관리하는 로직 근처에 유지하여 에이전트 정의 전체에 분산되지 않습니다.

커스텀 상태. 미들웨어가 훅 전체에서 상태를 추적해야 하면 미들웨어는 에이전트의 상태를 커스텀 프로퍼티로 확장할 수 있습니다. 이것은 미들웨어가 실행 전체에서 상태를 추적할 수 있게 합니다(카운터, 플래그 또는 에이전트 실행 전체에서 지속되는 다른 값 유지) 그리고 훅 사이에서 데이터를 공유합니다.

스트림 핸들러. 미들웨어는 에이전트의 출력 스트림을 가로챌 수 있고 변환할 수 있습니다 — 이벤트 필터링, 메타데이터 주입, 다양한 이벤트 유형을 다양한 소비자로 라우팅합니다. 스택의 다양한 부분이 에이전트가 수행하는 다양한 작업에 반응해야 할 때 유용합니다: 토큰 델타를 사용하는 UI, 도구 호출을 캡처하는 감사 로그, 레이턴시를 추적하는 모니터링 시스템입니다.

미들웨어의 아름다움은 다음입니다:

에이전트 루프의 모든 지점에서 사용자화를 활성화합니다
관련 로직을 구성 가능하고 공유 가능한 코드 단위로 묶습니다

LangChain은 가장 일반적인 패턴에 대해 사전 구축된 미들웨어를 제공합니다. 당신의 사용 사례에 특유한 모든 것은 하나의 커스텀 미들웨어에서 떨어져 있습니다. 각 부분이 분리되어 있기 때문에 같은 미들웨어를 조직의 모든 에이전트 전체에서 재사용할 수 있으므로 새로운 에이전트는 이를 다시 구축하지 않고 테스트를 거친 동작을 상속합니다.

하네스 기능

하네스의 역할은 주어진 작업에 대해 모델이 올바른 시간에 올바른 컨텍스트를 얻게 하는 것입니다.

아래 표는 일반적인 기능을 이를 지원하는 미들웨어에 매핑합니다. 대부분의 프로덕션 에이전트는 에이전트의 필요에 따라(장시간 실행되는가? 작업이 얼마나 복잡한가? 에이전트 행동이 얼마나 민감한가? 등) 여러 개를 함께 사용하게 됩니다:

기능	중요한 이유	미들웨어
컨텍스트 오버플로우 방지	장시간 실행되는 세션은 메시지 히스토리가 빠르게 축적됩니다. 개입 없이 컨텍스트 윈도우를 오버플로우합니다.	SummarizationMiddleware, ContextEditingMiddleware
메모리 접근 및 업데이트	시작 시 관련 지식을 로드하고 실행 종료 시 다시 작성합니다. 에이전트가 실제 사용으로부터 시간이 지남에 따라 개선되게 합니다.	FilesystemMiddleware, MemoryMiddleware, SkillsMiddleware
환경에서 조치 취하기	고정된 도구 세트는 에이전트가 할 수 있는 것을 제한합니다. 파일 시스템 및 실행 환경에 대한 접근은 더 창의적인 솔루션을 잠금 해제하고 종종 더 큰 토큰 효율성을 갖춥니다.	ShellToolMiddleware, FilesystemMiddleware, CodeInterpreterMiddleware
작업 위임	하위 에이전트는 깔끔한 컨텍스트 윈도우로 복잡한 하위 작업을 처리합니다. 할일 목록은 장시간 실행되는 동안 진행 상황을 추적합니다.	SubAgentMiddleware, AsyncSubAgentMiddleware, TodoListMiddleware
일시적 실패 처리	모델과 도구는 예측 불가능하게 실패합니다. 프로덕션 에이전트는 백오프 재시도 로직과 모델을 사용할 수 없을 때의 폴백이 필요합니다.	ToolRetryMiddleware, ModelRetryMiddleware, ModelFallbackMiddleware
정책 시행	PII 처리, 규정 준수 확인, 승인 게이트 — 이들은 모델이 무엇을 하든지 관계없이 모든 호출에서 실행되어야 합니다. 그들은 프롬프트에 속하지 않습니다.	PIIMiddleware, HumanInTheLoopMiddleware
에이전트 조종	완전한 자율성이 항상 적절하지는 않습니다. 중요한 조치 전에 일시 중지하고 인간이 승인, 거부 또는 리다이렉트하기를 기다립니다.	HumanInTheLoopMiddleware
비용 제어	프롬프트 캐싱은 장시간 실행되는 작업에 대한 토큰 지출을 줄입니다. 호출 제한은 비용이 미확인 상태로 축적되는 것을 방지합니다.	ModelCallLimitMiddleware, ToolCallLimitMiddleware, PromptCachingMiddleware

사전 구축된 미들웨어의 전체 목록은 여기를 참조하세요.

작업-하네스 적합성

작업-하네스 적합성은 하네스가 작업의 실제 요구 사항에 얼마나 잘 일치하는지입니다: 필요한 컨텍스트, 직면할 실패, 시행해야 할 정책, 작동하는 환경입니다. 고객 서비스 에이전트용 하네스는 장시간 실행되는 코딩 에이전트용으로 구축된 하네스와 매우 다릅니다.

LangChain에서 구축하는 모든 에이전트(우리의 GTM 에이전트, 비동기 코딩 에이전트 및 우리의 코드 없는 에이전트 빌더 포함)는 해당 에이전트의 미션에 맞춘 미들웨어 스택으로 create_agent를 기반으로 구축됩니다.

최고의 에이전트는 단지 능력 있는 모델로 구축되지 않으며, 작업에 밀접하게 맞는 하네스로 구축됩니다. create_agent를 사용한 커스텀 하네스를 구축하는 가장 쉬운 방법입니다.

참고 자료

시작하기

감사의 말씀

@hwchase17, @huntlovell, @masondrxy 및 @Vtrivedy10의 사려 깊은 검토와 피드백에 감사합니다.

에이전트가 정말로 무엇을 하고 있는지 확인하세요

저희 에이전트 엔지니어링 플랫폼인 LangSmith는 개발자가 모든 에이전트 의사결정을 디버그하고, 변경 사항을 평가하며, 한 번의 클릭으로 배포할 수 있도록 도와줍니다.

LangSmith 사용해 보기

데모 받기

Open Source

LangChain

Agent Architecture

Deep Agents

How to Build a Custom Agent Harness

Sydney Runkle

June 3, 2026

min

Go back to blog

Create agents

Key Takeaways

A harness is the scaffolding around the model that connects it to the real world.
How well a harness fits the task at hand determines how useful an agent is.
LangChain's create_agent is the easiest way to build a custom harness tailored to a given task.

‍

Building useful agents is largely about customization: connecting your agent to the right context, data, and environment(s) for the task at hand.

At its core, an agent is a model calling tools in a loop until it completes a task and returns a result:

You can also define an agent as:

agent = model + harness

The harness is the scaffolding around the model that connects it to the real world.

The remainder of this post assumes the following:

An agent is only as good as the context provided to the model
The job of a harness is to provide context to the model at every step

So, to build a useful agent, you need a harness that’s great at delivering the right context for the given task to the model.

The base harness

create_agent is LangChain's primitive for building a harness. Pass in a model, tools, and a system prompt, and you have a working agent:

from langchain.agents import create_agent

agent = create_agent(
    model="anthropic:claude-sonnet-4-6",
    tools=tools,
    system_prompt="you are a helpful assistant..."
)

Harnesses like Deep Agents and the Claude Agent SDK come pre-assembled with an opinionated middleware (explained below) stack: memory, context management, sandboxing, and more. They're designed to get you to a production-ready agent fast, and they work well for most cases. But many agents need finer grained customization than these harnesses support: custom prompting, business logic, guardrails, etc.

create_agent takes a different approach: it’s purposefully minimalistic. Our philosophy is similar to that of Pi, a highly configurable coding agent harness. create_agent just implements the core agent loop, and it exposes middleware as a primitive for customization.

Middleware: how you customize the harness

Middleware hooks into the agent loop at each step: before and after model calls, before and after tool calls, at agent startup and teardown. Each piece handles one concern and composes freely with any other:

Middleware allows you to add capabilities to your agent via a few levers that often work together:

Deterministic Logic. Business logic, policy enforcement, dynamic agent control — anything that needs to fire at a specific point in the loop. This includes runtime control over the agent itself: swapping the model based on task complexity, adjusting the prompt, and updating the agent’s message history (during compaction, for example). The right place for anything that can't (or shouldn't) live in a prompt.

Tools. Rather than registering tools directly on the agent, middleware can handle the full lifecycle — setup, teardown, registration — and hand the agent a clean set of tools to work with. This matters when tools have dependencies, require initialization, or need to be torn down cleanly at the end of a run. It also keeps tool configuration close to the logic that governs it, rather than scattered across the agent definition.

Custom state. If your middleware needs to track state across hooks, middleware can extend the agent’s state with custom properties. This enables middleware to track state throughout execution (maintain counters, flags, or other values that persist throughout agent runs) and share data between hooks.

Stream handlers. Middleware can intercept and transform the agent's output stream — filtering events, injecting metadata, routing different event types to different consumers. Useful when different parts of your stack need to react to different things the agent does: a UI consuming token deltas, an audit log capturing tool calls, a monitoring system tracking latency.

The beauty of middleware is that it:

Enables customization at any point in the agent loop
Bundles related logic in composable, sharable units of code

LangChain ships prebuilt middleware for the most common patterns. Anything bespoke to your use case is one custom middleware away. Because each piece is isolated, the same middleware can be reused across every agent in an organization so that new agents inherit battle-tested behavior without rebuilding it.

Harness capabilities

The job of a harness is to get the model the right context at the right time for the given task.

The table below maps common capabilities to middleware that support them. Most production agents end up using several together, depending on the agent’s needs (is it long running? how complex are the tasks? how sensitive are the agent’s actions?, etc):

Capability	Why it Matters	Middleware
Prevent context overflow	Long-running sessions accumulate message history fast. Without intervention, it overflows the context window.	SummarizationMiddleware, ContextEditingMiddleware
Access and update memory	Load relevant knowledge at startup, write it back at the end of a run. Lets the agent improve over time from real usage.	FilesystemMiddleware, MemoryMiddleware, SkillsMiddleware
Take actions in an environment	A fixed toolset limits what an agent can do. Access to a filesystem and execution environment unlocks more creative solutions, often with greater token efficiency.	ShellToolMiddleware, FilesystemMiddleware, CodeInterpreterMiddleware
Delegate tasks	Subagents handle complex sub-tasks with clean context windows. A todo list tracks progress across a long run.	SubAgentMiddleware, AsyncSubAgentMiddleware, TodoListMiddleware
Handle transient failures	Models and tools fail unpredictably. Production agents need retry logic with backoff and fallbacks when a model is unavailable.	ToolRetryMiddleware, ModelRetryMiddleware, ModelFallbackMiddleware
Enforce policies	PII handling, compliance checks, approval gates — these need to fire on every call regardless of what the model does. They don't belong in a prompt.	PIIMiddleware, HumanInTheLoopMiddleware
Steer the agent	Full autonomy isn't always appropriate. Pause before consequential actions and wait for a human to approve, reject, or redirect.	HumanInTheLoopMiddleware
Control costs	Prompt caching reduces token spend on long-running tasks. Call limits prevent costs from accumulating unchecked.	ModelCallLimitMiddleware, ToolCallLimitMiddleware, PromptCachingMiddleware

See the full list of prebuilt middleware here.

Task-harness fit

Task-harness fit is how well your harness matches the actual demands of the task: the context it needs, the failures it'll encounter, the policies it must enforce, the environment it operates in. A harness for a customer service agent looks very different from one built for a long-running coding agent.

Every agent we build at LangChain, including our GTM agent, asynchronous coding agent, and our no-code agent builder, is built on create_agent with a middleware stack tailored to that agent’s mission.

The best agents aren't just built with capable models, they're built with harnesses that tightly fit the task. The easiest way to build a custom harness is with create_agent.

References

Get Started

Acknowledgements

Thanks to @hwchase17, @huntlovell, @masondrxy, and @Vtrivedy10 for their thoughtful review and feedback.

See what your agent is really doing

LangSmith, our agent engineering platform, helps developers debug every agent decision, eval changes, and deploy in one click.

Try LangSmith

Get a demo

#agent-harness #langchain #ai-agents #agent-middleware #agent-loop #custom-agents

커스텀 에이전트 하네스를 만드는 방법

커스텀 에이전트 하네스를 구축하는 방법

핵심 요약

기본 하네스

미들웨어: 하네스를 사용자화하는 방법

하네스 기능

작업-하네스 적합성

참고 자료

시작하기

감사의 말씀

루브릭 도입: 자신의 작업을 평가하고 수정하는 에이전트 구축

스킬과 인터프리터로 에이전트를 위한 워크플로우 구축

토큰 스트림에서 에이전트 스트림으로

에이전트가 정말로 무엇을 하고 있는지 확인하세요

How to Build a Custom Agent Harness

Key Takeaways

The base harness

Middleware: how you customize the harness

Harness capabilities

Task-harness fit

References

Get Started

Acknowledgements

Introducing Rubrics: Build Agents that Evaluate and Correct Their Work

Building workflows for agents with Skills and Interpreters

From Token Streams to Agent Streams

See what your agent is really doing

커스텀 에이전트 하네스를 만드는 방법

커스텀 에이전트 하네스를 구축하는 방법

핵심 요약

기본 하네스

미들웨어: 하네스를 사용자화하는 방법

하네스 기능

작업-하네스 적합성

참고 자료

시작하기

감사의 말씀

관련 콘텐츠

루브릭 도입: 자신의 작업을 평가하고 수정하는 에이전트 구축

스킬과 인터프리터로 에이전트를 위한 워크플로우 구축

토큰 스트림에서 에이전트 스트림으로

에이전트가 정말로 무엇을 하고 있는지 확인하세요

How to Build a Custom Agent Harness

Key Takeaways

The base harness

Middleware: how you customize the harness

Harness capabilities

Task-harness fit

References

Get Started

Acknowledgements

Related content

Introducing Rubrics: Build Agents that Evaluate and Correct Their Work

Building workflows for agents with Skills and Interpreters

From Token Streams to Agent Streams

See what your agent is really doing