Introducing LangChain Labs

Today we’re launching LangChain Labs, a new applied research effort focused on continual learning. Our goal is to advance open, applied research for every agent. We’re working with partners across industries to make sure this technology is useful for the broader agent-building community.

Every agent run contains useful signal. The open problem is how to capture that signal, transform it into usable data, and then apply the resulting improvements.

Capturing, transforming, and storing this data is exactly what LangSmith was built for, which we believe gives us and our customers a head start in figuring out continual learning.

These improvements can be applied at different layers of the agent stack, such as optimizing the agent harness, choosing different models, or fine-tuning models.

We’re starting this work with a few early research partners including Harvey, Nvidia, Prime Intellect, Fireworks, and Baseten.

“We’re excited to work with the LangChain Labs team to push applied research on efficient, self-improving agents for the most complex legal work.”

- Niko Grupen, Head of Applied Research, Harvey

The early research directions we’re tackling are:

Improving Agents by Mining Information from Large-Scale Agent Data: Agents are being integrated into software systems at a rapid rate. We expect that agents will soon produce more data in a matter of months than humans have ever produced in aggregate. Extracting useful signals from that data for eval/environment generation, harness engineering, and post-training is still a difficult problem. Traces are the source of that data, and we want to help every team use traces to build better agents.

Efficient Agents at the Pareto Frontier: Agents operate under real organizational constraints around cost, latency, and task performance. For many of the world’s most important tasks, we have yet to discover the most efficient combination of model harnesses, models, and feedback loops that allows agents to self-improve.

Systematic Building of Evaluation and Simulation Environments: To properly evaluate agents, you often need to run them end to end in an environment representative of how they will be used in production. These environments can be difficult and time-consuming to create. We’re researching ways to make it easier to create and run environments for evaluation, simulation, and reinforcement learning.

Prompt Optimization: Prompts are specific to model families, and migrating from one model family to the next can be tedious and time-consuming.

We believe in a multi-model future where teams can easily choose the right model for the task. Prompt optimization across models can make those migrations easier and reduce the amount of manual tuning required.
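To make the Pareto frontier direction above concrete, here is a minimal sketch of how one might pick out the non-dominated agent configurations across cost, latency, and task performance. The `AgentConfig` type, its field names, and all of the numbers are hypothetical and purely illustrative; this is not a description of LangChain Labs’ methodology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentConfig:
    name: str          # hypothetical label for a harness + model combination
    cost_usd: float    # average cost per task (lower is better)
    latency_s: float   # average latency per task (lower is better)
    success: float     # task success rate in [0, 1] (higher is better)


def dominates(a: AgentConfig, b: AgentConfig) -> bool:
    """True if a is at least as good as b on every axis and strictly better on one."""
    no_worse = (
        a.cost_usd <= b.cost_usd
        and a.latency_s <= b.latency_s
        and a.success >= b.success
    )
    strictly_better = (
        a.cost_usd < b.cost_usd
        or a.latency_s < b.latency_s
        or a.success > b.success
    )
    return no_worse and strictly_better


def pareto_frontier(configs: list[AgentConfig]) -> list[AgentConfig]:
    """Keep the configurations that no other configuration dominates."""
    return [
        c for c in configs
        if not any(dominates(other, c) for other in configs)
    ]


# Made-up numbers, for illustration only.
candidates = [
    AgentConfig("large-model", cost_usd=0.40, latency_s=30.0, success=0.90),
    AgentConfig("small-model", cost_usd=0.05, latency_s=8.0, success=0.70),
    AgentConfig("small-model-tuned", cost_usd=0.06, latency_s=9.0, success=0.85),
    AgentConfig("slow-and-weak", cost_usd=0.30, latency_s=40.0, success=0.65),
]

frontier_names = [c.name for c in pareto_frontier(candidates)]
print(frontier_names)
```

In this toy data, "slow-and-weak" drops out because "small-model" is cheaper, faster, and more accurate. The other three each represent a different trade-off, so all of them stay on the frontier.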

Some early work with our partners includes measuring how agents generalize across different vertical domains (like legal services); harness engineering and fine-tuning open models like Nemotron as cost-efficient subagents; and building evals/environments so teams can turn their trace data into usable data to improve agents.
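As a rough illustration of what turning trace data into usable data can mean in its simplest form, the sketch below filters a list of trace records down to deduplicated evaluation examples. The `TraceRecord` shape, its field names, and the `to_eval_examples` helper are assumptions made up for this example; they are not the LangSmith schema or API, and real pipelines would need far richer filtering.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TraceRecord:
    # Hypothetical, simplified trace shape; real trace schemas are richer.
    inputs: str
    outputs: str
    error: Optional[str] = None
    user_score: Optional[float] = None  # e.g. 1.0 = thumbs up, 0.0 = thumbs down


def to_eval_examples(traces: list[TraceRecord], min_score: float = 1.0) -> list[dict]:
    """Keep error-free traces with positive feedback as input/reference pairs."""
    examples = []
    seen_inputs = set()
    for t in traces:
        # Skip failed runs and runs with no feedback signal at all.
        if t.error is not None or t.user_score is None:
            continue
        # Skip low-scoring runs and repeated inputs.
        if t.user_score < min_score or t.inputs in seen_inputs:
            continue
        seen_inputs.add(t.inputs)
        examples.append({"input": t.inputs, "reference_output": t.outputs})
    return examples


# Made-up traces, for illustration only.
traces = [
    TraceRecord("Summarize clause 4", "Clause 4 limits liability.", user_score=1.0),
    TraceRecord("Summarize clause 4", "Clause 4 caps damages.", user_score=1.0),
    TraceRecord("Draft an NDA", "", error="timeout"),
    TraceRecord("List the parties", "Acme and Beta.", user_score=0.0),
    TraceRecord("Find the governing law", "Delaware.", user_score=None),
]

examples = to_eval_examples(traces)
print(examples)
```

Only the first trace survives here: the second repeats an input that was already kept, the third errored, the fourth received negative feedback, and the fifth has no feedback.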

Our open-source ecosystem has always been a core part of how builders learn from each other, and we want LangChain Labs to continue that pattern. We’ll continue publishing research, evals, and open-source integrations that help the broader agent-building community.

We want to partner with teams looking to explore how agents learn, adapt, and improve. Our goal is to advance more open research powering the next generation of self-improving agents.

We’re excited to share what we learn and keep building this with the community.