
Agent Learning for LangChain Agents

Add a self-improving learning layer to your LangChain agents. Kayba analyzes LangChain traces, extracts skills, and generates better prompts — no code changes required.

March 11, 2026
Use Case · LangChain · Integration · Self-Improving Agents

Why LangChain Agents Need a Learning Layer

LangChain is the most widely adopted framework for building AI agents. With 100k+ GitHub stars and a massive ecosystem of integrations, it powers everything from customer support bots to multi-step research agents to autonomous coding assistants.

But LangChain gives you the building blocks to create agents. It doesn't give you a mechanism for agents to learn from their mistakes.

Every LangChain agent starts each run from scratch. It has:

  • A system prompt you manually wrote and refined
  • Tools it can call (search, databases, APIs, code execution)
  • A chain or graph defining its reasoning flow (sequential chains, ReAct, LangGraph state machines)
  • Memory for the current conversation (buffer, summary, or vector store)

What it doesn't have is procedural memory — knowledge of how to succeed derived from past executions. When your LangChain agent misuses a tool, picks the wrong chain branch, or produces a poor output, it will make the same mistake next time because nothing in the system captures and applies that lesson.

The Gap LangSmith Doesn't Fill

Most LangChain teams use LangSmith for observability. LangSmith is excellent at what it does:

  • Trace logging — full visibility into chain execution, token usage, latency
  • Evaluation — scoring runs against datasets to measure quality
  • Monitoring — dashboards for production performance, error rates, cost tracking
  • Debugging — step-by-step inspection of what happened inside a run

LangSmith tells you what happened. It shows you where your agent went wrong, which tool call failed, which chain step produced bad output.

But it stops there. LangSmith doesn't:

  • Automatically extract reusable lessons from failure patterns
  • Generate improved prompts based on what it observed
  • Build a cumulative knowledge base of agent-specific skills
  • Close the loop between observing failures and preventing them

This is the gap between observability and learning. LangSmith gives you the data. Kayba turns that data into improvement.

How Kayba Works with LangChain

Kayba operates as an offline learning layer. It never touches your LangChain agent at runtime — no middleware, no callbacks, no monkey-patching. Instead, it analyzes your agent's traces after the fact and produces better prompts for future runs.

The Pipeline

  1. Export traces: Collect conversation logs from your LangChain agent. These can come from LangSmith exports, custom logging callbacks, or any format that captures the agent's inputs, tool calls, chain outputs, and final responses.

  2. Recursive analysis: Kayba's Recursive Reflector analyzes the traces using REPL-based code execution. It doesn't just summarize — it programmatically explores the trace data, identifying patterns across runs that surface-level review misses. For LangChain agents, this means correlating tool usage patterns, chain routing decisions, and output quality across dozens or hundreds of runs.

  3. Skill extraction: Failures and successes are distilled into atomic, reusable skills. For a LangChain agent, these might include:

    • "When the search tool returns no results, try rephrasing the query with synonyms before falling back to a broader search"
    • "For customer data lookups, always use the structured SQL tool rather than the general search tool"
    • "When the user asks about pricing, retrieve the current pricing page first — don't rely on training data which may be outdated"
    • "Multi-hop questions require breaking the query into sub-questions before calling tools — single-shot retrieval consistently fails for these"

  4. Skillbook curation: Skills accumulate in a Skillbook with helpful/harmful vote counters. You review and approve skills before they affect the agent. Nothing changes without your sign-off.

  5. Prompt generation: Approved skills are compiled into an improved system prompt. You paste this into your LangChain agent's prompt template — or use the Kayba dashboard to manage it.
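
The steps above can be sketched end to end. Everything in this snippet is illustrative: the trace schema, the function name, and the heuristic are assumptions standing in for Kayba's actual export format and Recursive Reflector, not its real API.

```python
from collections import Counter

# Hypothetical shape of one exported trace record (schema is illustrative).
trace = {
    "run_id": "r-001",
    "input": "How do I fix error E1042?",
    "tool_calls": [{"tool": "search_kb", "args": {"query": "fix error"}, "results": 0}],
    "output": "I couldn't find anything.",
    "success": False,
}

def extract_candidate_skills(traces):
    """Toy stand-in for skill extraction: flag tools that repeatedly
    return zero results on failed runs and propose a lesson."""
    misses = Counter()
    for t in traces:
        if not t["success"]:
            for call in t["tool_calls"]:
                if call.get("results") == 0:
                    misses[call["tool"]] += 1
    return [
        f"When {tool} returns no results, rephrase the query before retrying"
        for tool, n in misses.items()
        if n >= 1
    ]

skills = extract_candidate_skills([trace])
```

The real analysis is far richer (it executes code against the traces rather than applying one fixed heuristic), but the input/output shape is the same: traces in, candidate skills out.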

No Code Changes Required

This is the key point for LangChain users. You don't need to:

  • Add Kayba as a dependency in your agent
  • Modify your chain or graph structure
  • Add custom callbacks or middleware
  • Change your tool definitions or retriever configuration

Kayba works with the traces your agent already produces. The output is an improved prompt that slots into your existing ChatPromptTemplate or SystemMessage.


Example: Skillbook for a LangChain Support Agent

Consider a LangChain agent handling technical support tickets. It uses tools for searching a knowledge base, querying a ticket database, and generating responses. After analyzing 200 traces, the Skillbook might look like:

| Skill | Section | Helpful | Harmful |
| --- | --- | --- | --- |
| When users reference error codes, search the knowledge base with the exact code before attempting a general search | Tool Usage | 14 | 0 |
| For billing questions, always check the customer's current plan before suggesting solutions | Domain Knowledge | 11 | 1 |
| If the knowledge base search returns more than 5 results, summarize the top 3 rather than listing all of them | Response Quality | 9 | 0 |
| Never suggest "reinstalling" as a first step — check for known issues matching the error first | Domain Knowledge | 8 | 0 |
| When the SQL query tool returns an empty result, the customer ID may be in the legacy system — try the legacy lookup tool | Tool Usage | 7 | 0 |
| Escalation is appropriate when the ticket involves data loss, security concerns, or account access issues | Decision Making | 6 | 0 |

Each skill traces back to specific runs where the pattern was identified. You can audit the evidence, approve or reject the skill, and adjust wording before it enters the agent's prompt.
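
A Skillbook entry is simple data. This sketch shows one plausible in-memory representation and how approved skills could be compiled into a prompt section; the class, field names, and sort order are assumptions for illustration, not Kayba's internal model.

```python
from dataclasses import dataclass

@dataclass
class Skill:
    text: str
    section: str
    helpful: int = 0
    harmful: int = 0
    approved: bool = False  # nothing ships without human sign-off

def compile_prompt_section(skills):
    """Render approved skills only, highest net votes first."""
    approved = sorted(
        (s for s in skills if s.approved),
        key=lambda s: s.helpful - s.harmful,
        reverse=True,
    )
    return "\n".join(f"[{s.section}] {s.text}" for s in approved)

book = [
    Skill("Search the knowledge base with the exact error code first",
          "Tool Usage", helpful=14, approved=True),
    Skill("Check the customer's current plan before suggesting solutions",
          "Domain Knowledge", helpful=11, harmful=1, approved=True),
    Skill("Unreviewed candidate skill", "Response Quality", helpful=9),
]
section = compile_prompt_section(book)
```

Note that the unapproved skill never reaches the prompt, mirroring the sign-off gate described above.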

LangSmith + Kayba: The Complete Stack

LangSmith and Kayba are complementary, not competing. Together they form a closed loop:

LangSmith handles the observation layer:

  • Real-time trace logging and monitoring
  • Cost and latency tracking
  • Evaluation datasets and scoring
  • Debugging individual runs

Kayba handles the learning layer:

  • Analyzing patterns across many runs
  • Extracting reusable procedural knowledge
  • Generating improved prompts automatically
  • Building cumulative agent memory via the Skillbook

The workflow looks like this:

  1. Your LangChain agent runs in production, traced by LangSmith
  2. You export traces periodically (or on a schedule)
  3. Kayba analyzes the traces with the Recursive Reflector
  4. New skills are extracted and added to the Skillbook
  5. You review and approve skills in the dashboard
  6. Kayba generates an updated system prompt
  7. You deploy the improved prompt to your LangChain agent
  8. Performance improves, LangSmith confirms the gains, repeat

This is the observe, learn, improve cycle that turns a static agent into a self-improving one.
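
Expressed as code, one iteration of that cycle is a short driver. Every function here is a hedged stub: `export_traces` stands in for a LangSmith export, `analyze` for Kayba's Recursive Reflector, and `review` for the human approval gate in the dashboard.

```python
def export_traces():
    """Stub for a LangSmith export or custom logging dump."""
    return [{"run_id": "r-001", "success": False}]

def analyze(traces):
    """Stub for trace analysis: turn failures into candidate skills."""
    return [
        f"Investigate pattern behind run {t['run_id']}"
        for t in traces
        if not t["success"]
    ]

def review(candidates):
    """Human approval gate; this toy loop approves everything."""
    return list(candidates)

def generate_prompt(base, skills):
    """Compile approved skills into the next system prompt."""
    return base + "\nLearned skills:\n" + "\n".join(f"- {s}" for s in skills)

# One pass of observe -> learn -> improve; in production this runs
# on a schedule, and the new prompt is deployed to the agent.
prompt = generate_prompt(
    "You are a support agent.",
    review(analyze(export_traces())),
)
```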

Works with Any LangChain Pattern

Kayba is agnostic to agent architecture and works with every LangChain pattern:

  • Sequential chains — analyze how information flows through chain steps and where quality degrades
  • ReAct agents — learn better tool selection patterns and reasoning strategies
  • LangGraph state machines — identify which state transitions lead to failures and extract routing skills
  • Multi-agent systems — analyze inter-agent communication patterns and coordination failures
  • RAG pipelines — learn better retrieval strategies, chunking preferences, and reranking patterns

If your LangChain agent produces traces, Kayba can learn from them.

Getting Started

  1. Install the framework:

     pip install ace-framework

  2. Export traces from your LangChain agent (LangSmith export, custom callbacks, or raw logs)

  3. Run Kayba's analysis pipeline on your traces

  4. Review extracted skills in the Skillbook

  5. Generate and deploy an improved prompt

No changes to your LangChain agent code. No new dependencies in your agent's runtime. Just better prompts based on what actually happened.

  • Documentation — Setup guides and API reference
  • GitHub — Source code and examples
  • Dashboard — Hosted dashboard for visual Skillbook management and prompt generation