LangSmith gives you full visibility into your agentic systems. It traces every node, logs every LLM and tool call, captures inputs and outputs, and shows exactly how your state evolved — all in a clean, visual UI.
It works seamlessly with LangGraph and LangChain, requires minimal setup, and becomes essential as your systems grow more complex.
Let’s say you built a chatbot to help customers track their orders. One day, a user asks:
“Where’s my package?”
And your bot replies:
“Dolphins are mammals that live in the ocean.” 🐬
Not great.
Now imagine you're trying to debug this. What went wrong?
If you’re flying blind — no logs, no state trace, no prompt capture — you’re left guessing.
That’s not engineering. That’s gambling.
LangSmith changes this.
It logs every decision your system makes — from inputs to LLM prompts to tool calls to final output. It shows you the trace. It shows you the cost. It shows you what actually happened.
And starting now, it’s going to be part of every system you build.
LangSmith is a platform that brings observability to language model–powered systems. It tracks everything your application does — from the moment a user sends input, to the final output, and every step in between.
At its core, LangSmith is a tracing and evaluation layer for LLM workflows. Whether you're using a simple chain or a complex multi-agent graph, it records every step: inputs and outputs, LLM calls, tool invocations, and state transitions.
You can think of LangSmith as a debugging lens for your AI application. It doesn’t just tell you what your system did — it shows you how it got there, why it chose that path, and whether it met your expectations.
LangSmith integrates seamlessly with LangChain and LangGraph, but it also works with direct OpenAI calls and other LLM providers. Whether you’re building workflows, agents, or end-user apps, LangSmith gives you the transparency and control you need to iterate with confidence.
LangSmith doesn't just collect logs; it gives you full visibility into how your system behaves, from input to output and everything in between, plus the tools to understand that behavior and improve it with confidence.
- **Full-system traceability.** Every action your system takes — LLM calls, tool invocations, node transitions — is logged in sequence, so you can see how things actually ran.
- **Live state visibility.** For each step, LangSmith shows you the exact input, output, and state updates — letting you debug without guesswork.
- **Visual timeline.** Traces are displayed as interactive, navigable timelines. You can inspect branches, loops, retries, and conditional paths at a glance.
- **Performance and cost insights.** LangSmith tracks token usage, latency, and per-call costs. No more wondering where the slowdowns or expenses are coming from.
- **Automated evaluations.** With built-in LLM-powered grading, you can define test sets and measure output quality over time — no manual spot-checking required.
- **Searchable, organized runs.** Tag, filter, and explore past runs by user, version, or use case — perfect for regression testing, debugging, or collaboration (see the sketch below).
LangSmith turns observability into an asset, not just an afterthought.
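That last point is easy to try. Here's a minimal sketch of tagging and annotating a run with the langsmith SDK's `traceable` decorator so it can be filtered later in the UI; the function body, tag names, and metadata values are illustrative placeholders:

```python
from langsmith import traceable

# Tag and annotate this run so it can be filtered later in the LangSmith UI.
# The tags and metadata values here are placeholders, not required names.
@traceable(tags=["order-tracking", "v2"], metadata={"experiment": "baseline"})
def answer_question(question: str) -> str:
    # A real implementation would call your model or graph here.
    return f"You asked: {question}"

answer_question("Where's my package?")
```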
In this walkthrough, we put LangSmith to work on the same agent you built in the previous lessons and show how it helps you debug, monitor, and evaluate it.
If you're curious about debugging, monitoring, or evaluating your agentic systems at scale, this video will show you why LangSmith is the default choice.
LangSmith is designed to be frictionless. If you're using LangChain or LangGraph, you don’t need to change a single line of code — just set a few environment variables. If you're working outside those frameworks, setup is still just a few lines.
```bash
pip install -U langchain langchain-openai
```
```bash
export LANGSMITH_TRACING=true
export LANGSMITH_ENDPOINT="https://api.smith.langchain.com"
export LANGSMITH_API_KEY="<your-api-key>"
export LANGSMITH_PROJECT="your-project-name"
export OPENAI_API_KEY="<your-openai-api-key>"
```
Any LLM, ChatModel, Chain, or LangGraph execution will automatically show up in LangSmith:
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI()
llm.invoke("Hello, world!")
```
🎉 That’s it. You’re now logging full traces into LangSmith.
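The same goes for LangGraph. As a minimal sketch (assuming langgraph is installed alongside the packages above), even a one-node graph shows up as a full trace, with the node execution recorded as its own run:

```python
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    text: str

def shout(state: State) -> State:
    # A trivial node, just to demonstrate tracing
    return {"text": state["text"].upper()}

graph = StateGraph(State)
graph.add_node("shout", shout)
graph.add_edge(START, "shout")
graph.add_edge("shout", END)
app = graph.compile()

# With LANGSMITH_TRACING=true, this invocation appears in LangSmith
# as a trace containing the graph run and the "shout" node run.
app.invoke({"text": "hello"})
```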
If you're working directly with OpenAI (or other LLM providers), you can still get full tracing with a couple of imports and a single decorator.
```bash
pip install -U langsmith
```
```bash
export LANGSMITH_TRACING=true
export LANGSMITH_ENDPOINT="https://api.smith.langchain.com"
export LANGSMITH_API_KEY="<your-api-key>"
export LANGSMITH_PROJECT="your-project-name"
export OPENAI_API_KEY="<your-openai-api-key>"
```
```python
import openai
from langsmith.wrappers import wrap_openai
from langsmith import traceable

# Auto-trace OpenAI calls
client = wrap_openai(openai.Client())

@traceable  # Auto-trace this function
def pipeline(user_input: str):
    result = client.chat.completions.create(
        messages=[{"role": "user", "content": user_input}],
        model="gpt-3.5-turbo",
    )
    return result.choices[0].message.content

pipeline("Hello, world!")
```
From this point on, LangSmith will be enabled by default in the program — so every graph, agent, or LLM call you build will be fully traceable, inspectable, and ready for debugging.
💡 Note: We’re using OpenAI here for illustration, but LangSmith supports tracing with any LLM provider. If your client supports standard HTTP-based API calls, you can wrap and trace it the same way.
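As a rough sketch of what that looks like, you can wrap the raw HTTP call in a `@traceable` function. Everything here besides the decorator is hypothetical: the endpoint URL, payload, and response field stand in for whatever your provider actually expects.

```python
import requests
from langsmith import traceable

# Trace a non-OpenAI provider by wrapping its raw HTTP call.
# The URL, payload, and response field below are hypothetical placeholders.
@traceable(run_type="llm", name="my-provider-call")
def call_my_llm(prompt: str) -> str:
    response = requests.post(
        "https://api.example-provider.com/v1/generate",  # hypothetical endpoint
        json={"prompt": prompt},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["text"]  # hypothetical response field

call_my_llm("Hello, world!")
```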
Once tracing is enabled, LangSmith captures everything your system does — not just the final output, but the full path it took to get there. To make this easy to navigate, LangSmith breaks the activity into three key layers of observability:
At the top level, a trace represents a full interaction, typically one user input or system operation. This might be a single LLM call, or a multi-step graph execution. In the LangSmith UI, you can open any trace and inspect it end to end.
Inside a trace, each run represents one step in the process, like calling a tool, executing a node in LangGraph, invoking an LLM, or running a chain. A run is where decision points and state transitions happen, and the runs dashboard lets you browse and filter them across your project.
Within each run, LangSmith also logs calls: the actual LLM or API requests that were made. You'll see exactly what was sent, what came back, how long it took, and how many tokens it used.
This layered view gives you both high-level insight and step-by-step detail.
LangSmith automatically stitches this all together — so you can reconstruct what happened, understand why, and improve what comes next.
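You can also pull these layers out programmatically. Here's a hedged sketch using the langsmith SDK's `Client`; the project name matches the `LANGSMITH_PROJECT` set earlier, and the exact fields available on a run can vary slightly between SDK versions:

```python
from langsmith import Client

client = Client()  # picks up LANGSMITH_API_KEY from the environment

# List recent LLM runs from the configured project and print
# a few basic fields from each one.
for run in client.list_runs(
    project_name="your-project-name",
    run_type="llm",
    limit=5,
):
    print(run.name, run.run_type, run.start_time, run.status)
```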
Once you can trace what your system is doing, the next question is:
Is it doing it well?
That’s where LangSmith’s Evaluation Mode comes in.
With Evaluation Mode, you can define test cases — a prompt and an expected behavior — and let LangSmith automatically score your system’s responses using an LLM. This makes it easy to assess quality at scale, catch regressions, and compare versions.
You can evaluate for correctness, relevance, helpfulness, tone, or whatever criteria matter for your use case.
You can even use your own evaluation prompts and grading criteria. Behind the scenes, LangSmith uses the same models you already work with to judge your outputs — so evaluation stays aligned with the capabilities of your system.
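Here's a minimal sketch of what an evaluation can look like with the langsmith SDK. The pipeline, dataset name, and evaluator are all placeholders; it assumes a dataset called "order-bot-tests" already exists in LangSmith, with "question" inputs and "answer" reference outputs:

```python
from langsmith.evaluation import evaluate

# Hypothetical system under test: swap in your own chain, graph, or agent.
def my_pipeline(question: str) -> str:
    return "Your package is out for delivery."

# A simple custom evaluator comparing the output to the reference answer.
def exact_match(run, example):
    predicted = run.outputs["output"]
    expected = example.outputs["answer"]
    return {"key": "exact_match", "score": int(predicted == expected)}

# Runs my_pipeline over every example in the dataset and scores each output.
results = evaluate(
    lambda inputs: {"output": my_pipeline(inputs["question"])},
    data="order-bot-tests",       # hypothetical dataset name
    evaluators=[exact_match],
    experiment_prefix="baseline",
)
```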
You’re not just building a system that works — you’re building one you can trust.
Evaluation Mode gives you a way to measure that, every step of the way.
Do you have to use LangSmith for all of this? Not necessarily.
Most LLM provider SDKs already give you the basics: prompts, responses, token usage, and maybe some latency stats or error traces.
If your system is just a few isolated API calls — that might be enough.
But once you’re working with agentic workflows, things get more complex.
You’ll need to track state changes, conditional routing, retries, loops, tool calls, and multi-step flows. That’s when things get messy.
Without LangSmith, you'd be stitching it together yourself: hand-rolled logging at every step, manual state dumps, and ad-hoc scripts to reconstruct what actually happened.
It’s not impossible — but it’s brittle, time-consuming, and hard to scale.
LangSmith handles all of that for you.
It gives you structured, end-to-end traces, a visual timeline of every run, searchable metadata, cost tracking, and LLM-based evaluation tools — all integrated cleanly with LangChain and LangGraph.
So no, you don’t have to use LangSmith.
But once your system gets interesting, it’s hard to imagine not using it.
You've now seen what LangSmith brings to the table: full visibility into how your agentic system behaves, step by step, with zero guesswork.
From here on out, LangSmith will be enabled by default in all your projects.
Every LLM call, tool invocation, node transition, and state update will be fully traceable.
You’re not just building AI systems.
You’re building inspectable, debuggable, production-ready ones.
⬅️ Previous - LangGraph Writer-Critic Loop
➡️ Next - Power of Tools