Welcome to The Watch Tower
See every decision your AI makes in production: spot slow spans, find errors, track cost.
Climb the Tower
The Silent Failure
Without observability, your AI can fail quietly: users get bad answers and you're the last to know.
Trace Viewer: See Every Step
Each AI response is a waterfall of spans. Click a span to inspect it.
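A trace viewer's waterfall can be sketched in a few lines of plain Python. The span names and timings below are invented for illustration; a real tool renders this from actual instrumentation data.

```python
from dataclasses import dataclass

# Hypothetical spans for one AI response (illustrative, not real Langfuse output)
@dataclass
class Span:
    name: str
    start_ms: int      # offset from the start of the trace
    duration_ms: int

trace = [
    Span("retrieve_context", 0, 120),
    Span("llm_call", 130, 1850),
    Span("format_response", 1990, 40),
]

def render_waterfall(spans, scale=100):
    """Print each span as an offset bar, like a trace viewer's waterfall."""
    lines = []
    for s in spans:
        pad = " " * (s.start_ms // scale)          # indent by start offset
        bar = "#" * max(1, s.duration_ms // scale) # bar length ~ duration
        lines.append(f"{s.name:<18}{pad}{bar} {s.duration_ms}ms")
    return "\n".join(lines)

print(render_waterfall(trace))
```

The longest bar jumps out immediately, which is exactly why the waterfall view beats reading raw logs.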
Cost Tracker: Know Your Spend
Token usage adds up fast. Observability tools give you real-time cost dashboards.
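A cost dashboard boils down to tokens times price. A minimal sketch follows; the per-1K-token prices are placeholders, not real provider rates, so check your provider's price sheet.

```python
# Placeholder per-1K-token prices (NOT current OpenAI pricing)
PRICES_PER_1K = {"input": 0.005, "output": 0.015}

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of one call from its token counts."""
    return round(
        input_tokens / 1000 * PRICES_PER_1K["input"]
        + output_tokens / 1000 * PRICES_PER_1K["output"],
        6,
    )

# 2,000 prompt tokens + 500 completion tokens
print(estimate_cost(2000, 500))  # 0.0175
```

At a few cents per call, a chat feature serving 100,000 requests a day is a four-figure monthly bill, which is why real-time dashboards matter.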
Mission: Find the Bottleneck Span
A user complained that the app felt slow. Look at this trace and click the bottleneck span!
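What the mission asks you to do by eye, code can do with `max()`: the bottleneck is simply the span with the largest duration. The span names and numbers here are made up for illustration.

```python
# Toy trace: (span_name, duration_ms) pairs -- illustrative numbers only
spans = [
    ("parse_query", 35),
    ("vector_search", 240),
    ("llm_call", 2900),
    ("render_answer", 20),
]

def bottleneck(spans):
    """Return the span with the largest duration."""
    return max(spans, key=lambda s: s[1])

name, ms = bottleneck(spans)
print(f"Bottleneck: {name} ({ms}ms)")  # Bottleneck: llm_call (2900ms)
```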
Add observability to your stack
Four steps from blind to fully instrumented.
- 1
Install Langfuse
Self-host or use the cloud version. Integrates with LangChain, OpenAI SDK, and more.
pip install langfuse
- 2
Wrap your AI calls
A single drop-in import traces your existing OpenAI calls; decorators and context managers are available for custom code.
from langfuse.openai import openai  # That's it! Your calls are now traced

response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)
- 3
View traces & costs
Open the Langfuse dashboard to see every call, latency, and token spend.
# In your Langfuse dashboard:
# Traces: filter by session, user, or date
# Cost: daily/weekly token spend
# Latency: p50 and p95 per endpoint
- 4
Set up alerts
Get notified when latency spikes, error rate rises, or cost exceeds a threshold.
# Via Langfuse API or webhook:
{
  "metric": "p95_latency",
  "threshold_ms": 2000,
  "alert_email": "oncall@yourdomain.com"
}
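The p95 metric from the dashboard step and the alert rule above fit together: compute p95 over recent request latencies, then compare it to the threshold. A minimal nearest-rank percentile sketch with invented latencies:

```python
# Illustrative latencies (ms) for recent requests -- made-up numbers
latencies = [420, 510, 380, 2600, 450, 470, 3100, 440, 460, 430]

def percentile(values, pct):
    """Nearest-rank percentile: p50 is the median, p95 the slow tail."""
    ordered = sorted(values)
    rank = max(1, round(pct / 100 * len(ordered)))
    return ordered[rank - 1]

# Same shape as the alert rule above
ALERT = {"metric": "p95_latency", "threshold_ms": 2000}

p50 = percentile(latencies, 50)
p95 = percentile(latencies, 95)
print(f"p50={p50}ms p95={p95}ms")
if p95 > ALERT["threshold_ms"]:
    print(f"ALERT: {ALERT['metric']} {p95}ms exceeds {ALERT['threshold_ms']}ms")
```

Note how a healthy p50 can hide an awful p95: most users are fine, but the slow tail is what triggers complaints, which is why alerts watch p95 rather than the average.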
Chat with the Observer
Questions about tracing, Langfuse, cost optimization, or production issues?
Watch Master Certified!
You can now trace, monitor, and optimize AI systems in production. Next comes the Guardrails Gate, where you learn how to stop bad prompts, risky tool calls, and unsafe outputs.
Trace Dashboard
Deliverable: Log prompt, tool call, latency, and token usage for each run.
Stretch: Add one alert for latency spikes.
Complete the deliverable first, then unlock the stretch goal.
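One way to approach the deliverable: record one structured entry per run with the four required fields. This is a self-contained sketch with a hypothetical schema and a `fake_model` stand-in, not Langfuse's actual log format.

```python
import json
import time

def log_run(log, prompt, tool_call, response_fn):
    """Time one run and append a structured record (hypothetical schema)."""
    start = time.perf_counter()
    output, tokens = response_fn(prompt)
    record = {
        "prompt": prompt,
        "tool_call": tool_call,
        "latency_ms": round((time.perf_counter() - start) * 1000, 1),
        "token_usage": tokens,
        "output": output,
    }
    log.append(record)
    return record

def fake_model(prompt):
    # Stand-in for a real LLM call; returns (output, token_usage)
    return f"answer to: {prompt}", {"input": len(prompt.split()), "output": 4}

runs = []
log_run(runs, "What is a span?", "search_docs", fake_model)
print(json.dumps(runs[0], indent=2))
```

For the stretch goal, a latency-spike alert is the same check as the p95 rule above: scan the `latency_ms` field of recent records and notify when it crosses your threshold.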