Open Source LLM Observability: Tracing AI Calls with AgentGateway and Langfuse
Introduction
You’ve got your AI gateway routing LLM traffic. But can you actually see what’s happening? Which models are being called, how many tokens are being consumed, what prompts are going in and what completions are coming back?
This guide shows you how to integrate Langfuse with Solo AgentGateway to get full observability over every LLM request, without touching your application code. We’ll go from zero to traced requests on a local kind cluster in under 10 minutes.
Why This Matters
AgentGateway already captures rich telemetry about every LLM request: model, tokens, latency, route, and security policy actions. By forwarding those traces to Langfuse, you get:
- Full prompt and completion visibility across all LLM providers (OpenAI, Anthropic, xAI, etc.)
- Token usage and cost tracking per model, route, and user
- Latency analysis with gateway-level metadata: which route, which backend, which policy fired
- Zero application changes: tracing happens at the gateway layer, not in your app
This is the power of observability at the infrastructure layer. Your developers don’t need to instrument anything. Every LLM call that flows through the gateway is automatically captured.
Architecture
Here’s what we’re building:
┌──────────────┐     ┌────────────────────────┐     ┌─────────────────┐
│  Your App /  │     │   Solo AgentGateway    │     │  LLM Provider   │
│   AI Agent   │────▶│     (Gateway API)      │────▶│  (OpenAI, etc)  │
└──────────────┘     └───────────┬────────────┘     └─────────────────┘
                                 │
                        OTLP Traces (gRPC)
                                 │
                     ┌───────────▼────────────┐
                     │     OpenTelemetry      │
                     │       Collector        │
                     └─────┬──────────┬───────┘
                           │          │
                       OTLP HTTP  OTLP gRPC
                           │          │
                    ┌──────▼───┐  ┌───▼─────────────┐
                    │ Langfuse │  │ Other backends  │
                    │    UI    │  │ (Jaeger, etc.)  │
                    └──────────┘  └─────────────────┘
AgentGateway natively emits OpenTelemetry traces for every LLM request. A lightweight OTel Collector receives those traces and forwards them to Langfuse via OTLP HTTP. The collector can also fan out to additional backends like Jaeger, Datadog, or ClickHouse if you need traces in multiple places.
Prerequisites
- Docker installed and running
- kind installed
- kubectl installed
- helm installed
- A Langfuse account (free tier works) or self-hosted instance
- An OpenAI API key (or any supported LLM provider)
Step 1: Create a Kind Cluster
kind create cluster --name agentgateway
kubectl cluster-info --context kind-agentgateway
Step 2: Install the Gateway API CRDs
AgentGateway uses the standard Kubernetes Gateway API for configuration:
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.4.0/standard-install.yaml
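You can confirm the CRDs registered before moving on:
kubectl get crd gateways.gateway.networking.k8s.io httproutes.gateway.networking.k8s.io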
Step 3: Install AgentGateway
Install the CRDs and control plane:
helm upgrade -i agentgateway-crds oci://cr.agentgateway.dev/helm/agentgateway-crds \
--version 2.1.0 \
--namespace agentgateway-system \
--create-namespace
helm upgrade -i agentgateway oci://cr.agentgateway.dev/helm/agentgateway \
--version 2.1.0 \
--namespace agentgateway-system
Verify everything is running:
kubectl get pods -n agentgateway-system
kubectl get gatewayclass agentgateway
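If everything is healthy, the control plane pod is Running and the gatewayclass reports ACCEPTED True. The pod listing looks roughly like this (names and ages will differ):
NAME                            READY   STATUS    RESTARTS   AGE
agentgateway-6d9f7c5b8-x2m4q    1/1     Running   0          60s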
Step 4: Get Your Langfuse Credentials
Log in to Langfuse and go to Settings → API Keys. Create a new key pair (or use an existing one). Base64 encode them for the collector config:
echo -n "pk-lf-YOUR_PUBLIC_KEY:sk-lf-YOUR_SECRET_KEY" | base64
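The -n flag matters here: a trailing newline inside the encoded string will break authentication. If you script this, something like the following keeps the output on one line regardless of platform (the keys shown are placeholders):
LANGFUSE_AUTH=$(echo -n "pk-lf-YOUR_PUBLIC_KEY:sk-lf-YOUR_SECRET_KEY" | base64 | tr -d '\n')
echo "Authorization: Basic $LANGFUSE_AUTH"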
Step 5: Deploy the OTel Collector
The collector bridges AgentGateway (OTLP gRPC) and Langfuse (OTLP HTTP). Create langfuse-collector.yaml, substituting the Base64 string from Step 4 for <YOUR_BASE64_CREDENTIALS>; if you use a different Langfuse Cloud region or a self-hosted instance, adjust the exporter endpoint to match:
apiVersion: v1
kind: ConfigMap
metadata:
name: langfuse-otel-collector-config
namespace: agentgateway-system
data:
config.yaml: |
receivers:
otlp:
protocols:
grpc:
endpoint: 0.0.0.0:4317
http:
endpoint: 0.0.0.0:4318
exporters:
otlphttp/langfuse:
endpoint: https://cloud.langfuse.com/api/public/otel
headers:
Authorization: "Basic <YOUR_BASE64_CREDENTIALS>"
retry_on_failure:
enabled: true
initial_interval: 5s
max_interval: 30s
max_elapsed_time: 300s
processors:
batch:
send_batch_size: 1000
timeout: 5s
service:
pipelines:
traces:
receivers: [otlp]
processors: [batch]
exporters: [otlphttp/langfuse]
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: langfuse-otel-collector
namespace: agentgateway-system
labels:
app: langfuse-otel-collector
spec:
replicas: 1
selector:
matchLabels:
app: langfuse-otel-collector
template:
metadata:
labels:
app: langfuse-otel-collector
spec:
containers:
- name: otel-collector
image: otel/opentelemetry-collector-contrib:0.132.1
args: ["--config=/conf/config.yaml"]
ports:
- containerPort: 4317
name: otlp-grpc
- containerPort: 4318
name: otlp-http
volumeMounts:
- name: config
mountPath: /conf
resources:
requests:
cpu: 50m
memory: 128Mi
limits:
cpu: 200m
memory: 256Mi
volumes:
- name: config
configMap:
name: langfuse-otel-collector-config
---
apiVersion: v1
kind: Service
metadata:
name: langfuse-otel-collector
namespace: agentgateway-system
spec:
selector:
app: langfuse-otel-collector
ports:
- name: otlp-grpc
port: 4317
targetPort: 4317
- name: otlp-http
port: 4318
targetPort: 4318
Deploy it:
kubectl apply -f langfuse-collector.yaml
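Before moving on, make sure the collector started and parsed its config (a bad YAML indent or malformed header shows up immediately in the logs):
kubectl rollout status deployment/langfuse-otel-collector -n agentgateway-system
kubectl logs -n agentgateway-system -l app=langfuse-otel-collector --tail=20
If you'd rather keep credentials out of the ConfigMap, the collector also supports ${env:VAR_NAME} substitution in its config, so the Authorization header value can come from a Kubernetes Secret exposed as an environment variable on the container.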
Step 6: Enable Tracing on AgentGateway
For AgentGateway OSS, configure tracing via Helm values. Create values-tracing.yaml:
gateway:
envs:
OTEL_EXPORTER_OTLP_ENDPOINT: "http://langfuse-otel-collector.agentgateway-system.svc.cluster.local:4317"
OTEL_EXPORTER_OTLP_PROTOCOL: "grpc"
Upgrade the installation:
helm upgrade agentgateway oci://cr.agentgateway.dev/helm/agentgateway \
--version 2.1.0 \
--namespace agentgateway-system \
-f values-tracing.yaml
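Confirm the override landed:
helm get values agentgateway -n agentgateway-system
The env vars only affect proxy pods created after the upgrade; if you already had a Gateway deployed, restart its proxy so it picks them up.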
Using AgentGateway Enterprise? Instead of environment variables, create an EnterpriseAgentgatewayParameters resource with full control over field mappings:
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayParameters
metadata:
name: tracing
namespace: agentgateway-system
spec:
rawConfig:
config:
tracing:
otlpEndpoint: grpc://langfuse-otel-collector.agentgateway-system.svc.cluster.local:4317
otlpProtocol: grpc
randomSampling: true
fields:
add:
gen_ai.operation.name: '"chat"'
gen_ai.system: "llm.provider"
gen_ai.request.model: "llm.requestModel"
gen_ai.response.model: "llm.responseModel"
gen_ai.usage.prompt_tokens: "llm.inputTokens"
gen_ai.usage.completion_tokens: "llm.outputTokens"
gen_ai.usage.total_tokens: "llm.totalTokens"
gen_ai.request.temperature: "llm.params.temperature"
gen_ai.prompt: "llm.prompt"
gen_ai.completion: "llm.completion"
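Apply it like any other resource (the file name is arbitrary):
kubectl apply -f enterprise-tracing.yaml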
Step 7: Create a Gateway and LLM Route
Create the Gateway resource and an OpenAI route:
# gateway.yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
name: ai-gateway
namespace: agentgateway-system
spec:
gatewayClassName: agentgateway
listeners:
- name: llm
port: 8080
    protocol: HTTP
allowedRoutes:
namespaces:
from: Same
# openai-route.yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
name: openai
namespace: agentgateway-system
spec:
parentRefs:
- name: ai-gateway
rules:
- matches:
- path:
type: PathPrefix
value: /openai
backendRefs:
- group: agentgateway.dev
kind: AgentgatewayBackend
name: openai
---
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
name: openai
namespace: agentgateway-system
spec:
type: llm
llm:
provider:
openai:
authToken:
secretRef:
name: openai-api-key
namespace: agentgateway-system
Create the secret and apply everything:
kubectl create secret generic openai-api-key \
-n agentgateway-system \
--from-literal=Authorization="Bearer $OPENAI_API_KEY"
kubectl apply -f gateway.yaml
kubectl apply -f openai-route.yaml
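Give the control plane a few seconds, then confirm the Gateway is programmed and the route was accepted:
kubectl get gateway ai-gateway -n agentgateway-system
kubectl get httproute openai -n agentgateway-system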
Step 8: Test It
Port-forward the gateway and send a request:
kubectl port-forward -n agentgateway-system svc/ai-gateway 8080:8080 &
curl -X POST http://localhost:8080/openai/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4.1-mini",
"messages": [{"role": "user", "content": "Hello from AgentGateway!"}]
}'
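If the route and backend are wired up correctly, you get a standard OpenAI chat completion back. Abbreviated, it looks something like this (IDs, content, and token counts will differ):
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gpt-4.1-mini-2025-04-14",
  "choices": [
    {"index": 0, "message": {"role": "assistant", "content": "Hello! How can I help?"}, "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 13, "completion_tokens": 9, "total_tokens": 22}
}
The usage block is the same data that lands in the gen_ai.usage.* attributes on the trace.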
Step 9: View Traces in Langfuse
Open your Langfuse UI and navigate to Traces. You should see a new trace with the full picture:
- Model and provider: which model actually served the request
- Token counts: input, output, and total tokens consumed
- Full prompt and completion: the exact content sent and received
- Gateway metadata: route name, backend endpoint, listener
What Gets Captured
AgentGateway follows the OpenTelemetry GenAI semantic conventions, so every trace includes structured attributes:
| Attribute | What It Tells You |
|---|---|
| gen_ai.system | LLM provider (openai, anthropic, etc.) |
| gen_ai.request.model | The model you asked for |
| gen_ai.response.model | The model that actually responded |
| gen_ai.usage.prompt_tokens | Input tokens consumed |
| gen_ai.usage.completion_tokens | Output tokens generated |
| gen_ai.prompt | Full prompt content |
| gen_ai.completion | Full completion content |
| gateway | AgentGateway resource name |
| route | HTTPRoute that matched |
| endpoint | Backend LLM endpoint |
Going Further
Fan-Out to Multiple Backends
The OTel Collector makes it easy to send traces to Langfuse and another backend simultaneously:
exporters:
otlphttp/langfuse:
endpoint: https://cloud.langfuse.com/api/public/otel
headers:
Authorization: "Basic <CREDENTIALS>"
otlp/jaeger:
endpoint: jaeger-collector:4317
tls:
insecure: true
service:
pipelines:
traces:
receivers: [otlp]
processors: [batch]
exporters: [otlphttp/langfuse, otlp/jaeger]
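Each exporter in the pipeline delivers independently: the otlp and otlphttp exporters have their own sending queues and retry loops by default, so a slow or unreachable Jaeger won't hold back deliveries to Langfuse.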
MCP Tool Tracing
AgentGateway also traces MCP (Model Context Protocol) traffic. When an agent discovers or invokes tools through an MCP server proxied by AgentGateway, you’ll see tool discovery requests, execution calls with parameters and results, and backend server latency, all as spans within the same trace.
Security Policy Visibility
When AgentGateway’s security policies fire (PII protection, prompt injection detection, credential leak prevention), the trace metadata shows which policy triggered, what action was taken (block, mask, allow), and what pattern matched. You get observability into both your LLM interactions and your security guardrails in one place.
Troubleshooting
No traces appearing?
- Check the collector is running: kubectl get pods -n agentgateway-system -l app=langfuse-otel-collector
- Check collector logs for errors: kubectl logs -n agentgateway-system -l app=langfuse-otel-collector
- Wrong API keys show up as 401 errors in the collector logs
- Make sure proxies were restarted after configuring tracing
Traces visible in gateway logs but not Langfuse?
- Langfuse only supports OTLP HTTP, not gRPC; that's why the collector is needed for protocol conversion
- Verify the exporter endpoint URL includes /api/public/otel
- Double-check the Base64 credentials format: base64(public_key:secret_key)
Incomplete trace data?
- For Enterprise, ensure fields.add includes the GenAI attribute mappings
- Set randomSampling: true to capture all requests during testing
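Still stuck?
A quick way to see whether spans are reaching the collector at all is to temporarily add the collector's debug exporter alongside the Langfuse one and watch the collector's stdout (a sketch; merge it into the config from Step 5):
exporters:
  debug:
    verbosity: detailed
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/langfuse, debug]
Every span the collector receives is then printed to its logs, which kubectl logs picks up.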
Cleanup
kind delete cluster --name agentgateway