14 Feb 2026

Open Source LLM Observability: Tracing AI Calls with AgentGateway and Langfuse

Your AI agents are calling LLMs hundreds of times a day. Do you know what they’re sending? What they’re spending? Whether that prompt injection guard actually fired?

This guide shows how to wire Langfuse into AgentGateway to get full observability over every LLM and MCP tool call, with zero application code changes required.


Why Gateway-Level Observability?

Most teams add tracing inside their application code, wrapping LLM SDK calls with Langfuse decorators or OpenTelemetry spans. This works, but it has gaps:

  • You only see what the app reports. If a developer forgets to instrument a call, it’s invisible.
  • Gateway-level policies are opaque. PII redaction, prompt injection blocking, rate limiting β€” these happen at the gateway. Application-level tracing can’t see them.
  • Multiple agents, multiple codebases. Every team has to add their own instrumentation. Different languages, different quality.

When tracing happens at the gateway layer, every request is captured automatically. Every agent, every provider, every tool call: same fidelity, same format, zero code changes.
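In practice, the only app-side change is pointing the SDK at the gateway route instead of the provider. A minimal sketch (the host name and route prefix are illustrative placeholders; match them to your own Gateway and HTTPRoute):

```python
# Sketch only: the application keeps its normal OpenAI-compatible SDK calls
# and just swaps the base URL for the gateway route. The host and prefix
# below are illustrative, not fixed names.

def gateway_base_url(gateway_host: str, route_prefix: str) -> str:
    """Build the OpenAI-compatible base URL exposed by a gateway route."""
    return f"http://{gateway_host}/{route_prefix.strip('/')}/v1"

# An OpenAI-compatible client would then be constructed as, e.g.:
#   client = OpenAI(base_url=gateway_base_url("ai-gateway:8080", "/openai"))
# No tracing decorators, no OTel spans -- the gateway records everything.

print(gateway_base_url("localhost:8080", "/openai"))
# -> http://localhost:8080/openai/v1
```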


Architecture

┌──────────────┐     ┌─────────────────────────┐     ┌─────────────────┐
│ Your App /   │     │   Solo AgentGateway     │     │   LLM Provider  │
│ AI Agent     │────▶│   (Gateway API)         │────▶│   (OpenAI, etc) │
│              │     │                         │     │                 │
└──────────────┘     └───────────┬─────────────┘     └─────────────────┘
                                 │
                          OTLP Traces (gRPC)
                                 │
                     ┌───────────▼─────────────┐
                     │  OpenTelemetry          │
                     │  Collector (fan-out)    │
                     │                         │
                     └─────┬───────────┬───────┘
                           │           │
                    OTLP HTTP      OTLP gRPC
                           │           │
                     ┌─────▼──┐  ┌─────▼───────────┐
                     │Langfuse│  │ClickHouse /     │
                     │  UI    │  │Solo Enterprise  │
                     │        │  │UI (optional)    │
                     └────────┘  └─────────────────┘

AgentGateway natively emits OpenTelemetry traces for every LLM request. A lightweight OTel Collector receives these traces and forwards them to Langfuse via OTLP HTTP. The same collector can fan out traces to additional backends (ClickHouse, Jaeger, Datadog) simultaneously.


Quick Start: Kind Cluster + AgentGateway 2.1 OSS

Don’t have a cluster? Here’s the fastest path from zero to traced LLM calls.

Create the Cluster

kind create cluster --name agentgateway

Install Gateway API CRDs

kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.4.0/standard-install.yaml

Install AgentGateway 2.1 OSS

# Install CRDs
helm upgrade -i agentgateway-crds oci://cr.agentgateway.dev/helm/agentgateway-crds \
  --version 2.1.0 \
  --namespace agentgateway-system \
  --create-namespace

# Install control plane
helm upgrade -i agentgateway oci://cr.agentgateway.dev/helm/agentgateway \
  --version 2.1.0 \
  --namespace agentgateway-system

Verify it’s running:

kubectl get pods -n agentgateway-system
kubectl get gatewayclass agentgateway

Setting Up Langfuse Tracing

Step 1: Get Langfuse API Keys

Sign up at cloud.langfuse.com (free tier) or use a self-hosted instance. Go to Settings → API Keys and create a key pair.

Base64 encode your credentials:

echo -n "pk-lf-YOUR_PUBLIC_KEY:sk-lf-YOUR_SECRET_KEY" | base64
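The same encoding in Python, if you prefer to generate the header value programmatically (the key values are the placeholders from above):

```python
import base64

def langfuse_basic_auth(public_key: str, secret_key: str) -> str:
    """Encode a Langfuse key pair as the Basic auth value the collector needs."""
    token = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    return f"Basic {token}"

print(langfuse_basic_auth("pk-lf-YOUR_PUBLIC_KEY", "sk-lf-YOUR_SECRET_KEY"))
```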

Step 2: Deploy the OTel Collector

The collector bridges AgentGateway’s OTLP gRPC output to Langfuse’s OTLP HTTP input:

apiVersion: v1
kind: ConfigMap
metadata:
  name: langfuse-otel-collector-config
  namespace: agentgateway-system
data:
  config.yaml: |
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
    exporters:
      otlphttp/langfuse:
        endpoint: https://cloud.langfuse.com/api/public/otel
        headers:
          Authorization: "Basic <YOUR_BASE64_CREDENTIALS>"
        retry_on_failure:
          enabled: true
          initial_interval: 5s
          max_interval: 30s
          max_elapsed_time: 300s
    processors:
      batch:
        send_batch_size: 1000
        timeout: 5s
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlphttp/langfuse]    
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: langfuse-otel-collector
  namespace: agentgateway-system
  labels:
    app: langfuse-otel-collector
spec:
  replicas: 1
  selector:
    matchLabels:
      app: langfuse-otel-collector
  template:
    metadata:
      labels:
        app: langfuse-otel-collector
    spec:
      containers:
        - name: otel-collector
          image: otel/opentelemetry-collector-contrib:0.132.1
          args: ["--config=/conf/config.yaml"]
          ports:
            - containerPort: 4317
              name: otlp-grpc
            - containerPort: 4318
              name: otlp-http
          volumeMounts:
            - name: config
              mountPath: /conf
          resources:
            requests:
              cpu: 50m
              memory: 128Mi
            limits:
              cpu: 200m
              memory: 256Mi
      volumes:
        - name: config
          configMap:
            name: langfuse-otel-collector-config
---
apiVersion: v1
kind: Service
metadata:
  name: langfuse-otel-collector
  namespace: agentgateway-system
spec:
  selector:
    app: langfuse-otel-collector
  ports:
    - name: otlp-grpc
      port: 4317
      targetPort: 4317
    - name: otlp-http
      port: 4318
      targetPort: 4318

Apply the manifest:

kubectl apply -f langfuse-collector.yaml

Step 3: Configure AgentGateway Tracing

For AgentGateway OSS, enable tracing via Helm values:

# values-tracing.yaml
gateway:
  envs:
    OTEL_EXPORTER_OTLP_ENDPOINT: "http://langfuse-otel-collector.agentgateway-system.svc.cluster.local:4317"
    OTEL_EXPORTER_OTLP_PROTOCOL: "grpc"

Then upgrade the release:

helm upgrade agentgateway oci://cr.agentgateway.dev/helm/agentgateway \
  --version 2.1.0 \
  --namespace agentgateway-system \
  -f values-tracing.yaml

For AgentGateway Enterprise, use the EnterpriseAgentgatewayParameters resource instead:

apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayParameters
metadata:
  name: tracing
  namespace: agentgateway-system
spec:
  rawConfig:
    config:
      tracing:
        otlpEndpoint: grpc://langfuse-otel-collector.agentgateway-system.svc.cluster.local:4317
        otlpProtocol: grpc
        randomSampling: true
        fields:
          add:
            gen_ai.operation.name: '"chat"'
            gen_ai.system: "llm.provider"
            gen_ai.request.model: "llm.requestModel"
            gen_ai.response.model: "llm.responseModel"
            gen_ai.usage.prompt_tokens: "llm.inputTokens"
            gen_ai.usage.completion_tokens: "llm.outputTokens"
            gen_ai.usage.total_tokens: "llm.totalTokens"
            gen_ai.request.temperature: "llm.params.temperature"
            gen_ai.prompt: "llm.prompt"
            gen_ai.completion: "llm.completion"

Step 4: Create a Gateway and Route

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: ai-gateway
  namespace: agentgateway-system
spec:
  gatewayClassName: agentgateway
  listeners:
    - name: llm
      port: 8080
      protocol: HTTP
      allowedRoutes:
        namespaces:
          from: Same
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: openai
  namespace: agentgateway-system
spec:
  parentRefs:
    - name: ai-gateway
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /openai
      backendRefs:
        - group: agentgateway.dev
          kind: AgentgatewayBackend
          name: openai
---
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: openai
  namespace: agentgateway-system
spec:
  type: llm
  llm:
    provider:
      openai:
        authToken:
          secretRef:
            name: openai-api-key
            namespace: agentgateway-system

Create the secret and apply the manifests:

# Create the API key secret
kubectl create secret generic openai-api-key \
  -n agentgateway-system \
  --from-literal=Authorization="Bearer $OPENAI_API_KEY"

# Apply the gateway and route
kubectl apply -f gateway.yaml

Step 5: Test and View Traces

# Port-forward the gateway
kubectl port-forward -n agentgateway-system svc/ai-gateway 8080:8080 &

# Send a test request
curl -X POST http://localhost:8080/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1-mini",
    "messages": [{"role": "user", "content": "Hello from AgentGateway!"}]
  }'
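The same request from Python using only the standard library; the commented-out network call assumes the port-forward above is still active:

```python
import json
import urllib.request

def chat_payload(model: str, content: str) -> dict:
    """Build an OpenAI-style chat completion request body."""
    return {"model": model, "messages": [{"role": "user", "content": content}]}

def send_chat(base_url: str, payload: dict) -> bytes:
    """POST the payload to the gateway's /openai route."""
    req = urllib.request.Request(
        f"{base_url}/openai/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

# With the port-forward active, you could send it:
#   print(send_chat("http://localhost:8080",
#                   chat_payload("gpt-4.1-mini", "Hello from AgentGateway!")))
print(json.dumps(chat_payload("gpt-4.1-mini", "Hello from AgentGateway!")))
```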

Open Langfuse → Traces. You should see a trace with model, tokens, prompt, completion, and gateway metadata.


What Gets Captured

Trace Attributes (GenAI Semantic Conventions)

Attribute                       Description          Example
gen_ai.system                   LLM provider         openai
gen_ai.request.model            Requested model      gpt-4.1-mini
gen_ai.response.model           Actual model used    gpt-4o-mini-2024-07-18
gen_ai.usage.prompt_tokens      Input tokens         13
gen_ai.usage.completion_tokens  Output tokens        30
gen_ai.usage.total_tokens       Combined             43
gen_ai.prompt                   Full prompt content  [{"role":"user","content":"Hello"}]
gen_ai.completion               Full completion      Hello! How can I help you?
gen_ai.streaming                Streaming used       false
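The usage attributes are internally consistent: total equals prompt plus completion. A tiny helper you could run over exported spans as a sanity check (illustrative, not part of any SDK):

```python
def usage_consistent(attrs: dict) -> bool:
    """Check gen_ai.usage.total_tokens == prompt + completion tokens."""
    return (
        attrs["gen_ai.usage.prompt_tokens"] + attrs["gen_ai.usage.completion_tokens"]
        == attrs["gen_ai.usage.total_tokens"]
    )

# Using the example values from the table above:
print(usage_consistent({
    "gen_ai.usage.prompt_tokens": 13,
    "gen_ai.usage.completion_tokens": 30,
    "gen_ai.usage.total_tokens": 43,
}))  # -> True
```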

Gateway Metadata

Attribute  Description       Example
gateway    Gateway resource  agentgateway-system/ai-gateway
route      HTTPRoute name    agentgateway-system/openai
endpoint   Backend endpoint  api.openai.com:443
listener   Gateway listener  llm

Multi-Provider Support

AgentGateway traces all providers through the same pipeline: OpenAI, Anthropic, xAI/Grok, Azure OpenAI, Google Gemini, Ollama, and any OpenAI-compatible API. Add more routes, same observability:

/openai/*    → OpenAI GPT  → traced to Langfuse
/anthropic/* → Anthropic   → traced to Langfuse
/xai/*       → xAI Grok    → traced to Langfuse

Every provider, same trace format, same dashboard.


Fan-Out: Langfuse + Additional Backends

Send traces to multiple backends simultaneously:

exporters:
  otlphttp/langfuse:
    endpoint: https://cloud.langfuse.com/api/public/otel
    headers:
      Authorization: "Basic <CREDENTIALS>"
  otlp/jaeger:
    endpoint: jaeger-collector:4317
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/langfuse, otlp/jaeger]

Common fan-out targets: Langfuse (LLM analytics) + ClickHouse (gateway metrics) + Jaeger (distributed tracing) + Datadog (enterprise monitoring).


MCP Tool Tracing

AgentGateway doesn’t just trace LLM calls; it also traces MCP (Model Context Protocol) tool interactions:

  • Tool discovery (tools/list) β€” which tools are available, how long discovery takes
  • Tool execution (tools/call) β€” parameters, results, latency
  • Backend MCP server performance β€” per-server latency and error rates

When an agent calls Slack, GitHub, or any MCP tool server through AgentGateway, the full tool call chain appears in Langfuse alongside the LLM calls that triggered it.
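For context, a tools/call interaction is a JSON-RPC 2.0 request. A sketch of the wire shape the gateway sees (the tool name and arguments are made up for illustration):

```python
import json

def tools_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Build an MCP tools/call request body (JSON-RPC 2.0)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# Hypothetical tool name and arguments, purely for illustration:
print(tools_call_request(1, "slack_post_message", {"channel": "#alerts", "text": "hi"}))
```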


Security Policy Visibility

When AgentGateway’s security policies fire, the trace metadata includes what happened:

  • PII Protection β€” how many entities were redacted, what types (email, SSN, phone)
  • Prompt Injection β€” whether an injection was detected and blocked
  • Credential Leak β€” whether secrets were caught in the LLM response
  • Rate Limiting β€” remaining quota for the user

This means you can see not just what your agents are doing, but what guardrails are protecting them.
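Downstream, that metadata is queryable like any other span attribute. A sketch of acting on it — note the attribute keys here are hypothetical placeholders, not AgentGateway’s actual names; check a real trace for the exact keys:

```python
def guardrail_summary(attrs: dict) -> list:
    """Summarize which guardrails fired, given span attributes.

    The 'policy.*' keys below are hypothetical placeholder names used only
    to illustrate the pattern.
    """
    events = []
    if attrs.get("policy.pii.redacted_count", 0) > 0:
        events.append(f"PII: {attrs['policy.pii.redacted_count']} entities redacted")
    if attrs.get("policy.prompt_injection.blocked"):
        events.append("prompt injection blocked")
    if attrs.get("policy.credential_leak.detected"):
        events.append("credential leak caught in response")
    return events

print(guardrail_summary({"policy.pii.redacted_count": 2,
                         "policy.prompt_injection.blocked": True}))
```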


Troubleshooting

Problem                               Fix
No traces in Langfuse                 Check the collector pod is running: kubectl get pods -n agentgateway-system -l app=langfuse-otel-collector
401 errors in collector logs          Wrong Langfuse API credentials; re-check the base64 encoding
Traces show but no prompt/completion  Add the fields.add section in Enterprise, or check the OTEL env vars in OSS
Missing gateway metadata              Restart proxies after a config change: kubectl rollout restart deployment -n agentgateway-system

Full Source Code

All manifests, configs, and examples are in the agentgateway-langfuse repository:

Path                     Description
docs/quickstart-kind.md  Full kind cluster quickstart guide
examples/basic/          Basic Langfuse collector + tracing config
examples/fan-out/        Fan-out to Langfuse + additional backends
examples/argocd/         Production ArgoCD/GitOps deployment
scripts/verify.sh        End-to-end verification script

Resources