← All articles

First Steps: agentgateway, F5 AI Guardrails, and the Enterprise UI

Share

The previous post covered the new hard spend limits in Enterprise agentgateway v2026.6.3: model cost catalogs, dollar or token budgets, and a real 429 when a budget is exhausted. That solves the FinOps side of AI traffic. The next question is the one security teams ask immediately after: what prevents a prompt, response, or agent workflow from leaking something it should not?

This is the first practical setup I use for that conversation:

  • agentgateway remains the AI data plane: one OpenAI-compatible front door, routes, backends, enterprise policies, traces, and cost/token metadata.
  • F5 AI Guardrails is the AI security decision point: scanners, redaction, blocking, and audit history.
  • Solo Enterprise UI for agentgateway gives the platform view: routes, destinations, policies, playground access, and traces from the gateway.

The goal is not just to return the right HTTP status code. The goal is to make the setup inspectable: security sees the guardrail decision in F5, platform sees the gateway route and policy in the agentgateway UI, and application teams keep calling one OpenAI-compatible endpoint.

The shape of the demo

I deploy two F5 integration patterns side by side:

RoutePatternWhat happens
/option-aagentgateway in front of F5 inline Guardrailsagentgateway forwards to F5’s OpenAI-compatible /openai/{provider}/chat/completions endpoint. F5 scans and makes the final provider call.
/option-cagentgateway with out-of-band F5 ScanAPIagentgateway calls OpenAI directly, but request and response promptGuard webhooks call a small adapter that sends text to F5 ScanAPI.

I also add three native agentgateway Enterprise policy routes:

RoutePurpose
/agw/directdirect response generated by the gateway
/agw/corsCORS and response header policy
/agw/rate-limitlocal rate limiting before provider/backend traffic

That gives the demo two useful proofs at once: F5 is enforcing AI security policy, and agentgateway Enterprise is enforcing gateway-native traffic policy.

Step 1: install Enterprise agentgateway

The demo runs on a disposable kind cluster and installs Enterprise agentgateway v2026.6.3.

The important environment variables are:

AGENTGATEWAY_LICENSE_KEY='...'
OPENAI_API_KEY='sk-...'

F5_AISEC_URL='https://www.us2.calypsoai.app'
F5_AISEC_TOKEN='...'
F5_AISEC_INLINE_PROVIDER='genai-azure-openai'
CAI_PROJECT='Global-...'

OPTION_A_MODEL='gpt-4.1'
OPTION_C_MODEL='gpt-5.5'

I keep these in .env, which is ignored by Git. The F5 token is used in two places: setup-time scanner creation and runtime calls from the in-cluster adapter.

The gateway itself is standard Kubernetes Gateway API:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: agentgateway-proxy
  namespace: agentgateway-system
spec:
  gatewayClassName: enterprise-agentgateway
  listeners:
    - name: http
      protocol: HTTP
      port: 80
      allowedRoutes:
        namespaces:
          from: All

Everything else attaches to that gateway: the F5 inline backend, the direct OpenAI backend, the promptGuard policy, the UI tracing policy, and the native Enterprise policy demos.

Step 2: configure F5 AI Guardrails

The setup script creates a practical scanner set in F5. I started with two simple controls and then expanded it so the demo behaves more like an actual security review:

ScannerTypeModeDirection
agw-lab-keyword-codenameKeyword, project-titanBlockPrompts and responses
agw-lab-regex-ssnRegexRedactPrompts
agw-lab-regex-emailRegexRedactPrompts
agw-lab-regex-phoneRegexRedactPrompts
agw-lab-regex-api-keyRegexRedactPrompts
agw-lab-regex-jwtRegexRedactPrompts
agw-lab-regex-private-keyRegexRedactPrompts
agw-lab-keyword-prompt-injectionKeywordBlockPrompts
agw-lab-keyword-secret-exfiltrationKeywordBlockPrompts
agw-lab-regex-codename-obfuscationRegexBlockPrompts and responses

Here is that scanner set in the F5 UI:

F5 AI Guardrails custom guardrails list showing the agentgateway lab scanners for codename blocking, prompt-injection blocking, secret-exfiltration blocking, and regex redaction for private keys, JWTs, API keys, phone numbers, emails, and SSNs.

The setup script validates F5 access, resolves the project, confirms the inline provider exists, creates or reuses the scanners, attaches them to the project, and then runs quick ScanAPI checks:

./setup-guardrails.sh

For production, the direction column matters. In this demo, PII redactors are prompt-side controls and the codename controls run both ways. If you need PII redaction on model output as well, make those scanners direction: "both" and rerun the setup before you call the deployment production-ready.

Step 3: Option A, F5 inline behind agentgateway

Option A is the fastest path because F5 already exposes an OpenAI-compatible endpoint. agentgateway treats that endpoint like a custom OpenAI provider.

apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: f5-guardrails-inline
  namespace: agentgateway-system
spec:
  ai:
    provider:
      openai:
        model: "__OPTION_A_MODEL__"
      host: "__F5_AISEC_HOST__"
      port: 443
      path: "/openai/__F5_AISEC_INLINE_PROVIDER__/chat/completions"
  policies:
    auth:
      secretRef:
        name: calypsoai-token
    tls:
      sni: "__F5_AISEC_HOST__"

The app calls:

POST /option-a

agentgateway receives the OpenAI Chat Completions request, applies its route and backend policy, and forwards to F5. F5 scans the prompt and response and owns the final provider hop.

Use this pattern when you want a fast proof that the products work together and the security team is comfortable owning the final provider connection in F5.

Step 4: Option C, F5 ScanAPI as a promptGuard webhook

Option C keeps agentgateway as the only inference path. F5 does not proxy the LLM request. It only renders a verdict through ScanAPI.

The agentgateway policy targets the /option-c HTTPRoute:

apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: f5-guardrails
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: option-c
  backend:
    ai:
      promptGuard:
        request:
          - webhook:
              backendRef:
                kind: Service
                name: f5-guardrails-adapter
                port: 8000
              failureMode: FailClosed
            response:
              message: "Blocked by F5 AI Guardrails"
              statusCode: 403
        response:
          - webhook:
              backendRef:
                kind: Service
                name: f5-guardrails-adapter
                port: 8000
              failureMode: FailClosed

The adapter is intentionally small. It receives the webhook body, extracts the prompt or assistant response, calls:

POST /backend/v1/scans

with:

{
  "input": "text to scan",
  "project": "Global-...",
  "scanDirection": "request",
  "flagOnly": false,
  "verbose": true
}

Then it maps F5 outcomes back to agentgateway actions:

F5 outcomeRequest webhookResponse webhook
clearpasspass
blocked / flagged / rejectedreject with 403mask assistant content
redactedInput returnedreplace the last user messagereplace assistant content
ScanAPI errorfail closed with 503fail closed

This is the shape I prefer for production. agentgateway keeps routing, failover, budgets, provider credentials, and traces. F5 keeps scanner policy, redaction decisions, and audit evidence.

Step 5: install the Solo Enterprise UI

This is the part that makes the demo much easier to explain. The UI install is not an afterthought; it is part of the deployment.

I install the management chart at 0.4.7 with the agentgateway product enabled:

helm upgrade -i management \
  oci://us-docker.pkg.dev/solo-public/solo-enterprise-helm/charts/management \
  --namespace agentgateway-system \
  --create-namespace \
  --version 0.4.7 \
  --set cluster="mgmt-cluster" \
  --set products.agentgateway.enabled=true \
  --set-string licensing.licenseKey="${AGENTGATEWAY_LICENSE_KEY}"

For a demo, I leave SOLO_UI_OIDC_ISSUER empty so the chart’s built-in auto-auth path is used. For a real environment, wire it to your IdP and provide the backend client secret through a Kubernetes Secret.

The install gives me a solo-enterprise-ui service:

kubectl get svc -n agentgateway-system solo-enterprise-ui

The two useful local forwards are:

kubectl port-forward -n agentgateway-system svc/agentgateway-proxy 8080:80
kubectl port-forward -n agentgateway-system svc/solo-enterprise-ui 8090:80

Then open:

open http://localhost:8090

Step 6: turn on agentgateway traces for the UI

The UI becomes useful for traffic analysis when agentgateway emits OTLP traces to the management telemetry collector.

That is a small Enterprise policy:

apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: tracing
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: agentgateway-proxy
  frontend:
    tracing:
      backendRef:
        name: solo-enterprise-telemetry-collector
        namespace: agentgateway-system
        kind: Service
        port: 4317
      randomSampling: "true"

Verify that the policy attached:

kubectl get enterpriseagentgatewaypolicy tracing \
  -n agentgateway-system \
  -o yaml

After traffic runs, the proxy logs include trace.id and span.id fields on requests. Those are the breadcrumbs that connect gateway behavior to UI trace views.

This is the policy inventory in the agentgateway UI:

Solo Enterprise for agentgateway policies view showing active EnterpriseAgentgateway policies for CORS and headers, direct response, local rate limit, F5 guardrails, and tracing.

And this is the destination inventory. The two AI backends are exactly the two patterns from the architecture: F5 inline and OpenAI direct.

Solo Enterprise for agentgateway destinations view showing the f5-guardrails-inline and openai-direct AI destinations in the agentgateway-system namespace.

Step 7: use the playground as a sanity check

The UI sees the routes from the cluster:

Solo Enterprise for agentgateway playground route selection showing agw-cors, agw-direct, agw-rate-limit, option-a, and option-c routes.

That is useful in a demo because you can explain the whole deployment without starting in YAML:

  • /option-a is the inline F5 path.
  • /option-c is the out-of-band ScanAPI path.
  • /agw/direct, /agw/cors, and /agw/rate-limit are native agentgateway Enterprise policy examples.

The UI is not a replacement for kubectl logs for raw pod stdout. Treat it as the topology and request-visibility layer: routes, destinations, policies, playground, and traces. Use Kubernetes logs for adapter exceptions and pod startup messages.

Step 8: prove enforcement with traffic

The smoke test sends six OpenAI Chat Completions requests:

./test.sh

The passing run:

Terminal output showing the agentgateway plus F5 smoke test passing: benign requests return 200, codename blocks return 400 or 403, SSN redaction succeeds, and response-phase scanning masks blocked output.

Text version:

PASS Option A benign: HTTP 200
PASS Option C benign: HTTP 200
PASS Option A blocked codename: HTTP 400
PASS Option C blocked codename: HTTP 403
PASS Option C SSN redaction request completed: HTTP 200
PASS Option C redaction did not leak raw SSN
PASS Option C response-phase scan completed: HTTP 200
PASS Option C response-phase scanner masked blocked output

That proves the data plane behavior:

  • benign traffic reaches the model
  • project-titan is blocked
  • SSN-shaped prompt content is redacted before forwarding
  • response content that trips the response scanner is masked before the client sees it

Step 9: confirm the F5 audit trail

The same traffic shows up in F5 under Logs -> Prompt history:

F5 AI Guardrails prompt history showing blocked project-titan scans, redacted SSN scans, and the inline Genai Azure Openai prompt path from the agentgateway demo.

This is the evidence split I want in the operating model:

  • agentgateway proves which route, policy, backend, status code, model, token count, cost, trace ID, and latency were involved.
  • F5 proves which scanner matched, whether content was blocked or redacted, who initiated it, and when the decision happened.

Those are different audit questions. Do not force one product to answer both.

The deployment checklist

For a fresh lab, the sequence is:

cp .env.example .env
# fill in AGENTGATEWAY_LICENSE_KEY, OPENAI_API_KEY, F5_AISEC_URL, F5_AISEC_TOKEN

./setup-guardrails.sh
./deploy.sh

kubectl port-forward -n agentgateway-system svc/agentgateway-proxy 8080:80
kubectl port-forward -n agentgateway-system svc/solo-enterprise-ui 8090:80

./test.sh
./test_agentgateway.sh
HARNESS_CASES=harness/intense-cases.yaml ./run_harness.sh

The checks I care about before showing this to anyone:

helm list -n agentgateway-system
kubectl get pods,svc -n agentgateway-system
kubectl get enterpriseagentgatewaypolicy -n agentgateway-system
kubectl logs -n agentgateway-system deploy/agentgateway-proxy --tail=80
kubectl logs -n agentgateway-system deploy/f5-guardrails-adapter --tail=80

The healthy state should include:

  • enterprise-agentgateway chart at v2026.6.3
  • management chart at 0.4.7
  • solo-enterprise-ui service present
  • solo-enterprise-telemetry-collector running
  • tracing policy accepted and attached
  • agentgateway request logs with trace.id and span.id

What this adds on top of hard spend limits

The budget article showed that agentgateway can stop runaway cost at the gateway. This setup adds the security controls around the same traffic:

  • budgets answer how much can this team spend?
  • guardrails answer is this prompt or response allowed?
  • routes and policies answer where is this traffic allowed to go?
  • traces answer what happened on this request?

The key is that these controls are attached to the gateway, not hand-coded in every application. Apps keep using normal OpenAI Chat Completions calls. Platform and security teams govern the path.

What I would harden next

For a production-grade rollout, I would tighten four things:

  1. Change PII redactors that must protect model output to direction: "both".
  2. Put the adapter behind real service-level observability and alert on fail-closed 503s.
  3. Wire the Enterprise UI to the corporate IdP instead of demo auto-auth.
  4. Combine this with EnterpriseAgentgatewayBudget so unsafe traffic and runaway spend are both blocked at the same front door.

That is the platform story: one gateway, separate controls, clear ownership, and enough visibility that you can prove what happened after the fact.

References