Getting Started with kgateway


What is kgateway, and Why Should You Care?

If you're building cloud-native applications, you've probably run into challenges like these:

  • Managing API traffic between services

  • Routing requests to different LLM providers (OpenAI, Claude, etc.)

  • Implementing rate limiting on expensive AI API calls

  • Adding authentication without touching application code

kgateway is a Kubernetes-native gateway that solves all of these issues with the standard Gateway API.

What makes it special?

  • Dual Control Plane: One controller handles traditional API traffic (Envoy) as well as AI/LLM-related traffic (agentgateway)

  • Kubernetes-Native: Leverages standard Gateway API resources rather than custom CRDs

  • AI First Features: Includes native support for OpenAI, Anthropic, AWS Bedrock, and custom AI backends

  • Observability: Provides native OpenTelemetry support for cost tracking and performance monitoring

In this guide, we'll build three real-world scenarios. Why these three? Because they are the most common cases you'll need kgateway for:

  1. Basic API Gateway - Route HTTP traffic to microservices

  2. AI Gateway - Proxy requests to OpenAI with rate limiting

  3. Secure AI Gateway - Add authentication and cost controls


Prerequisites

What you'll need:

  • A Kubernetes cluster (kind or minikube works fine for this guide)

  • kubectl configured to access your cluster

  • helm 3.8+ (the charts below are pulled from an OCI registry, which Helm supports by default from 3.8 onward)

  • (Optional) An OpenAI API key for AI Gateway examples
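
Before installing anything, it can help to confirm that the tools above are actually on your PATH. Here is a minimal sketch; the helper name missing_tools is made up for this guide and is not part of kubectl or helm.

```shell
# missing_tools: print the name of every listed command that is not on PATH.
# (Hypothetical helper for this guide; runs in a subshell to keep variables local.)
missing_tools() (
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
)

# Prints nothing when both tools are installed, otherwise the missing names.
missing_tools kubectl helm
```

If the command prints a tool name, install that tool before continuing.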


Simple Installation

Step 1: Install Gateway API CRDs

kgateway uses the standard Kubernetes Gateway API:

kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.2.1/standard-install.yaml

This installs the standard-channel CRDs, including Gateway, GatewayClass, and HTTPRoute, which are shared across all Gateway API implementations.


Step 2: Install kgateway

# Install kgateway CRDs
helm upgrade -i --create-namespace --namespace kgateway-system \
  kgateway-crds \
  oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds \
  --version v2.3.0

# Install kgateway controller with AI Gateway support
helm upgrade -i --namespace kgateway-system \
  kgateway \
  oci://cr.kgateway.dev/kgateway-dev/charts/kgateway \
  --version v2.3.0 \
  --set agentgateway.enabled=true

Wait for pods to be ready:

kubectl wait --for=condition=ready pod \
  -l app=kgateway \
  -n kgateway-system \
  --timeout=120s

Verify installation:

kubectl get gatewayclass

Expected output:

NAME           CONTROLLER                      ACCEPTED   AGE
kgateway       kgateway.dev/kgateway           True       30s
agentgateway   agentgateway.dev/agentgateway   True       30s

As you can see, you now have two GatewayClasses, each backed by its own controller:

  • kgateway - For traditional HTTP/HTTPS traffic (uses Envoy)

  • agentgateway - For AI/LLM traffic (uses agentgateway proxy)


Scenario 1: Traditional API Gateway

Let's say you have a microservices app and want to expose its services through a single entry point with routing rules. Here's how.

Deploy a Sample Application

Let's use httpbin as our backend service:

kubectl create namespace demo

kubectl apply -n demo -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: httpbin
spec:
  selector:
    app: httpbin
  ports:
  - port: 8000
    targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbin
  template:
    metadata:
      labels:
        app: httpbin
    spec:
      containers:
      - name: httpbin
        image: mccutchen/go-httpbin:v2.6.0
        ports:
        - containerPort: 8080
EOF

Create a Gateway

kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: api-gateway
  namespace: demo
spec:
  gatewayClassName: kgateway  # Using Envoy-based gateway
  listeners:
  - name: http
    protocol: HTTP
    port: 8080
    allowedRoutes:
      namespaces:
        from: All
EOF

What this creates:

  • A Gateway resource that kgateway watches

  • An Envoy proxy deployment (check with kubectl get pods -n demo)

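  • A LoadBalancer service (on local clusters such as kind or minikube, its external IP usually stays pending unless a load-balancer implementation is installed, which is why the test below uses port-forwarding)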
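
To confirm the Gateway is ready before adding routes, one option is to wait on its Programmed condition, which is part of the standard Gateway API status. This is a sketch wrapped in a hypothetical helper (wait_for_gateway is a name invented here):

```shell
# wait_for_gateway NAME NAMESPACE [TIMEOUT]
# Blocks until the Gateway reports Programmed=True, or the timeout expires.
wait_for_gateway() {
  kubectl wait --for=condition=Programmed "gateway/$1" -n "$2" --timeout="${3:-120s}"
}

# Usage (needs a running cluster):
#   wait_for_gateway api-gateway demo
```

If the wait times out, kubectl describe gateway api-gateway -n demo usually shows why in the status conditions.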


Create Routing Rules

Now let's route traffic to our httpbin service:

kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: httpbin-route
  namespace: demo
spec:
  parentRefs:
  - name: api-gateway
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /api
    filters:
    - type: URLRewrite
      urlRewrite:
        path:
          type: ReplacePrefixMatch
          replacePrefixMatch: /
    backendRefs:
    - name: httpbin
      port: 8000
EOF

What this does:

  • Routes requests to /api/* → httpbin service

  • Rewrites /api/get → /get before sending to the backend

  • Standard Gateway API syntax (portable across implementations!)
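
To make the rewrite concrete, here is a small shell sketch of what a PathPrefix match combined with ReplacePrefixMatch does to a request path. It only illustrates the behaviour and is not how the proxy implements it; rewrite_prefix is a name invented for this guide.

```shell
# rewrite_prefix PREFIX REPLACEMENT PATH
# Prints the rewritten path, or the original path (exit status 1) if PREFIX
# does not match on a whole path segment.
rewrite_prefix() (
  prefix=$1 replacement=$2 path=$3
  case "$path" in
    "$prefix" | "$prefix"/*) ;;           # /api matches /api and /api/x
    *) printf '%s\n' "$path"; exit 1 ;;   # no match: this rule does not apply
  esac
  rest=${path#"$prefix"}                  # "/get" for /api/get, "" for /api
  out=${replacement%/}$rest
  printf '%s\n' "${out:-/}"
)

rewrite_prefix /api / /api/get    # prints /get
rewrite_prefix /api / /api        # prints /
```

Note that the match works on whole path segments, so a request to /apiary would not be caught by the /api rule.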


Test It

# Port-forward to the gateway
# (if the Service name differs in your version, list it with: kubectl get svc -n demo)
kubectl port-forward -n demo svc/api-gateway-http 8080:8080

# In another terminal
curl http://localhost:8080/api/get

Expected response:

{
  "args": {},
  "headers": {
    "Host": "localhost:8080",
    "User-Agent": "curl/8.0.0"
  },
  "origin": "127.0.0.1",
  "url": "http://localhost:8080/get"
}

YAYYYY!!!! You've just routed traffic through kgateway.


Add Advanced Routing

Let's add header-based routing and rate limiting:

kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: advanced-route
  namespace: demo
spec:
  parentRefs:
  - name: api-gateway
  rules:
  # Rule 1: Route /premium/* for premium users
  - matches:
    - path:
        type: PathPrefix
        value: /premium
      headers:
      - name: X-User-Tier
        value: premium
    backendRefs:
    - name: httpbin
      port: 8000
  # Rule 2: Default catch-all route (the policy below rate-limits the whole HTTPRoute)
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - name: httpbin
      port: 8000
---
apiVersion: gateway.kgateway.dev/v1alpha1
kind: TrafficPolicy
metadata:
  name: rate-limit-policy
  namespace: demo
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: advanced-route
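  rateLimit:
    # Assumption: local token-bucket schema (10 requests per minute). Confirm the
    # fields for your version with:
    #   kubectl explain trafficpolicy.spec.rateLimit --recursive
    local:
      tokenBucket:
        maxTokens: 10
        tokensPerFill: 10
        fillInterval: 60s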
EOF

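Test the rate limit. Use /get rather than /api/get here: /api/get matches the more specific httpbin-route from earlier, which this policy does not target.

# The first 10 requests should each print 200
for i in {1..10}; do curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8080/get; done

# The next request within the same minute should print 429 (Too Many Requests)
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8080/get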
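
To see exactly where the limit kicks in, you can record only the HTTP status of each request and tally the results. This is a sketch that assumes the port-forward from earlier is still running; probe and tally_statuses are helper names invented here. It requests /get because that path falls under advanced-route's catch-all rule, which is the route the policy targets.

```shell
# tally_statuses: read one HTTP status code per line on stdin, print CODE=COUNT.
tally_statuses() {
  sort | uniq -c | awk '{ print $2 "=" $1 }'
}

# probe URL N: send N requests and print only each response's status code.
probe() (
  i=0
  while [ "$i" -lt "$2" ]; do
    curl -s -o /dev/null -w '%{http_code}\n' "$1"
    i=$((i + 1))
  done
)

# Usage (needs the gateway reachable on localhost:8080):
#   probe http://localhost:8080/get 12 | tally_statuses
```

With a limit of 10 requests per minute, you would expect the tally to show mostly 200s plus a couple of 429s.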

Scenario 2: AI Gateway (The Cool Part!)

Use Case: You're building an AI-powered app and want to:

  • Route requests to OpenAI

  • Add rate limiting to control costs

  • Track usage with observability

Create an AI Gateway

kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: ai-gateway
  namespace: demo
spec:
  gatewayClassName: agentgateway  # AI Gateway!
  listeners:
  - name: http
    protocol: HTTP
    port: 8080
EOF

Key difference: We use the agentgateway GatewayClass instead of kgateway. This deploys the AI-optimized proxy.


Configure OpenAI Backend

First, create a secret with your OpenAI API key:

kubectl create secret generic openai-secret -n demo \
  --from-literal="Authorization=Bearer sk-YOUR-OPENAI-API-KEY"

Now create an AgentgatewayBackend:

kubectl apply -f - <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: openai-gpt4
  namespace: demo
spec:
  ai:
    provider:
      openai:
        model: "gpt-4o-mini"
  policies:
    auth:
      secretRef:
        name: openai-secret
EOF

What this does:

  • Defines OpenAI as an AI backend

  • Configures the model (gpt-4o-mini)

  • Attaches authentication from the secret


Route Traffic to OpenAI

kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: openai-route
  namespace: demo
spec:
  parentRefs:
  - name: ai-gateway
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /v1/chat/completions
    backendRefs:
    - name: openai-gpt4
      group: agentgateway.dev
      kind: AgentgatewayBackend
EOF

Test the AI Gateway

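# Stop the earlier port-forward first (Ctrl+C); both use local port 8080.
# If the Service name differs in your version, list it with: kubectl get svc -n demo
kubectl port-forward -n demo svc/ai-gateway-http 8080:8080

# In another terminal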
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Explain kgateway in one sentence"
      }
    ]
  }'

Expected response:

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "kgateway is a Kubernetes-native gateway that manages both traditional API traffic and AI/LLM requests through a unified control plane."
      }
    }
  ]
}

YAYYY AGAIN!!! You just proxied an OpenAI request through kgateway!


Add AI-Specific Features

1. Rate Limiting for Cost Control

kubectl apply -f - <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayPolicy
metadata:
  name: cost-control
  namespace: demo
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: openai-route
  backend:
    ai:
      rateLimit:
        requestsPerMinute: 10  # Max 10 requests/min
        tokensPerMinute: 10000 # Max 10k tokens/min
EOF

2. Model Aliasing

Allow users to reference models by friendly names:

kubectl apply -f - <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayPolicy
metadata:
  name: model-aliases
  namespace: demo
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: openai-route
  backend:
    ai:
      modelAliases:
        "fast": "gpt-4o-mini"
        "smart": "gpt-4o"
        "reasoning": "o1-preview"
EOF

Now users can request model: "fast" instead of remembering gpt-4o-mini.
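
Conceptually, aliasing is just a lookup that happens before the request is forwarded. The sketch below mirrors the alias table above in plain shell. It is an illustration only, not how agentgateway implements it, and passing unknown names through unchanged is an assumption made for this example.

```shell
# resolve_model NAME: map a friendly alias to the concrete model name.
# Names that are not aliases are printed unchanged (assumption for this sketch).
resolve_model() {
  case "$1" in
    fast)      echo "gpt-4o-mini" ;;
    smart)     echo "gpt-4o" ;;
    reasoning) echo "o1-preview" ;;
    *)         echo "$1" ;;
  esac
}

resolve_model fast      # prints gpt-4o-mini
resolve_model gpt-4o    # prints gpt-4o
```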

3. Multi-Provider Routing

Route to different providers based on the model:

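# Create AWS Bedrock backend
# (assumes a Secret named bedrock-secret with your AWS credentials already exists in the demo namespace)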
kubectl apply -f - <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: bedrock-claude
  namespace: demo
spec:
  ai:
    provider:
      bedrock:
        model: "anthropic.claude-3-5-sonnet-20241022-v2:0"
        region: "us-west-2"
  policies:
    auth:
      secretRef:
        name: bedrock-secret
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: multi-provider-route
  namespace: demo
spec:
  parentRefs:
  - name: ai-gateway
  rules:
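  # Both rules also match on the path, so they tie with openai-route on prefix
  # length; the extra header match then gives these rules precedence.
  # Route GPT requests to OpenAI
  - matches:
    - path:
        type: PathPrefix
        value: /v1/chat/completions
      headers:
      - name: X-Model-Provider
        value: openai
    backendRefs:
    - name: openai-gpt4
      group: agentgateway.dev
      kind: AgentgatewayBackend
  # Route Claude requests to Bedrock
  - matches:
    - path:
        type: PathPrefix
        value: /v1/chat/completions
      headers:
      - name: X-Model-Provider
        value: anthropic
    backendRefs:
    - name: bedrock-claude
      group: agentgateway.dev
      kind: AgentgatewayBackend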
EOF
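
Since these rules match on the X-Model-Provider header, the client has to send it. Below is a rough client-side sketch that picks the header value from the model name. The helper names and the model-name patterns are assumptions made for this example, and it relies on jq to build the JSON body.

```shell
# provider_for_model MODEL: choose the X-Model-Provider header value.
# The name patterns are assumptions for this sketch; adjust them to your models.
provider_for_model() {
  case "$1" in
    gpt-* | o1-*)           echo openai ;;
    claude-* | anthropic.*) echo anthropic ;;
    *)                      return 1 ;;
  esac
}

# chat MODEL PROMPT: send a chat request with the matching provider header.
chat() (
  provider=$(provider_for_model "$1") || { echo "unknown model: $1" >&2; exit 1; }
  body=$(jq -n --arg model "$1" --arg content "$2" \
    '{model: $model, messages: [{role: "user", content: $content}]}')
  curl -s http://localhost:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "X-Model-Provider: $provider" \
    -d "$body"
)

# Usage (needs the ai-gateway port-forward from earlier):
#   chat gpt-4o-mini "Explain kgateway in one sentence"
```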

Scenario 3: Secure AI Gateway with Authentication

Use Case: You want to add API key authentication WITHOUT modifying your application code.

Add API Key Authentication

kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: api-keys
  namespace: demo
stringData:
  api-keys: |
    user1:hashed-key-here
    user2:another-hashed-key
---
apiVersion: gateway.kgateway.dev/v1alpha1
kind: TrafficPolicy
metadata:
  name: api-key-auth
  namespace: demo
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: openai-route
  extAuth:
    apiKey:
      secretRef:
        name: api-keys
      headerName: X-API-Key
EOF

Test authentication:

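# No key: rejected with 401 Unauthorized
curl -i http://localhost:8080/v1/chat/completions

# With a valid key the request passes authentication
# (add the JSON body from the earlier example to get a real completion)
curl -i http://localhost:8080/v1/chat/completions \
  -H "X-API-Key: user1-key"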

Add OIDC Authentication

For production, use OIDC (Google, Okta, etc.):

kubectl apply -f - <<EOF
apiVersion: gateway.kgateway.dev/v1alpha1
kind: GatewayExtension
metadata:
  name: google-oauth
  namespace: demo
spec:
  oauth2:
    issuerURI: https://accounts.google.com
    credentials:
      clientID: "your-client-id"
      clientSecretRef:
        name: google-oauth-secret
    redirectURI: https://your-domain.com/oauth2/redirect
    scopes: ["openid", "email"]
---
apiVersion: gateway.kgateway.dev/v1alpha1
kind: TrafficPolicy
metadata:
  name: oauth-policy
  namespace: demo
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: openai-route
  oauth2:
    extensionRef:
      name: google-oauth
EOF

Now users must authenticate via Google before accessing OpenAI!


Observability: Tracking AI Costs

kgateway includes built-in OpenTelemetry support. Let's enable it:

kubectl apply -f - <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayParameters
metadata:
  name: observability-config
  namespace: demo
spec:
  logging:
    format: json
  rawConfig:
    config:
      tracing:
        otlpEndpoint: http://jaeger-collector:4317
        otlpProtocol: grpc
        randomSampling: true
        fields:
          add:
            gen_ai.operation.name: '"chat"'
            gen_ai.system: "llm.provider"
            gen_ai.request.model: "llm.request_model"
            gen_ai.usage.completion_tokens: "llm.output_tokens"
            gen_ai.usage.prompt_tokens: "llm.input_tokens"
EOF

What you get:

  • Every AI request is traced

  • Token usage tracked (input + output)

  • Provider and model information

  • The token counts you need to calculate cost (tokens × provider rates)
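
As a worked example of the tokens × provider rates arithmetic, here is a small sketch that turns the traced token counts into a cost estimate. The rates are hypothetical placeholders, not real prices, and estimate_cost is a helper name invented for this guide.

```shell
# estimate_cost INPUT_TOKENS OUTPUT_TOKENS INPUT_RATE OUTPUT_RATE
# Rates are in USD per 1 million tokens. Prints the estimated cost in USD.
estimate_cost() {
  awk -v tin="$1" -v tout="$2" -v rin="$3" -v rout="$4" \
    'BEGIN { printf "%.6f\n", (tin * rin + tout * rout) / 1000000 }'
}

# 1,200 prompt tokens and 300 completion tokens at hypothetical rates of
# 0.15 (input) and 0.60 (output) USD per million tokens:
estimate_cost 1200 300 0.15 0.60   # prints 0.000360
```

In practice you would feed in the gen_ai.usage.prompt_tokens and gen_ai.usage.completion_tokens values from your traces, along with your provider's current pricing.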


What's Next?

Now that you're no longer a stranger to kgateway, I'd recommend diving deeper. Try its advanced AI features, such as prompt caching, tool/function calling, or vision API support. You can also explore CNCF ecosystem integrations like Argo Rollouts for progressive LLM provider rollouts (PS: leave feedback here if you hit any roadblocks). Also feel free to check out the official documentation and API spec.


Resources