Getting Started with kgateway
What is kgateway, and Why Should You Care?
If you're building cloud-native applications, you've probably encountered:
Managing API traffic between services
Routing requests to different LLM providers (OpenAI, Claude, etc.)
Implementing rate limiting on expensive AI API calls
Adding authentication without touching application code
kgateway is a Kubernetes-native gateway that solves all of these issues with the standard Gateway API.
What makes it special?
Dual Control Plane: One controller handles traditional API traffic (Envoy) as well as AI/LLM-related traffic (agentgateway)
Kubernetes-Native: Leverages standard Gateway API resources rather than custom CRDs
AI First Features: Includes native support for OpenAI, Anthropic, AWS Bedrock, and custom AI backends
Observability: Provides native OpenTelemetry support for cost tracking and performance monitoring
In this guide, we'll build three real-world scenarios. They cover the most common reasons you'll reach for kgateway:
Basic API Gateway - Route HTTP traffic to microservices
AI Gateway - Proxy requests to OpenAI with rate limiting
Secure AI Gateway - Add authentication and cost controls
Prerequisites
What you are going to need:
A Kubernetes cluster (kind or minikube works fine for this guide)
kubectl, configured to access your cluster
helm 3.0+
(Optional) An OpenAI API key for the AI Gateway examples
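Before installing anything, it can help to confirm that the command-line tools are actually on your PATH. The sketch below uses two hypothetical helpers, need_cmd and check_prereqs (neither is part of kgateway). It only checks that the binaries exist; it does not verify versions or cluster access.

```shell
# need_cmd: hypothetical helper that reports whether a command is on PATH.
need_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# check_prereqs: check the two tools this guide relies on.
# Returns non-zero if any of them is missing.
check_prereqs() {
  rc=0
  for tool in kubectl helm; do
    need_cmd "$tool" || rc=1
  done
  return "$rc"
}
```

Run check_prereqs once in your shell; a non-zero exit status means at least one tool still needs to be installed.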
Simple Installation
Step 1: Install Gateway API CRDs
kgateway uses the standard Kubernetes Gateway API:
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.2.1/standard-install.yaml
This installs the Gateway, GatewayClass, and HTTPRoute CRDs, which are standard across all Gateway API implementations.
Step 2: Install kgateway
# Install kgateway CRDs
helm upgrade -i --create-namespace --namespace kgateway-system \
kgateway-crds \
oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds \
--version v2.3.0
# Install kgateway controller with AI Gateway support
helm upgrade -i --namespace kgateway-system \
kgateway \
oci://cr.kgateway.dev/kgateway-dev/charts/kgateway \
--version v2.3.0 \
--set agentgateway.enabled=true
Wait for pods to be ready:
kubectl wait --for=condition=ready pod \
-l app=kgateway \
-n kgateway-system \
--timeout=120s
Verify installation:
kubectl get gatewayclass
Expected output:
NAME           CONTROLLER                      ACCEPTED   AGE
kgateway       kgateway.dev/kgateway           True       30s
agentgateway   agentgateway.dev/agentgateway   True       30s
As you can see, you now have two GatewayClasses, each backed by its own controller:
kgateway: For traditional HTTP/HTTPS traffic (uses Envoy)
agentgateway: For AI/LLM traffic (uses the agentgateway proxy)
Scenario 1: Traditional API Gateway
Let's say you have a microservices app and want to expose its services through a single entry point with routing rules. Here's how to do that.
Deploy a Sample Application
Let's use httpbin as our backend service:
kubectl create namespace demo
kubectl apply -n demo -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: httpbin
spec:
  selector:
    app: httpbin
  ports:
  - port: 8000
    targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbin
  template:
    metadata:
      labels:
        app: httpbin
    spec:
      containers:
      - name: httpbin
        image: mccutchen/go-httpbin:v2.6.0
        ports:
        - containerPort: 8080
EOF
Create a Gateway
kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: api-gateway
  namespace: demo
spec:
  gatewayClassName: kgateway # Using Envoy-based gateway
  listeners:
  - name: http
    protocol: HTTP
    port: 8080
    allowedRoutes:
      namespaces:
        from: All
EOF
What this creates:
A Gateway resource that kgateway watches
An Envoy proxy deployment (check with kubectl get pods -n demo)
A LoadBalancer service (or ClusterIP in local clusters)
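Rather than eyeballing the pod list, you can wait for the Gateway itself to report that it is ready. The sketch below wraps kubectl wait in a hypothetical helper named wait_for_gateway. It relies on the standard Gateway API Programmed condition, and assumes kgateway sets that condition once the proxy is up.

```shell
# wait_for_gateway NAME NAMESPACE [TIMEOUT]
# Hypothetical helper: blocks until the Gateway reports Programmed=True.
# The fully qualified resource name avoids clashes with other CRDs
# that are also called "gateway".
wait_for_gateway() {
  kubectl wait --for=condition=Programmed \
    "gateways.gateway.networking.k8s.io/$1" \
    -n "$2" \
    --timeout="${3:-120s}"
}
```

For the Gateway above, that would be wait_for_gateway api-gateway demo.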
Create Routing Rules
Now let's route traffic to our httpbin service:
kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: httpbin-route
  namespace: demo
spec:
  parentRefs:
  - name: api-gateway
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /api
    filters:
    - type: URLRewrite
      urlRewrite:
        path:
          type: ReplacePrefixMatch
          replacePrefixMatch: /
    backendRefs:
    - name: httpbin
      port: 8000
EOF
What this does:
Routes requests matching /api/* to the httpbin service
Rewrites /api/get to /get before sending it to the backend
Uses standard Gateway API syntax (portable across implementations!)
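To make the rewrite concrete, here is a small shell sketch that mimics what a PathPrefix match combined with ReplacePrefixMatch does to a request path. The function rewrite_prefix is purely illustrative (it is not how the proxy is implemented), and it assumes prefixes match on whole path segments, as the Gateway API specifies.

```shell
# rewrite_prefix PREFIX REPLACEMENT PATH
# Illustrative only: mimics PathPrefix matching + ReplacePrefixMatch.
rewrite_prefix() (
  prefix="${1%/}"        # treat "/api/" and "/api" the same
  replacement="${2%/}"   # "/" becomes the empty string
  path="$3"
  case "$path" in
    "$prefix"|"$prefix"/*)
      # Matched on a segment boundary: swap the prefix for the replacement.
      rest="${path#"$prefix"}"
      out="${replacement}${rest}"
      printf '%s\n' "${out:-/}"
      ;;
    *)
      # No match: a real route rule would not handle this request at all.
      printf '%s\n' "$path"
      ;;
  esac
)

rewrite_prefix /api / /api/get      # prints /get
rewrite_prefix /api / /api          # prints /
rewrite_prefix /api / /apiv2/get    # prints /apiv2/get (not a segment match)
```

Note the last case: /apiv2 shares characters with /api but is a different path segment, so the rule does not apply to it.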
Test It
# Port-forward to the gateway
kubectl port-forward -n demo svc/api-gateway-http 8080:8080
# In another terminal
curl http://localhost:8080/api/get
Expected response:
{
"args": {},
"headers": {
"Host": "localhost:8080",
"User-Agent": "curl/8.0.0"
},
"origin": "127.0.0.1",
"url": "http://localhost:8080/get"
}
Yay! You've just routed traffic through kgateway.
Add Advanced Routing
Let's add header-based routing and rate limiting:
kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: advanced-route
  namespace: demo
spec:
  parentRefs:
  - name: api-gateway
  rules:
  # Rule 1: Route /premium/* for premium users
  - matches:
    - path:
        type: PathPrefix
        value: /premium
      headers:
      - name: X-User-Tier
        value: premium
    backendRefs:
    - name: httpbin
      port: 8000
  # Rule 2: Default route with rate limiting
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - name: httpbin
      port: 8000
---
apiVersion: gateway.kgateway.dev/v1alpha1
kind: TrafficPolicy
metadata:
  name: rate-limit-policy
  namespace: demo
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: advanced-route
  rateLimit:
    requests: 10
    unit: MINUTE
EOF
Test the rate limit:
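# Use /get rather than /api/get: the /api prefix is still served by
# httpbin-route, which the rate-limit policy does not target.
# These should succeed
for i in {1..10}; do curl -s http://localhost:8080/get | jq -r '.url'; done
# This should fail with 429 Too Many Requests
curl -s http://localhost:8080/get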
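To see exactly where the limit kicks in, it is often easier to print the HTTP status of each request instead of the response body. The sketch below uses a hypothetical helper, probe_rate_limit, and assumes the port-forward from the previous step is still running and that the URL you pass is served by the rate-limited route.

```shell
# probe_rate_limit URL COUNT
# Hypothetical helper: sends COUNT requests and prints each HTTP status,
# so the switch from 200 to 429 is easy to spot.
probe_rate_limit() (
  url="$1"
  total="$2"
  i=1
  while [ "$i" -le "$total" ]; do
    # -o /dev/null drops the body; -w prints only the status code.
    code=$(curl -s -o /dev/null -w '%{http_code}' "$url")
    echo "request $i: HTTP $code"
    i=$((i + 1))
  done
)
```

For example, probe_rate_limit http://localhost:8080/get 12 sends two more requests than the limit of 10 configured above.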
Scenario 2: AI Gateway (The Cool Part!)
Use Case: You're building an AI-powered app and want to:
Route requests to OpenAI
Add rate limiting to control costs
Track usage with observability
Create an AI Gateway
kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: ai-gateway
  namespace: demo
spec:
  gatewayClassName: agentgateway # AI Gateway!
  listeners:
  - name: http
    protocol: HTTP
    port: 8080
EOF
Key difference: We use the agentgateway GatewayClass instead of kgateway. This deploys the AI-optimized proxy.
Configure OpenAI Backend
First, create a secret with your OpenAI API key:
kubectl create secret generic openai-secret -n demo \
--from-literal="Authorization=Bearer sk-YOUR-OPENAI-API-KEY"
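If you would rather not paste the key into a command (where it ends up in your shell history), you can read it from an environment variable instead. This is a sketch: create_openai_secret is a hypothetical helper, and OPENAI_API_KEY is an assumed variable name that you export yourself.

```shell
# create_openai_secret: hypothetical helper that builds the same secret
# as above, but takes the key from the OPENAI_API_KEY environment variable.
create_openai_secret() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set" >&2
    return 1
  fi
  kubectl create secret generic openai-secret -n demo \
    --from-literal="Authorization=Bearer ${OPENAI_API_KEY}"
}
```

Export OPENAI_API_KEY in your session, then call create_openai_secret. The helper refuses to run when the variable is empty, so you never create a secret containing just "Bearer ".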
Now create an AgentgatewayBackend:
kubectl apply -f - <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: openai-gpt4
  namespace: demo
spec:
  ai:
    provider:
      openai:
        model: "gpt-4o-mini"
  policies:
    auth:
      secretRef:
        name: openai-secret
EOF
What this does:
Defines OpenAI as an AI backend
Configures the model (gpt-4o-mini)
Attaches authentication from the secret
Route Traffic to OpenAI
kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: openai-route
  namespace: demo
spec:
  parentRefs:
  - name: ai-gateway
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /v1/chat/completions
    backendRefs:
    - name: openai-gpt4
      group: agentgateway.dev
      kind: AgentgatewayBackend
EOF
Test the AI Gateway
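# Port-forward to the AI gateway (stop the earlier port-forward first,
# since it also uses local port 8080)
kubectl port-forward -n demo svc/ai-gateway-http 8080:8080
# In another terminal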
curl http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o-mini",
"messages": [
{
"role": "user",
"content": "Explain kgateway in one sentence"
}
]
}'
Expected response:
{
"id": "chatcmpl-...",
"object": "chat.completion",
"choices": [
{
"message": {
"role": "assistant",
"content": "kgateway is a Kubernetes-native gateway that manages both traditional API traffic and AI/LLM requests through a unified control plane."
}
}
]
}
Yay, again! You just proxied an OpenAI request through kgateway!
Add AI-Specific Features
1. Rate Limiting for Cost Control
kubectl apply -f - <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayPolicy
metadata:
  name: cost-control
  namespace: demo
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: openai-route
  backend:
    ai:
      rateLimit:
        requestsPerMinute: 10 # Max 10 requests/min
        tokensPerMinute: 10000 # Max 10k tokens/min
EOF
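With two limits in play, whichever one runs out first is the one that actually throttles you. As a rough worked example, and assuming the request and token limits are enforced independently (this guide does not confirm how agentgateway combines them), the hypothetical helper effective_rpm below estimates how many requests per minute get through for a given average request size.

```shell
# effective_rpm RPM_LIMIT TPM_LIMIT AVG_TOKENS_PER_REQUEST
# Hypothetical helper: the smaller of the request limit and the number
# of average-sized requests that fit into the token limit.
effective_rpm() (
  rpm="$1"
  tpm="$2"
  avg="$3"
  by_tokens=$((tpm / avg))   # integer division, rounds down
  if [ "$by_tokens" -lt "$rpm" ]; then
    echo "$by_tokens"
  else
    echo "$rpm"
  fi
)

effective_rpm 10 10000 500     # prints 10 (the request limit binds)
effective_rpm 10 10000 2500    # prints 4  (the token limit binds)
```

So with the values above, short prompts are capped by requestsPerMinute, while long conversations hit tokensPerMinute well before the tenth request.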
2. Model Aliasing
Allow users to reference models by friendly names:
kubectl apply -f - <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayPolicy
metadata:
  name: model-aliases
  namespace: demo
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: openai-route
  backend:
    ai:
      modelAliases:
        "fast": "gpt-4o-mini"
        "smart": "gpt-4o"
        "reasoning": "o1-preview"
EOF
Now users can request model: "fast" instead of remembering gpt-4o-mini.
3. Multi-Provider Routing
Route to different providers based on a request header (X-Model-Provider in this example):
# Create AWS Bedrock backend
kubectl apply -f - <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: bedrock-claude
  namespace: demo
spec:
  ai:
    provider:
      bedrock:
        model: "anthropic.claude-3-5-sonnet-20241022-v2:0"
        region: "us-west-2"
  policies:
    auth:
      secretRef:
        name: bedrock-secret
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: multi-provider-route
  namespace: demo
spec:
  parentRefs:
  - name: ai-gateway
  rules:
  # Route GPT requests to OpenAI
  - matches:
    - headers:
      - name: X-Model-Provider
        value: openai
    backendRefs:
    - name: openai-gpt4
      group: agentgateway.dev
      kind: AgentgatewayBackend
  # Route Claude requests to Bedrock
  - matches:
    - headers:
      - name: X-Model-Provider
        value: anthropic
    backendRefs:
    - name: bedrock-claude
      group: agentgateway.dev
      kind: AgentgatewayBackend
EOF
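To exercise the header-based rules, send the same kind of chat request with an X-Model-Provider header. The sketch below uses a hypothetical helper, chat_via. It assumes the AI gateway is port-forwarded to localhost:8080 as before and that the prompt contains no characters that need JSON escaping. One caveat: under the Gateway API precedence rules a longer path match wins over header matches, so while openai-route (prefix /v1/chat/completions) is attached to the same Gateway, it should capture these requests first.

```shell
# chat_via PROVIDER MODEL PROMPT
# Hypothetical helper: sends a chat completion request tagged with the
# X-Model-Provider header that multi-provider-route matches on.
# Assumes PROMPT needs no JSON escaping (no quotes or backslashes).
chat_via() {
  curl -s http://localhost:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "X-Model-Provider: $1" \
    -d "{\"model\": \"$2\", \"messages\": [{\"role\": \"user\", \"content\": \"$3\"}]}"
}
```

For example, chat_via openai gpt-4o-mini "Hello" should reach the OpenAI backend, while passing anthropic as the first argument should reach Bedrock.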
Scenario 3: Secure AI Gateway with Authentication
Use Case: You want to add API key authentication WITHOUT modifying your application code.
Add API Key Authentication
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: api-keys
  namespace: demo
stringData:
  api-keys: |
    user1:hashed-key-here
    user2:another-hashed-key
---
apiVersion: gateway.kgateway.dev/v1alpha1
kind: TrafficPolicy
metadata:
  name: api-key-auth
  namespace: demo
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: openai-route
  extAuth:
    apiKey:
      secretRef:
        name: api-keys
      headerName: X-API-Key
EOF
Test authentication:
# This fails (401 Unauthorized)
curl http://localhost:8080/v1/chat/completions
# This succeeds
curl http://localhost:8080/v1/chat/completions \
-H "X-API-Key: user1-key"
Add OIDC Authentication
For production, use OIDC (Google, Okta, etc.):
kubectl apply -f - <<EOF
apiVersion: gateway.kgateway.dev/v1alpha1
kind: GatewayExtension
metadata:
  name: google-oauth
  namespace: demo
spec:
  oauth2:
    issuerURI: https://accounts.google.com
    credentials:
      clientID: "your-client-id"
      clientSecretRef:
        name: google-oauth-secret
    redirectURI: https://your-domain.com/oauth2/redirect
    scopes: ["openid", "email"]
---
apiVersion: gateway.kgateway.dev/v1alpha1
kind: TrafficPolicy
metadata:
  name: oauth-policy
  namespace: demo
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: openai-route
  oauth2:
    extensionRef:
      name: google-oauth
EOF
Now users must authenticate via Google before accessing OpenAI!
Observability: Tracking AI Costs
kgateway includes built-in OpenTelemetry support. Let's enable it:
kubectl apply -f - <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayParameters
metadata:
  name: observability-config
  namespace: demo
spec:
  logging:
    format: json
  rawConfig:
    config:
      tracing:
        otlpEndpoint: http://jaeger-collector:4317
        otlpProtocol: grpc
        randomSampling: true
        fields:
          add:
            gen_ai.operation.name: '"chat"'
            gen_ai.system: "llm.provider"
            gen_ai.request.model: "llm.request_model"
            gen_ai.usage.completion_tokens: "llm.output_tokens"
            gen_ai.usage.prompt_tokens: "llm.input_tokens"
EOF
What you get:
Every AI request is traced
Token usage tracked (input + output)
Provider and model information
The inputs you need for cost calculation (tokens × provider rates)
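The tokens × provider rates arithmetic is simple enough to sketch in the shell. The helper estimate_cost below is hypothetical, and the rates in the example call are placeholders rather than real prices: look up your provider's current price per million tokens and substitute it.

```shell
# estimate_cost INPUT_TOKENS OUTPUT_TOKENS INPUT_RATE OUTPUT_RATE
# Hypothetical helper. Rates are in USD per 1,000,000 tokens.
# Prints the estimated cost of one request in USD.
estimate_cost() {
  awk -v in_tok="$1" -v out_tok="$2" -v in_rate="$3" -v out_rate="$4" \
    'BEGIN { printf "%.6f\n", (in_tok * in_rate + out_tok * out_rate) / 1000000 }'
}

# 1,000 prompt tokens and 500 completion tokens at placeholder rates
# of $0.15 (input) and $0.60 (output) per million tokens:
estimate_cost 1000 500 0.15 0.60    # prints 0.000450
```

The token counts map onto the gen_ai.usage.prompt_tokens and gen_ai.usage.completion_tokens fields recorded in the traces above.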
What's Next?
Now that you're no longer a stranger to kgateway, I'd recommend diving deeper. Try its advanced AI features, such as prompt caching, tool/function calling, and vision API support. You can also explore CNCF ecosystem integrations like Argo Rollouts for progressive LLM provider rollouts (and leave feedback here if you hit any roadblocks). Finally, feel free to check out the official documentation and API spec.