
Middleware Stack

The gateway applies middleware in a strict order. The order matters: each middleware depends on context set by the ones before it.

Full Chain

Request
│
├── 1. RequestID         ─ Generate unique X-Request-ID
├── 2. RealIP            ─ Extract real client IP (behind proxies)
├── 3. OpenTelemetry     ─ Start distributed trace span
├── 4. Logger            ─ Structured access logging (Zerolog)
├── 5. Recoverer         ─ Recover from panics → 500 response
├── 6. CORS              ─ Cross-origin resource sharing headers
├── 7. RateLimiter       ─ Per-user + per-IP token bucket
│
└── Auth Group (authenticated routes only)
    ├── 8. ValidateJWT       ─ Verify Clerk JWT signature + expiry
    ├── 9. InjectUserClaims  ─ Set X-User-Id, X-User-Plan headers
    └── 10. Entitlement      ─ Check feature access (plan-gated routes)
        │
        ▼
    Proxy to downstream service
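To make the ordering concrete, here is a minimal sketch using only the standard library rather than the gateway's actual router. The `chain` and `tag` helpers are hypothetical names introduced for illustration; the point is only that the first middleware registered is the outermost wrapper and therefore runs first.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Middleware wraps a handler with extra behavior.
type Middleware func(http.Handler) http.Handler

// chain wraps h so that the first middleware listed runs first (outermost).
func chain(h http.Handler, mws ...Middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// tag is a hypothetical middleware that records its name when it runs.
func tag(name string, order *[]string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*order = append(*order, name)
			next.ServeHTTP(w, r)
		})
	}
}

// runChain sends one request through three tagged middlewares and
// returns the order in which they ran.
func runChain() []string {
	var order []string
	final := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		order = append(order, "handler")
	})
	h := chain(final, tag("RequestID", &order), tag("RealIP", &order), tag("Logger", &order))
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	return order
}

func main() {
	fmt.Println(runChain()) // [RequestID RealIP Logger handler]
}
```

Because each wrapper calls `next` from inside its own handler, anything a middleware sets before that call is visible to everything registered after it.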

Global Middleware (All Routes)

1. RequestID

Generates a unique X-Request-ID for every incoming request. If the client sends one, it's preserved; otherwise a new UUID is generated.

r.Use(middleware.RequestID)

This ID propagates to all downstream services and appears in every log line, enabling distributed trace correlation.
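The preserve-or-generate behavior can be sketched with the standard library alone. This is an illustration, not the gateway's `middleware.RequestID` implementation: `newUUID`, `requestID`, and `echoID` are hypothetical names, and echoing the ID on the response is an assumption.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"net/http"
	"net/http/httptest"
)

const requestIDHeader = "X-Request-ID"

// newUUID returns a random (version 4) UUID string.
func newUUID() string {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		panic(err) // no sensible fallback if the system RNG fails
	}
	b[6] = (b[6] & 0x0f) | 0x40 // version 4
	b[8] = (b[8] & 0x3f) | 0x80 // RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16])
}

// requestID keeps a client-supplied ID or generates a new one.
func requestID(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := r.Header.Get(requestIDHeader)
		if id == "" {
			id = newUUID()
			r.Header.Set(requestIDHeader, id) // visible to later middleware and the proxy
		}
		w.Header().Set(requestIDHeader, id) // assumption: the ID is echoed to the client
		next.ServeHTTP(w, r)
	})
}

// echoID runs one request through the middleware and returns the
// ID found on the response.
func echoID(incoming string) string {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	if incoming != "" {
		req.Header.Set(requestIDHeader, incoming)
	}
	rec := httptest.NewRecorder()
	noop := http.HandlerFunc(func(http.ResponseWriter, *http.Request) {})
	requestID(noop).ServeHTTP(rec, req)
	return rec.Header().Get(requestIDHeader)
}

func main() {
	fmt.Println(echoID("abc-123")) // abc-123
	fmt.Println(len(echoID("")))   // 36
}
```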

2. RealIP

Extracts the real client IP address from X-Forwarded-For or X-Real-Ip headers when behind a reverse proxy or load balancer.

r.Use(middleware.RealIP)
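As a rough sketch of the header handling, the hypothetical `clientIP` helper below prefers the left-most X-Forwarded-For entry, then X-Real-Ip, then the socket address. That precedence is an assumption made for illustration; the middleware the gateway uses may check the headers in a different order.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// clientIP picks the client address from proxy headers, falling back
// to the connection's remote address ("host:port").
func clientIP(forwardedFor, realIP, remoteAddr string) string {
	// X-Forwarded-For is a comma-separated list; the left-most entry
	// is the original client.
	if forwardedFor != "" {
		first, _, _ := strings.Cut(forwardedFor, ",")
		if ip := net.ParseIP(strings.TrimSpace(first)); ip != nil {
			return ip.String()
		}
	}
	if ip := net.ParseIP(strings.TrimSpace(realIP)); ip != nil {
		return ip.String()
	}
	if host, _, err := net.SplitHostPort(remoteAddr); err == nil {
		return host
	}
	return remoteAddr
}

func main() {
	fmt.Println(clientIP("203.0.113.7, 10.0.0.2", "", "10.0.0.2:41234")) // 203.0.113.7
	fmt.Println(clientIP("", "198.51.100.4", "10.0.0.2:41234"))          // 198.51.100.4
	fmt.Println(clientIP("", "", "192.0.2.9:41234"))                     // 192.0.2.9
}
```

These headers are only trustworthy when a proxy you control sets them; a client that reaches the gateway directly can send any value it likes.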

3. OpenTelemetry

Starts a trace span for the incoming request. Traces are exported to the configured OTLP collector (Grafana Alloy in production).

r.Use(otelchi.Middleware("gateway"))

4. Logger

Structured access logging using Zerolog. Logs method, path, status code, duration, and request ID as JSON.

r.Use(middleware.Logger)

Example log output:

{
  "level": "info",
  "method": "GET",
  "path": "/api/v1/passages/gen.1.1",
  "status": 200,
  "duration_ms": 12,
  "request_id": "abc-123",
  "timestamp": "2026-03-12T10:30:00Z"
}
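The sketch below shows one way an access-log middleware can capture the status code and duration. It swaps in the standard library's `log/slog` for Zerolog so the example stays self-contained, which means the field names for level and time differ from the output above. `statusRecorder`, `accessLog`, and `logOnce` are hypothetical names.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"log/slog"
	"net/http"
	"net/http/httptest"
	"time"
)

// statusRecorder remembers the status code written by the wrapped handler.
type statusRecorder struct {
	http.ResponseWriter
	status int
}

func (s *statusRecorder) WriteHeader(code int) {
	s.status = code
	s.ResponseWriter.WriteHeader(code)
}

// accessLog writes one JSON line per request to out.
func accessLog(out io.Writer) func(http.Handler) http.Handler {
	logger := slog.New(slog.NewJSONHandler(out, nil))
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			start := time.Now()
			rec := &statusRecorder{ResponseWriter: w, status: http.StatusOK}
			next.ServeHTTP(rec, r)
			logger.Info("request",
				"method", r.Method,
				"path", r.URL.Path,
				"status", rec.status,
				"duration_ms", time.Since(start).Milliseconds(),
				"request_id", r.Header.Get("X-Request-ID"),
			)
		})
	}
}

// logOnce runs a single request through the middleware and returns
// the decoded log line.
func logOnce() map[string]any {
	var buf bytes.Buffer
	notFound := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusNotFound)
	})
	req := httptest.NewRequest(http.MethodGet, "/api/v1/passages/gen.1.1", nil)
	req.Header.Set("X-Request-ID", "abc-123")
	accessLog(&buf)(notFound).ServeHTTP(httptest.NewRecorder(), req)

	var entry map[string]any
	if err := json.Unmarshal(buf.Bytes(), &entry); err != nil {
		panic(err)
	}
	return entry
}

func main() {
	entry := logOnce()
	fmt.Println(entry["method"], entry["path"], entry["status"], entry["request_id"])
}
```

Logging after `next.ServeHTTP` returns is what lets the entry include the final status and the elapsed time.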

5. Recoverer

Catches any panics in downstream handlers and returns a 500 Internal Server Error instead of crashing the process.

r.Use(middleware.Recoverer)
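A minimal sketch of the recover-and-respond pattern, written against the standard library. The `recoverer` and `statusAfterPanic` names are hypothetical, and a production version would typically also log a stack trace.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// recoverer turns a panic in a later handler into a 500 response.
func recoverer(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			if v := recover(); v != nil {
				log.Printf("panic serving %s: %v", r.URL.Path, v)
				http.Error(w, http.StatusText(http.StatusInternalServerError),
					http.StatusInternalServerError)
			}
		}()
		next.ServeHTTP(w, r)
	})
}

// statusAfterPanic sends a request to a handler that always panics
// and returns the status code the client would see.
func statusAfterPanic() int {
	boom := http.HandlerFunc(func(http.ResponseWriter, *http.Request) {
		panic("something went wrong")
	})
	rec := httptest.NewRecorder()
	recoverer(boom).ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code
}

func main() {
	fmt.Println(statusAfterPanic()) // 500
}
```

Since Recoverer sits inside Logger in the chain, the recovered request returns through the logger normally and is recorded with its 500 status.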

6. CORS

Configures Cross-Origin Resource Sharing for web and mobile clients:

r.Use(cors.Handler(cors.Options{
    AllowedOrigins:   cfg.CORSOrigins, // e.g. ["http://localhost:3002"]
    AllowedMethods:   []string{"GET", "POST", "PUT", "DELETE", "OPTIONS"},
    AllowedHeaders:   []string{"Authorization", "Content-Type", "X-Request-ID"},
    ExposedHeaders:   []string{"X-Request-ID"},
    AllowCredentials: true,
    MaxAge:           300,
}))

7. RateLimiter

Token-bucket rate limiting with per-user and per-IP strategies. Rate limit counters are stored in Redis.

r.Use(ratelimit.Middleware(cfg))

When a limit is exceeded, the gateway responds with 429 Too Many Requests, the conventional X-RateLimit-* headers, and Retry-After:

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1710243600
Retry-After: 45
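The token-bucket algorithm itself is small enough to sketch. The version below keeps buckets in an in-memory map instead of Redis, and the capacity, refill rate, and key format (`user:` and `ip:` prefixes) are assumptions chosen for illustration rather than the gateway's configuration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// bucket holds the remaining tokens for one key.
type bucket struct {
	tokens float64
	last   time.Time // when tokens was last updated
}

// limiter is an in-memory token-bucket limiter keyed by an arbitrary string.
type limiter struct {
	mu       sync.Mutex
	capacity float64 // maximum burst size
	rate     float64 // tokens added per second
	buckets  map[string]*bucket
}

func newLimiter(capacity, rate float64) *limiter {
	return &limiter{capacity: capacity, rate: rate, buckets: make(map[string]*bucket)}
}

// allow refills the key's bucket for the elapsed time, then spends
// one token if at least one is available.
func (l *limiter) allow(key string, now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	b, ok := l.buckets[key]
	if !ok {
		b = &bucket{tokens: l.capacity, last: now}
		l.buckets[key] = b
	}
	b.tokens += now.Sub(b.last).Seconds() * l.rate
	if b.tokens > l.capacity {
		b.tokens = l.capacity
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// demo fires five requests for one key at the same instant, then
// checks a second key and the first key again two seconds later.
func demo() (allowed, denied int, otherKey, afterRefill bool) {
	start := time.Unix(0, 0)
	l := newLimiter(3, 1) // burst of 3, refilling 1 token per second
	for i := 0; i < 5; i++ {
		if l.allow("user:user_abc123", start) {
			allowed++
		} else {
			denied++
		}
	}
	otherKey = l.allow("ip:203.0.113.7", start)
	afterRefill = l.allow("user:user_abc123", start.Add(2*time.Second))
	return
}

func main() {
	fmt.Println(demo()) // 3 2 true true
}
```

Passing `now` in explicitly keeps the refill arithmetic deterministic and easy to test; a shared store such as Redis is needed once more than one gateway instance enforces the same limits.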

Auth Group (Authenticated Routes Only)

8. ValidateJWT

Validates the Authorization: Bearer <token> header against Clerk's JWKS endpoint. Checks signature, expiry, and issuer.

r.Use(authmw.ValidateJWT(cfg))

If validation fails, the request is rejected with a 401 Unauthorized error envelope.
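The request-handling half of this step can be sketched without a JWT library. In the example below, `bearerToken` parses the header and `validateJWT` takes a hypothetical `verify` hook standing in for the real signature, expiry, and issuer checks against the JWKS. The plain-text 401 body is a placeholder, not the gateway's error envelope.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// bearerToken extracts the token from an "Authorization: Bearer <token>" value.
func bearerToken(header string) (string, bool) {
	scheme, token, found := strings.Cut(header, " ")
	if !found || !strings.EqualFold(scheme, "Bearer") {
		return "", false
	}
	token = strings.TrimSpace(token)
	return token, token != ""
}

// validateJWT rejects requests whose bearer token is missing or fails verify.
// verify is a hypothetical hook; real verification is out of scope here.
func validateJWT(verify func(token string) error) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			token, ok := bearerToken(r.Header.Get("Authorization"))
			if !ok || verify(token) != nil {
				http.Error(w, http.StatusText(http.StatusUnauthorized), http.StatusUnauthorized)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}

// statusFor returns the status a request with the given Authorization
// header receives when only "good-token" is accepted.
func statusFor(authorization string) int {
	stub := func(token string) error {
		if token != "good-token" {
			return errors.New("invalid token")
		}
		return nil
	}
	ok := http.HandlerFunc(func(http.ResponseWriter, *http.Request) {})
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	if authorization != "" {
		req.Header.Set("Authorization", authorization)
	}
	rec := httptest.NewRecorder()
	validateJWT(stub)(ok).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("Bearer good-token")) // 200
	fmt.Println(statusFor("Bearer forged"))     // 401
	fmt.Println(statusFor(""))                  // 401
}
```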

9. InjectUserClaims

Extracts user claims from the validated JWT and injects them as headers for downstream services:

Header        Source             Example
X-User-Id     JWT sub claim      user_abc123
X-User-Plan   JWT custom claim   scholar

r.Use(authmw.InjectUserClaims)
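A sketch of how claims might move from the request context into headers. The `claims` type, the context key, and the step that deletes client-supplied headers first are all assumptions for illustration; the source does not say how the gateway stores validated claims or whether it strips incoming values.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// claims is a hypothetical container for the validated JWT fields.
type claims struct {
	Subject string // JWT sub claim
	Plan    string // custom plan claim
}

type claimsKey struct{}

// withClaims stores validated claims on the context (the role the
// JWT validation step would play).
func withClaims(ctx context.Context, c claims) context.Context {
	return context.WithValue(ctx, claimsKey{}, c)
}

// injectUserClaims copies claims from the context into request headers.
func injectUserClaims(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Assumption: drop client-supplied values so identity cannot be spoofed.
		r.Header.Del("X-User-Id")
		r.Header.Del("X-User-Plan")
		if c, ok := r.Context().Value(claimsKey{}).(claims); ok {
			r.Header.Set("X-User-Id", c.Subject)
			r.Header.Set("X-User-Plan", c.Plan)
		}
		next.ServeHTTP(w, r)
	})
}

// headersSeenDownstream sends a request carrying a spoofed X-User-Id
// and returns the header values the next handler observes.
func headersSeenDownstream(spoofedID string, c claims) (id, plan string) {
	downstream := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id, plan = r.Header.Get("X-User-Id"), r.Header.Get("X-User-Plan")
	})
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.Header.Set("X-User-Id", spoofedID)
	req = req.WithContext(withClaims(req.Context(), c))
	injectUserClaims(downstream).ServeHTTP(httptest.NewRecorder(), req)
	return id, plan
}

func main() {
	fmt.Println(headersSeenDownstream("attacker", claims{Subject: "user_abc123", Plan: "scholar"}))
	// user_abc123 scholar
}
```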

10. Entitlement

Applied only to plan-gated route groups. Checks whether the user's plan includes the required feature via a Redis cache lookup.

r.Use(entitlement.Require("ai_features"))
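The shape of a feature gate can be sketched as follows. The `planFeatures` map is a hypothetical in-memory stand-in for the Redis lookup, the plan-to-feature mapping is invented for the example, and responding with 403 Forbidden is an assumption; the source does not state which status the gateway returns.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// planFeatures is a hypothetical stand-in for the cached entitlement data.
var planFeatures = map[string]map[string]bool{
	"free":    {},
	"scholar": {"ai_features": true},
}

// require allows the request through only if the user's plan
// includes the named feature.
func require(feature string) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			plan := r.Header.Get("X-User-Plan") // set by the claims-injection step
			if !planFeatures[plan][feature] {
				http.Error(w, http.StatusText(http.StatusForbidden), http.StatusForbidden)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}

// statusForPlan returns the status a user on the given plan receives
// from a route gated on "ai_features".
func statusForPlan(plan string) int {
	ok := http.HandlerFunc(func(http.ResponseWriter, *http.Request) {})
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.Header.Set("X-User-Plan", plan)
	rec := httptest.NewRecorder()
	require("ai_features")(ok).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusForPlan("scholar")) // 200
	fmt.Println(statusForPlan("free"))    // 403
}
```

An unknown plan falls through to the same denial path, because indexing a missing map entry yields `false`.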

See Routing > Entitlement Middleware for details.