Infrastructure
GospeLib infrastructure is defined entirely as code across three layers: Terraform for AWS resources, Kustomize for Kubernetes manifests, and Docker Compose for local development. Nothing is configured manually in the AWS Console.
Terraform Layout
All Terraform configuration lives in infra/terraform/:
infra/terraform/
├── backend.tf # S3 state backend + DynamoDB locking
├── main.tf # Root module wiring
├── variables.tf # Shared variable definitions
├── outputs.tf # Exported values (IPs, endpoints, ARNs)
├── modules/
│ ├── ecr/ # Container registries (one per service)
│ ├── eks/ # EKS cluster + managed node groups (production)
│ ├── rds/ # PostgreSQL 16 + pgvector
│ ├── elasticache/ # Redis 7
│ ├── s3/ # Artifact and backup buckets
│ ├── cloudfront/ # CDN distributions
│ ├── route53/ # DNS records
│ └── secrets/ # AWS Secrets Manager entries
└── environments/
    ├── staging/ # t3.medium k3s, db.t3.micro, cache.t3.micro
    └── production/ # EKS cluster, db.r6g.large, cache.r6g.large
Each environment directory contains its own terraform.tfvars (gitignored) and calls the shared modules with environment-specific sizing.
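As a rough sketch of working against a single environment: the helper below assumes each directory under environments/ is its own Terraform root module, which the layout suggests but does not confirm. The function name `tf_env` is hypothetical.

```shell
# Hypothetical helper: run a Terraform subcommand against one environment.
# Assumes each directory under environments/ is a self-contained root module.
tf_env() {
  env_name="$1"   # "staging" or "production"
  shift
  terraform -chdir="infra/terraform/environments/${env_name}" "$@"
}

# Example: initialise, then preview changes for staging.
# terraform.tfvars in the working directory is loaded automatically.
# tf_env staging init
# tf_env staging plan
```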
State Management
Terraform state is stored in S3 with DynamoDB locking to prevent concurrent modifications:
| Resource | Name | Purpose |
|---|---|---|
| S3 bucket | gospelib-terraform-state | Versioned state storage |
| DynamoDB table | gospelib-terraform-lock | Distributed lock |
State versioning is enabled on the S3 bucket, which provides the rollback path for infrastructure changes -- you can retrieve any previous state version and push it back.
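The rollback path can be sketched roughly as follows. The state object key is not given on this page, so it is passed in as an argument; the function names and the local file name previous.tfstate are illustrative only.

```shell
# Sketch: restore an earlier Terraform state version from the versioned bucket.
# The state key is an input here; use the key configured in backend.tf.
STATE_BUCKET="gospelib-terraform-state"

# List the stored versions of one state object.
list_state_versions() {
  state_key="$1"
  aws s3api list-object-versions \
    --bucket "$STATE_BUCKET" --prefix "$state_key" \
    --query 'Versions[].[VersionId,LastModified]' --output text
}

# Download one version and push it back as the current state.
# Run from an initialised Terraform working directory.
restore_state_version() {
  state_key="$1"
  version_id="$2"
  aws s3api get-object --bucket "$STATE_BUCKET" --key "$state_key" \
    --version-id "$version_id" previous.tfstate &&
    terraform state push previous.tfstate
}
```

Note that `terraform state push` applies safety checks and rejects a state whose serial is lower than the remote's. Restoring an older version therefore typically needs `-force`, so inspect the downloaded file before forcing the push.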
Bootstrapping
Before terraform init can run, the state bucket and lock table must exist. Together with the CI identity and SSH key pair listed below, these are the only resources created outside of Terraform. See Deploy to Staging for the exact commands, or refer to the source guide at docs/guides/terraform-bootstrap.md.
The bootstrap sequence:
- Create the S3 state bucket with versioning and encryption
- Create the DynamoDB lock table
- Create the GitHub Actions OIDC provider (no long-lived credentials)
- Create the IAM role with ECR and Secrets Manager permissions
- Create the EC2 SSH key pair
- Run terraform init and terraform apply
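The first two steps of the sequence above might look like the sketch below. The exact commands live in Deploy to Staging; the region (us-east-1) and the encryption algorithm (SSE-S3, AES256) are assumptions made here for illustration, and the function name is hypothetical.

```shell
# Sketch of the state bucket and lock table bootstrap.
# Assumptions: us-east-1 (no LocationConstraint needed) and AES256 encryption.
# Terraform's S3 backend expects the lock table's hash key to be a string
# attribute named LockID.
bootstrap_state_backend() {
  bucket="gospelib-terraform-state"
  table="gospelib-terraform-lock"

  aws s3api create-bucket --bucket "$bucket" --region us-east-1 &&
    aws s3api put-bucket-versioning --bucket "$bucket" \
      --versioning-configuration Status=Enabled &&
    aws s3api put-bucket-encryption --bucket "$bucket" \
      --server-side-encryption-configuration \
      '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"}}]}' &&
    aws dynamodb create-table --table-name "$table" \
      --attribute-definitions AttributeName=LockID,AttributeType=S \
      --key-schema AttributeName=LockID,KeyType=HASH \
      --billing-mode PAY_PER_REQUEST
}
```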
GitHub Actions OIDC
The CD pipeline authenticates to AWS via OpenID Connect rather than stored access keys. A trust policy on the IAM role restricts access to the gospelib/main repository:
{
  "Condition": {
    "StringLike": {
      "token.actions.githubusercontent.com:sub": "repo:gospelib/main:*"
    }
  }
}
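To see which token subjects this pattern admits, the match can be approximated with a shell glob. This is only an illustration of the wildcard semantics, not how IAM evaluates the policy, and the function name is hypothetical.

```shell
# Approximate the StringLike match with a shell glob.
# IAM's "*" matches any run of characters, like "*" in a case pattern.
sub_is_trusted() {
  case "$1" in
    repo:gospelib/main:*) return 0 ;;
    *) return 1 ;;
  esac
}

sub_is_trusted "repo:gospelib/main:ref:refs/heads/main" && echo "allowed"
sub_is_trusted "repo:gospelib/other:ref:refs/heads/main" || echo "denied"
```

The colon before the wildcard matters: without it, a repository whose name merely starts with the same characters would also match.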
The role (gospelib-github-actions) has two permission sets:
- ECR PowerUser -- push and pull container images
- Custom inline policy -- read secrets from Secrets Manager, read/write to S3 artifacts bucket
AWS Resource Map
Staging
| Resource | Type | Size | Monthly Cost |
|---|---|---|---|
| EC2 (k3s) | t3.medium | 2 vCPU, 4 GB | ~$30 |
| RDS PostgreSQL | db.t3.micro | 1 vCPU, 1 GB | Free tier / ~$15 |
| ElastiCache Redis | cache.t3.micro | 1 vCPU, 0.5 GB | Free tier / ~$13 |
| ECR | per-service repos | varies | ~$1 |
| Route53 | 1 hosted zone | -- | $0.50 |
| S3 (state + artifacts) | 2 buckets | -- | < $0.10 |
| Total | | | ~$2.50 (free tier) / ~$60 |
Production
| Resource | Type | Size | Monthly Cost |
|---|---|---|---|
| EKS cluster | managed | -- | ~$73 |
| EC2 nodes | 2x t3.medium | 2 vCPU, 4 GB each | ~$60 |
| RDS PostgreSQL | db.r6g.large | 2 vCPU, 16 GB | ~$50 |
| ElastiCache Redis | cache.r6g.large | 2 vCPU, 13 GB | ~$25 |
| CloudFront | 2 distributions | web + admin | ~$5 |
| Total | | | ~$213 |
Kubernetes Manifests
Kubernetes configuration uses Kustomize with a base + overlay pattern:
infra/k8s/
├── base/ # Shared manifests for all services
│ ├── gateway/
│ ├── content/
│ ├── auth/
│ ├── billing/
│ ├── ai/
│ ├── notifications/
│ └── monitoring/ # Prometheus alerts, Grafana config
├── overlays/
│ ├── staging/ # 1 replica, small resource limits
│ │ └── kustomization.yaml # Image tags updated by CD pipeline
│ └── production/ # 2 replicas, larger resource limits
│ └── kustomization.yaml
├── jobs/
│ ├── ingest-full.yaml # Full corpus ingest (destroys + rebuilds)
│ └── ingest-incremental.yaml # Delta ingest
└── argocd/
    └── application.yaml # ArgoCD Application pointing to stage branch
The CD pipeline updates image tags in the overlay's kustomization.yaml via kustomize edit set image, commits the change, and ArgoCD picks it up automatically.
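A minimal sketch of that tag bump follows. It assumes the base manifests reference each image by its service name, that REGISTRY holds the ECR registry host, and that the commit message format is free-form; none of these are confirmed by the pipeline definition, and `bump_image` is a hypothetical name.

```shell
# Sketch of the image-tag update performed by the CD pipeline.
# REGISTRY and the commit message are assumptions, not the pipeline's values.
bump_image() {
  overlay="$1"   # "staging" or "production"
  service="$2"   # e.g. "gateway"
  tag="$3"       # e.g. a git SHA
  (
    cd "infra/k8s/overlays/${overlay}" &&
      kustomize edit set image \
        "${service}=${REGISTRY:?set REGISTRY to the registry host}/${service}:${tag}" &&
      git commit -m "deploy(${overlay}): ${service} ${tag}" -- kustomization.yaml
  )
}
```

Running the steps in a subshell keeps the caller's working directory unchanged.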
Staging vs. Production Differences
The Kustomize overlays differ only in:
| Parameter | Staging | Production |
|---|---|---|
| Namespace | gospelib-staging | gospelib-production |
| Replica count | 1 | 2 |
| CPU request | 50m | 100m |
| CPU limit | 200m | 500m |
| Memory request | 64Mi | 128Mi |
| Memory limit | 256Mi | 512Mi |
| Secrets source | kubectl create secret | External Secrets Operator (AWS Secrets Manager) |
Docker Compose (Local Development)
The local development stack runs through Docker Compose at infra/docker/compose.yml with dev overrides in compose.dev.yml. Services are organized into profiles so you can start only what you need.
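Starting the stack by hand might look like the sketch below. It assumes compose.dev.yml sits next to compose.yml in infra/docker/; the `stack_up` wrapper is hypothetical, and the project's pnpm scripts may pass different flags.

```shell
# Sketch: start the local stack, optionally enabling extra profiles.
# Usage: stack_up [profile...]
stack_up() {
  profile_args=""
  for p in "$@"; do
    profile_args="$profile_args --profile $p"
  done
  # $profile_args is left unquoted on purpose so it splits into separate flags.
  docker compose \
    -f infra/docker/compose.yml -f infra/docker/compose.dev.yml \
    $profile_args up -d
}

# Core services only:          stack_up
# Core plus observability:     stack_up observability
```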
Core Services (always started)
| Container | Image | Port | Purpose |
|---|---|---|---|
| falkordb | falkordb/falkordb:v4.16.6 | 6379 | Scripture graph database |
| postgres | pgvector/pgvector:pg16 | 5432 | User data, auth, billing |
Observability Profile
Started with pnpm infra:observability, or by passing --profile observability to Docker Compose:
| Container | Port | Purpose |
|---|---|---|
| Grafana | 3000 | Dashboards and exploration |
| Prometheus | 9090 | Metrics backend |
| Loki | 3100 | Log aggregation |
| Tempo | (internal) | Distributed tracing |
| Pyroscope | 4040 | Continuous profiling |
| Alloy | 12345 (UI), 12347 (Faro) | Unified collection agent |
| FalkorDB Browser | 3004 | Graph visualization |
Resource Limits
FalkorDB is the heaviest local container, configured with a 16 GB memory limit to handle the full scripture graph (~24M nodes). All other containers have modest limits (256 MB or less).
FalkorDB and general Redis are separate instances. FalkorDB uses port 6379; if you also run a Redis cache locally, use port 6380 to avoid conflicts.
Secrets Management
Secrets follow different patterns per environment:
| Environment | Method | Details |
|---|---|---|
| Local | .env files | Template checked in as .env.example; actual values gitignored |
| CI | GitHub Actions secrets | Per-environment secret sets (staging, production) |
| Staging | kubectl create secret | Manual one-time creation during setup |
| Production | External Secrets Operator | Auto-synced from AWS Secrets Manager every hour |
Required secrets are documented in the Deployment Overview.
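For the staging row, the one-time creation could be sketched as below. The secret name, key, and value are supplied by the caller, because the real list is maintained in the Deployment Overview; the function name is hypothetical.

```shell
# Sketch: one-time creation of a staging secret.
# Secret names and keys are placeholders; see the Deployment Overview.
create_staging_secret() {
  name="$1"
  key="$2"
  value="$3"
  kubectl create secret generic "$name" \
    --namespace gospelib-staging \
    --from-literal="${key}=${value}"
}

# Example with placeholder values:
# create_staging_secret demo-secret DEMO_KEY "demo-value"
```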
DNS and TLS
| Domain | Target | Environment |
|---|---|---|
| staging.gospelib.com | EC2 Elastic IP | Staging web |
| api-staging.gospelib.com | EC2 Elastic IP | Staging API |
| gospelib.com | CloudFront | Production web |
| api.gospelib.com | EKS ALB | Production API |
| admin.gospelib.com | CloudFront | Production admin |
| grafana.gospelib.com | EKS ALB | Observability |
TLS certificates are managed by cert-manager using Let's Encrypt with HTTP-01 challenges through the nginx ingress class. Certificates auto-renew 30 days before expiry, with an alert firing at 14 days if renewal fails.
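When the alert fires, a manual check of the served certificate can confirm the remaining lifetime. The sketch below uses openssl rather than the project's alert rule, which is not shown on this page; the function name is hypothetical.

```shell
# Sketch: succeed if the host's certificate stays valid for at least N more days.
cert_valid_for_days() {
  host="$1"
  days="$2"
  # Fetch the served certificate, then let -checkend test the window:
  # it exits 0 if the certificate will NOT expire within that many seconds.
  openssl s_client -connect "${host}:443" -servername "$host" </dev/null 2>/dev/null |
    openssl x509 -noout -checkend "$((days * 86400))" >/dev/null
}

# Example, mirroring the 14-day alert threshold:
# cert_valid_for_days api.gospelib.com 14 || echo "renewal overdue"
```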
Related Pages
- CI/CD Pipeline -- build and deployment automation
- Deploy to Staging -- step-by-step staging provisioning
- Deploy to Production -- EKS promotion with approval gates
- Monitoring & Observability -- the Grafana stack in depth
- Rollback Procedures -- infrastructure state recovery