Infrastructure

GospeLib infrastructure is defined entirely as code across three layers: Terraform for AWS resources, Kustomize for Kubernetes manifests, and Docker Compose for local development. Nothing is configured manually in the AWS Console.

Terraform Layout

All Terraform configuration lives in infra/terraform/:

infra/terraform/
├── backend.tf          # S3 state backend + DynamoDB locking
├── main.tf             # Root module wiring
├── variables.tf        # Shared variable definitions
├── outputs.tf          # Exported values (IPs, endpoints, ARNs)
├── modules/
│   ├── ecr/            # Container registries (one per service)
│   ├── eks/            # EKS cluster + managed node groups (production)
│   ├── rds/            # PostgreSQL 16 + pgvector
│   ├── elasticache/    # Redis 7
│   ├── s3/             # Artifact and backup buckets
│   ├── cloudfront/     # CDN distributions
│   ├── route53/        # DNS records
│   └── secrets/        # AWS Secrets Manager entries
└── environments/
    ├── staging/        # t3.medium k3s, db.t3.micro, cache.t3.micro
    └── production/     # EKS cluster, db.r6g.large, cache.r6g.large

Each environment directory contains its own terraform.tfvars (gitignored) and calls the shared modules with environment-specific sizing.

State Management

Terraform state is stored in S3 with DynamoDB locking to prevent concurrent modifications:

| Resource | Name | Purpose |
|---|---|---|
| S3 bucket | gospelib-terraform-state | Versioned state storage |
| DynamoDB table | gospelib-terraform-lock | Distributed lock |

State versioning is enabled on the S3 bucket, which provides the rollback path for infrastructure changes -- you can retrieve any previous state version and push it back.
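
As a rough illustration of that rollback path, the sketch below assembles the CLI invocations involved: list the stored versions, download one, and push it back. The bucket name comes from the table above; the state object key and the version ID are hypothetical placeholders, so check backend.tf for the real key before running anything.

```python
from typing import List

STATE_BUCKET = "gospelib-terraform-state"   # from the state table above
STATE_KEY = "staging/terraform.tfstate"     # hypothetical key; check backend.tf


def rollback_commands(version_id: str, out_file: str = "previous.tfstate") -> List[List[str]]:
    """Return the CLI invocations for restoring one earlier state version."""
    return [
        # 1. List stored versions to find the VersionId to restore.
        ["aws", "s3api", "list-object-versions",
         "--bucket", STATE_BUCKET, "--prefix", STATE_KEY],
        # 2. Download that specific version to a local file.
        ["aws", "s3api", "get-object",
         "--bucket", STATE_BUCKET, "--key", STATE_KEY,
         "--version-id", version_id, out_file],
        # 3. Push the downloaded file back as the current state.
        ["terraform", "state", "push", out_file],
    ]


# Dry run: print the commands instead of executing them.
for cmd in rollback_commands("EXAMPLE_VERSION_ID"):
    print(" ".join(cmd))
```

Note that terraform state push compares lineage and serial numbers and may refuse to overwrite a newer remote state with an older one. Its -force flag overrides that check, so use it deliberately.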

Bootstrapping

Before terraform init can run, the state bucket and lock table must exist. These are the only resources created outside of Terraform. See Deploy to Staging for the exact commands, or refer to the source guide at docs/guides/terraform-bootstrap.md.

The bootstrap sequence:

  1. Create the S3 state bucket with versioning and encryption
  2. Create the DynamoDB lock table
  3. Create the GitHub Actions OIDC provider (no long-lived credentials)
  4. Create the IAM role with ECR and Secrets Manager permissions
  5. Create the EC2 SSH key pair
  6. Run terraform init and terraform apply
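
Steps 1 and 2 of the sequence above can be sketched as a dry-run command builder. This is not the exact command set from the Deploy to Staging guide. The region argument and the AES256 encryption setting are assumptions. The LockID hash key is the attribute name Terraform's S3 backend expects for its lock table.

```python
import json
from typing import List

STATE_BUCKET = "gospelib-terraform-state"
LOCK_TABLE = "gospelib-terraform-lock"


def bootstrap_commands(region: str) -> List[List[str]]:
    """Build AWS CLI invocations for the state bucket and lock table."""
    create_bucket = ["aws", "s3api", "create-bucket",
                     "--bucket", STATE_BUCKET, "--region", region]
    # us-east-1 rejects an explicit LocationConstraint; other regions need one.
    if region != "us-east-1":
        create_bucket += ["--create-bucket-configuration",
                          f"LocationConstraint={region}"]

    # Assumption: SSE-S3 (AES256) default encryption; the project may use KMS.
    encryption = {"Rules": [{"ApplyServerSideEncryptionByDefault":
                             {"SSEAlgorithm": "AES256"}}]}

    return [
        create_bucket,
        ["aws", "s3api", "put-bucket-versioning", "--bucket", STATE_BUCKET,
         "--versioning-configuration", "Status=Enabled"],
        ["aws", "s3api", "put-bucket-encryption", "--bucket", STATE_BUCKET,
         "--server-side-encryption-configuration", json.dumps(encryption)],
        ["aws", "dynamodb", "create-table", "--table-name", LOCK_TABLE,
         "--attribute-definitions", "AttributeName=LockID,AttributeType=S",
         "--key-schema", "AttributeName=LockID,KeyType=HASH",
         "--billing-mode", "PAY_PER_REQUEST", "--region", region],
    ]


# Dry run for a hypothetical region: print each command.
for cmd in bootstrap_commands("us-west-2"):
    print(" ".join(cmd))
```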

GitHub Actions OIDC

The CD pipeline authenticates to AWS via OpenID Connect rather than stored access keys. A trust policy on the IAM role restricts access to the gospelib/main repository:

{
  "Condition": {
    "StringLike": {
      "token.actions.githubusercontent.com:sub": "repo:gospelib/main:*"
    }
  }
}
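
To show where that condition sits, here is a minimal sketch of a complete trust policy document plus a local check of which sub claims the pattern would accept. The account ID is a placeholder, and the aud condition is an assumption (it is commonly paired with the sub check but is not shown in the fragment above). The matcher only approximates IAM's StringLike with shell-style globbing.

```python
import json
from fnmatch import fnmatchcase

ACCOUNT_ID = "123456789012"  # placeholder account ID
OIDC_HOST = "token.actions.githubusercontent.com"
SUB_PATTERN = "repo:gospelib/main:*"


def trust_policy() -> dict:
    """Assemble a trust policy around the condition shown above."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{ACCOUNT_ID}:oidc-provider/{OIDC_HOST}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                # Assumption: audience check, not part of the original fragment.
                "StringEquals": {f"{OIDC_HOST}:aud": "sts.amazonaws.com"},
                "StringLike": {f"{OIDC_HOST}:sub": SUB_PATTERN},
            },
        }],
    }


def sub_allowed(sub: str) -> bool:
    """Approximate StringLike: case-sensitive match with * and ? wildcards."""
    return fnmatchcase(sub, SUB_PATTERN)


print(json.dumps(trust_policy(), indent=2))
print(sub_allowed("repo:gospelib/main:ref:refs/heads/main"))
print(sub_allowed("repo:someone-else/main:ref:refs/heads/main"))
```

Because the pattern ends in a wildcard, any branch, tag, or environment in the gospelib/main repository matches. Narrowing the suffix (for example to a single branch ref) would tighten the trust relationship.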

The role (gospelib-github-actions) has two permission sets:

  • ECR PowerUser -- push and pull container images
  • Custom inline policy -- read secrets from Secrets Manager, read/write to S3 artifacts bucket

AWS Resource Map

Staging

| Resource | Type | Size | Monthly Cost |
|---|---|---|---|
| EC2 (k3s) | t3.medium | 2 vCPU, 4 GB | ~$30 |
| RDS PostgreSQL | db.t3.micro | 2 vCPU, 1 GB | Free tier / ~$15 |
| ElastiCache Redis | cache.t3.micro | 2 vCPU, 0.5 GB | Free tier / ~$13 |
| ECR | per-service repos | varies | ~$1 |
| Route53 | 1 hosted zone | -- | $0.50 |
| S3 (state + artifacts) | 2 buckets | -- | < $0.10 |
| Total | | | ~$32 (free tier) / ~$60 |

Production

| Resource | Type | Size | Monthly Cost |
|---|---|---|---|
| EKS cluster | managed | -- | ~$73 |
| EC2 nodes | 2x t3.medium | 2 vCPU, 4 GB each | ~$60 |
| RDS PostgreSQL | db.r6g.large | 2 vCPU, 16 GB | ~$50 |
| ElastiCache Redis | cache.r6g.large | 2 vCPU, 13 GB | ~$25 |
| CloudFront | 2 distributions | web + admin | ~$5 |
| Total | | | ~$213 |
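
The totals can be re-derived from the line items when an instance size changes. The sketch below sums the approximate full-price figures from the two tables above, treating the "< $0.10" S3 entry as $0.10. The dictionary names are illustrative only.

```python
# Approximate monthly costs in USD, copied from the tables above
# (full-price figures, ignoring free-tier credits).
STAGING = {
    "EC2 (k3s)": 30.00,
    "RDS PostgreSQL": 15.00,
    "ElastiCache Redis": 13.00,
    "ECR": 1.00,
    "Route53": 0.50,
    "S3 (state + artifacts)": 0.10,  # upper bound of "< $0.10"
}

PRODUCTION = {
    "EKS cluster": 73.00,
    "EC2 nodes": 60.00,
    "RDS PostgreSQL": 50.00,
    "ElastiCache Redis": 25.00,
    "CloudFront": 5.00,
}


def monthly_total(items: dict) -> float:
    """Sum the line items, rounded to cents."""
    return round(sum(items.values()), 2)


print(f"staging:    ~${monthly_total(STAGING):.2f}")
print(f"production: ~${monthly_total(PRODUCTION):.2f}")
```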

Kubernetes Manifests

Kubernetes configuration uses Kustomize with a base + overlay pattern:

infra/k8s/
├── base/                           # Shared manifests for all services
│   ├── gateway/
│   ├── content/
│   ├── auth/
│   ├── billing/
│   ├── ai/
│   ├── notifications/
│   └── monitoring/                 # Prometheus alerts, Grafana config
├── overlays/
│   ├── staging/                    # 1 replica, small resource limits
│   │   └── kustomization.yaml      # Image tags updated by CD pipeline
│   └── production/                 # 2 replicas, larger resource limits
│       └── kustomization.yaml
├── jobs/
│   ├── ingest-full.yaml            # Full corpus ingest (destroys + rebuilds)
│   └── ingest-incremental.yaml     # Delta ingest
└── argocd/
    └── application.yaml            # ArgoCD Application pointing to stage branch

The CD pipeline updates image tags in the overlay's kustomization.yaml with kustomize edit set image and commits the change. ArgoCD then picks up the commit automatically.
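
A minimal sketch of that tag-update step, assuming the pipeline shells out to the kustomize CLI from inside the overlay directory. The image names and tag below are hypothetical; the real names depend on how the ECR repositories are referenced in the base manifests.

```python
import subprocess
from typing import Dict, List


def set_image_command(image_name: str, new_tag: str) -> List[str]:
    """Build `kustomize edit set image <name>=<name>:<tag>` for one image."""
    return ["kustomize", "edit", "set", "image",
            f"{image_name}={image_name}:{new_tag}"]


def update_overlay(overlay_dir: str, images: Dict[str, str],
                   dry_run: bool = True) -> List[List[str]]:
    """Set every image tag in the overlay; only executes when dry_run is False."""
    commands = [set_image_command(name, tag)
                for name, tag in sorted(images.items())]
    if not dry_run:
        for cmd in commands:
            # kustomize edits the kustomization.yaml in the working directory.
            subprocess.run(cmd, cwd=overlay_dir, check=True)
    return commands


# Hypothetical image names and commit-SHA tag, printed as a dry run.
for cmd in update_overlay("infra/k8s/overlays/staging",
                          {"gospelib/gateway": "abc1234",
                           "gospelib/content": "abc1234"}):
    print(" ".join(cmd))
```

Committing the modified kustomization.yaml is left out here; in a GitOps flow that commit is what ArgoCD reacts to.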

Staging vs. Production Differences

The Kustomize overlays differ only in:

| Parameter | Staging | Production |
|---|---|---|
| Namespace | gospelib-staging | gospelib-production |
| Replica count | 1 | 2 |
| CPU request | 50m | 100m |
| CPU limit | 200m | 500m |
| Memory request | 64Mi | 128Mi |
| Memory limit | 256Mi | 512Mi |
| Secrets source | kubectl create secret | External Secrets Operator (AWS Secrets Manager) |

Docker Compose (Local Development)

The local development stack runs through Docker Compose at infra/docker/compose.yml with dev overrides in compose.dev.yml. Services are organized into profiles so you can start only what you need.

Core Services (always started)

| Container | Image | Port | Purpose |
|---|---|---|---|
| falkordb | falkordb/falkordb:v4.16.6 | 6379 | Scripture graph database |
| postgres | pgvector/pgvector:pg16 | 5432 | User data, auth, billing |

Observability Profile

Started with pnpm infra:observability or --profile observability:

| Container | Port | Purpose |
|---|---|---|
| Grafana | 3000 | Dashboards and exploration |
| Prometheus | 9090 | Metrics backend |
| Loki | 3100 | Log aggregation |
| Tempo | (internal) | Distributed tracing |
| Pyroscope | 4040 | Continuous profiling |
| Alloy | 12345 (UI), 12347 (Faro) | Unified collection agent |
| FalkorDB Browser | 3004 | Graph visualization |

Resource Limits

FalkorDB is the heaviest local container, configured with a 16 GB memory limit to handle the full scripture graph (~24M nodes). All other containers have modest limits (256 MB or less).

Warning: FalkorDB and general-purpose Redis are separate instances. FalkorDB uses port 6379; if you also run a Redis cache locally, use port 6380 to avoid conflicts.
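
One way to catch this conflict before starting a local Redis cache is to probe the port first. The helper below is a hypothetical sketch, not part of the repository: it tries a TCP connection to localhost and falls back to 6380 when 6379 is already taken.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def pick_redis_port(preferred: int = 6379, fallback: int = 6380) -> int:
    """Use the fallback port when the preferred one is occupied (e.g. by FalkorDB)."""
    return fallback if port_in_use(preferred) else preferred


print(f"Start the local Redis cache on port {pick_redis_port()}")
```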

Secrets Management

Secrets follow different patterns per environment:

| Environment | Method | Details |
|---|---|---|
| Local | .env files | .env.example is checked in as a template; actual values are gitignored |
| CI | GitHub Actions secrets | Per-environment secret sets (staging, production) |
| Staging | kubectl create secret | Manual one-time creation during setup |
| Production | External Secrets Operator | Auto-synced from AWS Secrets Manager every hour |

Required secrets are documented in the Deployment Overview.

DNS and TLS

| Domain | Target | Environment |
|---|---|---|
| staging.gospelib.com | EC2 Elastic IP | Staging web |
| api-staging.gospelib.com | EC2 Elastic IP | Staging API |
| gospelib.com | CloudFront | Production web |
| api.gospelib.com | EKS ALB | Production API |
| admin.gospelib.com | CloudFront | Production admin |
| grafana.gospelib.com | EKS ALB | Observability |

TLS certificates are managed by cert-manager using Let's Encrypt with HTTP-01 challenges through the nginx ingress class. Certificates auto-renew 30 days before expiry, with an alert firing at 14 days if renewal fails.
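
The two thresholds can be expressed as a small classification over the time left on a certificate. This is an illustrative sketch of the windows described above, not the actual alert rule. The status labels are made up, and the parser accepts the notAfter string format that Python's ssl module reports for a peer certificate.

```python
import ssl
from datetime import datetime, timedelta, timezone

RENEW_BEFORE = timedelta(days=30)  # cert-manager starts renewing here
ALERT_BEFORE = timedelta(days=14)  # alert fires if renewal has not succeeded


def parse_not_after(value: str) -> datetime:
    """Parse a notAfter string such as 'Jan  5 09:34:43 2018 GMT' into UTC."""
    return datetime.fromtimestamp(ssl.cert_time_to_seconds(value), tz=timezone.utc)


def cert_status(not_after: datetime, now: datetime) -> str:
    """Classify a certificate by how much validity remains."""
    remaining = not_after - now
    if remaining <= ALERT_BEFORE:
        return "alert"      # renewal should already have happened
    if remaining <= RENEW_BEFORE:
        return "renewing"   # inside the normal renewal window
    return "ok"


now = datetime.now(timezone.utc)
for days in (60, 20, 10):
    print(days, cert_status(now + timedelta(days=days), now))
```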