A production-grade Go microservice designed for zero-downtime Kubernetes deployments, automated CI/CD, and high-security standards.
This service was engineered for high-availability deployment on Google Kubernetes Engine (GKE Autopilot):
- Zero-Downtime Deployments - Rolling updates with health probes and graceful shutdown
- Multi-Environment Architecture - Isolated staging and production namespaces
- Automated CI/CD Pipeline - Quality gates, security scanning, progressive deployment
- Cloud-Native Design - GKE Autopilot-ready with managed PostgreSQL support
- Observability - Structured JSON logging with request tracing
- Docker Optimization - 98% size reduction (1.87GB → 36MB)
Deployment Architecture:
- Production: http://<PRODUCTION_IP> (Health Check)
- Staging: http://<STAGING_IP> (Health Check)
Note: The live deployment on GKE has been paused to minimize cloud costs. The infrastructure code remains fully functional and can be redeployed at any time.
┌───────────────────────────────────────────────────────────────┐
│                     GitHub Actions CI/CD                      │
│                                                               │
│  Lint → Formatting → Security Scan → Tests → Build → Deploy   │
└─────────────┬─────────────────────────────────────────────────┘
              │
              ▼
┌───────────────────────────────────────────────────────────────┐
│                Google Kubernetes Engine (GKE)                 │
│                                                               │
│  ┌──────────────────────┐      ┌──────────────────────┐       │
│  │  Staging Namespace   │      │ Production Namespace │       │
│  │                      │      │                      │       │
│  │  Load Balancer       │      │  Load Balancer       │       │
│  │  <STAGING_IP>        │      │  <PRODUCTION_IP>     │       │
│  │         ↓            │      │         ↓            │       │
│  │  Service (LB)        │      │  Service (LB)        │       │
│  │         ↓            │      │         ↓            │       │
│  │  Deployment          │      │  Deployment          │       │
│  │  • 2 Replicas        │      │  • 3 Replicas        │       │
│  │  • Health Probes     │      │  • Health Probes     │       │
│  │  • Auto Rollback     │      │  • Zero Downtime     │       │
│  └──────────┬───────────┘      └──────────┬───────────┘       │
│             │                             │                   │
└─────────────┼─────────────────────────────┼───────────────────┘
              │                             │
              ▼                             ▼
     ┌─────────────────┐           ┌─────────────────┐
     │  Neon Staging   │           │ Neon Production │
     │  PostgreSQL     │           │  PostgreSQL     │
     │  (SSL/TLS)      │           │  (SSL/TLS)      │
     └─────────────────┘           └─────────────────┘
- Rolling updates with `maxUnavailable: 0` in production
- Health probes prevent traffic to unhealthy pods (liveness + readiness)
- Graceful shutdown with configurable preStop hooks
- Automated smoke tests validate deployments before traffic routing
- Instant rollback on deployment failure
- Parallel quality gates: Linting, security scanning (gosec), unit tests
- Progressive deployment: Staging (automatic) → Production (manual approval)
- Immutable deployments: SHA-tagged Docker images for traceability
- Environment isolation: Separate namespaces, configs, and databases
- Automated rollback: Failed deployments revert automatically
- GKE Autopilot: Fully managed Kubernetes with auto-scaling
- Managed Database: Neon PostgreSQL with SSL/TLS encryption
- Secret Management: Kubernetes secrets (not hardcoded credentials)
- Resource Optimization: CPU/memory limits prevent resource exhaustion
- Security Hardening: Non-root containers, minimal attack surface
- Structured JSON logging with automatic log levels (INFO/WARN/ERROR)
- Request tracing via unique `X-Request-ID` headers
- Health endpoints: `/health` (liveness) and `/ready` (readiness)
- Prometheus-ready: Annotations for metrics scraping
- Latency tracking: Automatic request duration logging
- 36MB final image (98% reduction from naive 1.87GB build)
- Multi-stage distroless build for minimal attack surface
- Static binary compilation (no runtime dependencies)
- Security: Runs as non-root user (uid 65532)
- Build caching: Optimized layer structure for fast rebuilds
| Component | Technology | Purpose |
|---|---|---|
| Language | Go 1.25 | High-performance backend |
| Web Framework | Gin | HTTP routing and middleware |
| Database | PostgreSQL (Neon) | Managed, serverless SQL database |
| Container | Docker (Distroless) | Minimal, secure runtime |
| Orchestration | Kubernetes (GKE Autopilot) | Zero-downtime deployments |
| CI/CD | GitHub Actions | Automated testing and deployment |
| Logging | log/slog | Structured JSON logging |
| Registry | GitHub Container Registry (GHCR) | Docker image storage |
# Clone the repository
git clone https://github.com/fahadAziz44/zero-downtime-go-api.git
cd zero-downtime-go-api
# Start database and application with Docker Compose
docker-compose up --build
# Run database migrations (in another terminal)
make migrate-up
# The API will be available at http://localhost:8080

# Replace <PRODUCTION_IP> with your actual GKE Load Balancer IP
# Production health check
curl http://<PRODUCTION_IP>/health
# List all users
curl http://<PRODUCTION_IP>/api/v1/users
# Create a user
curl -X POST http://<PRODUCTION_IP>/api/v1/users \
-H "Content-Type: application/json" \
-d '{
"username": "johndoe",
"email": "johndoe@example.com",
"full_name": "John Doe"
}'
# Get user by username
curl http://<PRODUCTION_IP>/api/v1/users/username/johndoe

# Run all validation checks (lint, security, tests)
make validate
# Run tests with coverage
make coverage
# Build Docker image locally
make docker-build
# Run application locally (DB in Docker)
make db # Start PostgreSQL container
make migrate-up # Run migrations
make run # Start Go application

Base URL:
- Local: `http://localhost:8080/api/v1`
- Production: `http://<PRODUCTION_IP>/api/v1` (replace with your GKE Load Balancer IP)
| Method | Endpoint | Description |
|---|---|---|
| GET | `/health` | Liveness probe (Kubernetes) |
| GET | `/ready` | Readiness probe (database connectivity) |
| GET | `/users` | List all users |
| GET | `/users/username/:username` | Get user by username |
| GET | `/users/id/:id` | Get user by UUID |
| POST | `/users` | Create new user |
| PATCH | `/users/id/:id` | Update user by UUID |
| DELETE | `/users/id/:id` | Delete user by UUID |
Example Request:
curl -X POST http://localhost:8080/api/v1/users \
-H "Content-Type: application/json" \
-d '{
"username": "alice",
"email": "alice@example.com",
"full_name": "Alice Johnson"
}'

Note: Replace localhost:8080 with your deployment URL when running in Kubernetes.
Example Response:
{
"id": "123e4567-e89b-12d3-a456-426614174000",
"username": "alice",
"email": "alice@example.com",
"full_name": "Alice Johnson",
"created_at": "2024-11-19T10:30:00Z",
"updated_at": "2024-11-19T10:30:00Z"
}

- UUID-based primary keys (prevents enumeration attacks)
- SQL injection prevention (parameterized queries)
- Input validation (username, email, full_name constraints)
- Non-root containers (uid 65532, dropped capabilities)
- Minimal Docker images (distroless, no shell, no package manager)
- TLS/SSL database connections (Neon PostgreSQL requires encryption)
- Optional X-API-Key authentication (header-based access control)
- Secret management (Kubernetes secrets, not hardcoded)
Triggers: All branches and pull requests
Duration: ~45 seconds (parallel execution)
Lint (golangci-lint) ─────┐
                          │
Code formatting (gofmt) ──┤
                          ├──> Quality Gate
Security Scan (gosec) ────┤
                          │
Unit Tests ───────────────┤
                          │
Build Verification ───────┘
Triggers: Push to master branch
Duration: ~10-15 minutes
Build & Push Docker Image (SHA-tagged)
↓
Deploy to Staging (automatic)
• Update image with SHA tag
• Rolling update (2 replicas)
• Smoke tests (/health, /ready)
↓
Deploy to Production (manual approval required)
• Update image with SHA tag
• Zero-downtime rolling update (3 replicas)
• Smoke tests (/health, /ready)
• Auto-rollback on failure
Note: Deployments to GKE are currently disabled. The deployment code remains visible to demonstrate CI/CD practices. To enable deployments, see the Enabling Deployments section in kubernetes/README_GKE.md.
Key Features:
- Immutable deployments: Every commit creates a unique SHA-tagged image
- Progressive rollout: Staging validates changes before production
- Automated validation: Health checks prevent bad deployments
- Traceability: Know exactly which commit is running in each environment
| Setting | Staging | Production |
|---|---|---|
| Replicas | 2 | 3 |
| Downtime Tolerance | 50% (1 pod) | 0% (zero-downtime) |
| Memory | 128Mi-256Mi | 256Mi-512Mi |
| CPU | 100m-500m | 250m-1000m |
| Database | Neon dev branch | Neon production branch |
| Deployment | Automatic | Manual approval |
Production deployment strategy:
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1 # Create 1 extra pod during update
    maxUnavailable: 0 # Never drop below 3 running pods

Health probes prevent bad deployments:
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 15
  periodSeconds: 20
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 10
  failureThreshold: 3

Graceful shutdown prevents connection drops:
lifecycle:
  preStop:
    exec:
      # Assumes /bin/sh exists in the image; shell-less distroless images cannot run this command
      command: ["/bin/sh", "-c", "sleep 10"] # Drain connections

Comprehensive documentation is available in the docs/ directory:
- ARCHITECTURE.md - System design and zero-downtime deployment strategy
- RUNBOOK.md - Operational guide for deployments and troubleshooting
- DOCKER_SIZE_OPTIMIZATION.md - Docker optimization journey (1.87GB → 36MB)
- JSON_LOGGING_IMPLEMENTATION.md - Structured logging design
- Kubernetes Deployment Guide - Kubernetes manifest documentation
- Project Origin - How this project evolved from an assignment
# Run all unit tests
make test
# Run tests with coverage report
make coverage
# Generate HTML coverage report
make coverage-html
open coverage.html

Test Coverage: Service layer has comprehensive unit tests following the Given-When-Then pattern.
Example test:
func TestGetByID_Success(t *testing.T) {
	// Given: A valid user exists in the repository
	// (setup of service, ctx, and validID is omitted in this excerpt)

	// When: Fetching user by ID
	user, err := service.GetByID(ctx, validID)

	// Then: User is returned without error
	assert.NoError(t, err)
	assert.Equal(t, "johndoe", user.Username)
}

# Test local deployment
./test-api.sh http://localhost:8080
# Test production deployment (replace with your IP)
./test-api.sh http://<PRODUCTION_IP>
# Keep test data for debugging
./test-api.sh --no-cleanup
---
## **Configuration**
The application uses **environment-based configuration** with validation and fail-fast behavior.
**Required Environment Variables:**
```bash
POSTGRES_USER=your_user
POSTGRES_PASSWORD=your_password
```
Optional Environment Variables (with defaults):
POSTGRES_HOST=localhost # Database host
POSTGRES_PORT=5432 # Database port
POSTGRES_DB=cruder # Database name
POSTGRES_SSL_MODE=disable # SSL mode (use 'require' in production)
PORT=8080 # Application port
API_KEY= # Optional API key for authentication

Development Setup:
- Copy `.env.example` to `.env`
- Update `POSTGRES_USER` and `POSTGRES_PASSWORD`
- Start the application with `docker-compose up`
Production Setup:
- Configuration is managed via Kubernetes ConfigMaps and Secrets
- Sensitive credentials (database password, API keys) are stored in Kubernetes Secrets
- Non-sensitive config (database host, port) is stored in ConfigMaps
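The environment-based configuration with validation and fail-fast behavior described above could look roughly like the following. This is a sketch, not the project's loader: the `Config` struct, `loadConfig`, and the port range check are assumptions, while the variable names and defaults are taken from the lists above. The lookup function is injected so the example can run with demo values; a real service would pass `os.Getenv`.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
)

// Config holds the settings read from the environment.
type Config struct {
	DBUser, DBPassword string
	DBHost, DBName     string
	DBSSLMode, APIKey  string
	DBPort, Port       int
}

// loadConfig reads settings through getenv, applies the documented defaults,
// and returns every validation problem at once instead of only the first.
func loadConfig(getenv func(string) string) (Config, error) {
	var errs []error
	required := func(key string) string {
		v := getenv(key)
		if v == "" {
			errs = append(errs, fmt.Errorf("%s is required", key))
		}
		return v
	}
	optional := func(key, def string) string {
		if v := getenv(key); v != "" {
			return v
		}
		return def
	}
	port := func(key, def string) int {
		n, err := strconv.Atoi(optional(key, def))
		if err != nil || n < 1 || n > 65535 {
			errs = append(errs, fmt.Errorf("%s must be a port between 1 and 65535", key))
		}
		return n
	}

	cfg := Config{
		DBUser:     required("POSTGRES_USER"),
		DBPassword: required("POSTGRES_PASSWORD"),
		DBHost:     optional("POSTGRES_HOST", "localhost"),
		DBPort:     port("POSTGRES_PORT", "5432"),
		DBName:     optional("POSTGRES_DB", "cruder"),
		DBSSLMode:  optional("POSTGRES_SSL_MODE", "disable"),
		Port:       port("PORT", "8080"),
		APIKey:     getenv("API_KEY"), // empty means authentication is disabled
	}
	return cfg, errors.Join(errs...) // nil when no problems were collected
}

func main() {
	// Demo values only; a real service would call loadConfig(os.Getenv).
	env := map[string]string{"POSTGRES_USER": "your_user", "POSTGRES_PASSWORD": "your_password"}
	cfg, err := loadConfig(func(key string) string { return env[key] })
	if err != nil {
		fmt.Fprintln(os.Stderr, "invalid configuration:", err)
		os.Exit(1) // fail fast before opening any connections
	}
	fmt.Printf("db=%s:%d/%s sslmode=%s port=%d\n", cfg.DBHost, cfg.DBPort, cfg.DBName, cfg.DBSSLMode, cfg.Port)
}
```

Collecting all errors before exiting means a misconfigured pod reports every missing variable in a single log line rather than one per restart.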
The API supports optional X-API-Key authentication:
Enable authentication:
# Add to .env file
API_KEY=your-secret-key-here

Make authenticated requests:
curl -H "X-API-Key: your-secret-key-here" \
http://localhost:8080/api/v1/users

Responses:
- Valid key → Request proceeds
- Missing header → `401 Unauthorized`
- Wrong key → `403 Forbidden`
Development mode: Leave API_KEY unset to disable authentication during local development.
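The optional API key behavior described above (valid key proceeds, missing header gives 401, wrong key gives 403, unset key disables the check) can be sketched as a middleware. This version wraps a plain `net/http` handler instead of a Gin handler, and `requireAPIKey`, `statusFor`, and the error messages are hypothetical names chosen for the example. The constant-time comparison is an added precaution, not something the source confirms.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// requireAPIKey enforces the X-API-Key header when expected is non-empty.
// An empty expected key disables authentication, as in development mode.
func requireAPIKey(expected string, next http.Handler) http.Handler {
	if expected == "" {
		return next
	}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		got := r.Header.Get("X-API-Key")
		if got == "" {
			http.Error(w, "missing API key", http.StatusUnauthorized)
			return
		}
		// Constant-time comparison avoids leaking key contents via timing.
		if subtle.ConstantTimeCompare([]byte(got), []byte(expected)) != 1 {
			http.Error(w, "invalid API key", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor sends one request with the given key through the middleware and
// returns the resulting status code.
func statusFor(expected, sent string) int {
	ok := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	req := httptest.NewRequest(http.MethodGet, "/api/v1/users", nil)
	if sent != "" {
		req.Header.Set("X-API-Key", sent)
	}
	rec := httptest.NewRecorder()
	requireAPIKey(expected, ok).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("secret", "secret")) // 200
	fmt.Println(statusFor("secret", ""))       // 401
	fmt.Println(statusFor("secret", "wrong"))  // 403
	fmt.Println(statusFor("", ""))             // 200, authentication disabled
}
```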
Potential improvements to make this even more production-ready:
- HTTPS/TLS - SSL certificates for secure communication
- Rate Limiting - Protect API from abuse (currently implemented at LB level via Cloud Armor)
- Monitoring - Prometheus/Grafana dashboards with alerts
- Terraform - Infrastructure as Code for GKE and Neon
- Integration Tests - End-to-end API validation in CI/CD
- Database Backups - Automated backup and restore procedures
- API Documentation - Swagger/OpenAPI specification
- JWT Authentication - Per-user authentication (currently using API key)
- Pagination - Handle large datasets efficiently
- Feature Flags - Gradual rollouts and safe feature deployment
This project is licensed under the MIT License - see the LICENSE file for details.
This project evolved from a technical assessment into a comprehensive exploration of production-grade backend architecture. It represents the type of system I'd build for real-world use, with all the operational considerations that come with running services in production.
Built by: Fahad Aziz | GitHub: @fahadAziz44
If you find this useful, please consider giving it a star!