Merged
49 changes: 49 additions & 0 deletions .github/workflows/docker.yml
@@ -0,0 +1,49 @@
name: Create and publish a Docker image

on:
  push:
    branches:
      - main
  # Allow building/pushing on demand (e.g. before the first deploy).
  workflow_dispatch:

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  build-and-push-image:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Log in to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Extract metadata (tags, labels) for Docker
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ghcr.io/${{ github.repository }}
          tags: |
            type=raw,value=latest,enable={{is_default_branch}}
            type=sha,prefix=sha-
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Build and push Docker image
        uses: docker/build-push-action@v6
        with:
          context: .
          file: Dockerfile
          push: true
          platforms: 'linux/amd64'
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-to: type=gha,mode=max
          cache-from: type=gha
16 changes: 16 additions & 0 deletions .github/workflows/tests.yml
@@ -32,3 +32,19 @@ jobs:

      - name: Run tests
        run: make test

  helm:
    runs-on: ubuntu-latest

    steps:
      - name: Check out repository
        uses: actions/checkout@v6.0.3

      - name: Set up Helm
        uses: azure/setup-helm@v4

      - name: Lint Helm chart
        run: helm lint deploy/helm

      - name: Render production values
        run: helm template serge deploy/helm -n serge -f deploy/helm/env/prod.yaml
2 changes: 2 additions & 0 deletions .gitignore
@@ -16,3 +16,5 @@ dist/
*.db
.serena/
.jekyll-cache/
# Filled-in copy of deploy/helm/serge-secrets.example.yaml — never commit real secrets.
deploy/helm/serge-secrets.yaml
28 changes: 28 additions & 0 deletions Dockerfile
@@ -0,0 +1,28 @@
# Production image for the serge web app (reviewbot-web). Mirrors the EC2
# host: python3.11 + bubblewrap (so HELPER_SANDBOX can stay on), the
# package installed into a venv with the [web] extra (FastAPI/uvicorn),
# running uvicorn on $PORT (default 8080) as an unprivileged user. The
# embedded SQLite job store persists on a mounted volume (see deploy/helm/).
#
# The sandbox-verification image used for local bwrap testing lives at
# docker/Dockerfile and is unrelated to this one.
FROM python:3.11-slim

RUN apt-get update \
&& apt-get install -y --no-install-recommends bubblewrap ca-certificates git \
&& rm -rf /var/lib/apt/lists/*

# Unprivileged service user, mirroring ec2-user on the real host.
RUN useradd --create-home --shell /bin/bash app

The `app` user is created without an explicit UID/GID. The Helm chart hard-codes `runAsUser: 1000`, `runAsGroup: 1000`, and `fsGroup: 1000` in `podSecurityContext`. If the image ever builds with `app` mapped to a different UID (e.g., 1001), the identity the chart runs the process as will no longer match the image's `app` user, and file permissions will mismatch, likely leaving the SQLite store unreadable or unwritable. Because `useradd --gid` requires the group to exist already, the suggestion creates the group first.

Suggested change
- RUN useradd --create-home --shell /bin/bash app
+ RUN groupadd --gid 1000 app \
+ && useradd --create-home --shell /bin/bash --uid 1000 --gid 1000 app
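
One way to catch this kind of drift is to compare the UID/GID the built image reports with the values the chart expects. The following is a minimal sketch, not part of this PR: the function names and the `DOCKER` override variable are hypothetical, and it assumes the image ships the `id` binary (true for Debian-based images such as `python:3.11-slim`).

```bash
#!/usr/bin/env bash
# Sketch only. DOCKER is a hypothetical override so the check can be
# exercised without a Docker daemon; it defaults to the real docker CLI.

# image_ids IMAGE: print "uid:gid" of the image's default user.
image_ids() {
  local image="$1" uid gid
  uid="$("${DOCKER:-docker}" run --rm --entrypoint id "$image" -u)" || return 1
  gid="$("${DOCKER:-docker}" run --rm --entrypoint id "$image" -g)" || return 1
  printf '%s:%s\n' "$uid" "$gid"
}

# check_image_ids IMAGE [EXPECTED]: fail when the image's uid:gid differs
# from EXPECTED (default 1000:1000, the values quoted in the comment above).
check_image_ids() {
  local image="$1" expected="${2:-1000:1000}" actual
  actual="$(image_ids "$image")" || return 1
  if [ "$actual" != "$expected" ]; then
    echo "uid:gid mismatch: image has $actual, chart expects $expected" >&2
    return 1
  fi
  echo "ok: $actual"
}
```

Running `check_image_ids <image-tag>` after the build step would make a mismatch fail the build instead of surfacing later as a permissions error on the volume.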


WORKDIR /opt/app
COPY . /opt/app
RUN python -m venv /opt/app/.venv \
&& /opt/app/.venv/bin/pip install --upgrade pip \
&& /opt/app/.venv/bin/pip install -e '.[web]'

ENV PATH="/opt/app/.venv/bin:${PATH}"
ENV PORT=8080
EXPOSE 8080
USER app
CMD ["reviewbot-web"]
83 changes: 83 additions & 0 deletions deploy/README.md
@@ -0,0 +1,83 @@
# Deploying Serge

This directory packages Serge's web app (`reviewbot-web`) for Kubernetes.
The Helm chart is intentionally self-contained and values-driven so a team can
deploy it into its own cluster without changing application code.

## Contents

- `helm/` contains the Helm chart for the web app: Deployment, Service,
ConfigMap, optional Ingress, optional ServiceAccount, and a PersistentVolumeClaim.
- `helm/env/prod.yaml` contains the production values used for
`serge.huggingface.tech` on the open-source EKS cluster.
- `helm/serge-secrets.example.yaml` is a template for the sensitive runtime env.
Copy it to `helm/serge-secrets.yaml`, fill it locally, and never commit it.
- `scripts/deploy.sh` checks the current Kubernetes context, creates the namespace
when needed, optionally applies a local Secret file, and runs Helm.
- `scripts/logs.sh` finds the current running Serge pod and prints recent logs.

## Chart Behavior

Serge uses embedded SQLite for review/task history. The chart therefore runs a
single replica with a `Recreate` rollout strategy and mounts a PVC at
`persistence.mountPath` (`/var/lib/reviewbot` by default). `WEB_STORE_PATH` is
set to `<mountPath>/jobs.db`, so the database survives pod restarts.

The container runs as a non-root user, drops Linux capabilities, uses
`RuntimeDefault` seccomp, and sets `fsGroup` so the app user can write the
volume. Sensitive values are loaded from a pre-created Secret via
`existingSecret`; non-secret runtime config lives in `envVars`.

## Deploy

Create or update the Secret in the target namespace:

```bash
cp deploy/helm/serge-secrets.example.yaml deploy/helm/serge-secrets.yaml
$EDITOR deploy/helm/serge-secrets.yaml
deploy/scripts/deploy.sh -n serge --secret-file deploy/helm/serge-secrets.yaml
```

Deploy without applying a Secret file, assuming `serge-secrets` already exists:

```bash
deploy/scripts/deploy.sh -n serge -f deploy/helm/env/prod.yaml
```

Use `--context` when you want the script to refuse any other kube context:

```bash
deploy/scripts/deploy.sh \
--context infra:opensource-aws-use1-prod-54 \
-n serge \
-f deploy/helm/env/prod.yaml
```
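
The context guard described above can be sketched as follows. This is an illustration under assumptions, not the contents of `scripts/deploy.sh`: the function name `require_context` and the `KUBECTL` override variable are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only. KUBECTL is a hypothetical override so the check can be
# exercised without a cluster; it defaults to the real kubectl CLI.

# require_context EXPECTED: succeed only if the active kube context is EXPECTED.
require_context() {
  local expected="$1" current
  current="$("${KUBECTL:-kubectl}" config current-context)" || return 1
  if [ "$current" != "$expected" ]; then
    echo "refusing to continue: current context is '$current', expected '$expected'" >&2
    return 1
  fi
}
```

A script could call `require_context "$expected_context" || exit 1` before any command that changes the cluster, so a stale `kubectl` context fails fast.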

Fetch recent logs:

```bash
deploy/scripts/logs.sh \
--context infra:opensource-aws-use1-prod-54 \
-n serge \
--since 2h \
--grep 'error|traceback|crashed|HTTPError'
```

Print only the latest error block:

```bash
deploy/scripts/logs.sh \
--context infra:opensource-aws-use1-prod-54 \
-n serge \
--since 2h \
--last-error
```

## Notes

- The production image is published to GHCR as `ghcr.io/huggingface/serge`.
- `HELPER_SANDBOX=require` needs nodes that allow unprivileged user namespaces.
Set it to `auto` or `off` in `envVars` if the cluster cannot support that.
- Avoid `kubectl apply` for filled Secret manifests long-term: it can store
plaintext Secret values in the
`kubectl.kubernetes.io/last-applied-configuration` annotation. The deploy
helper strips that annotation after applying a Secret file.
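
Removing the annotation by hand can be sketched as below. This is an assumption about how such a cleanup might look, not the helper's actual code: the function name `strip_last_applied` and the `KUBECTL` override variable are hypothetical. It relies on `kubectl annotate` removing a key when the key is given with a trailing `-`.

```bash
#!/usr/bin/env bash
# Sketch only. KUBECTL is a hypothetical override for testing without a
# cluster; it defaults to the real kubectl CLI.

# strip_last_applied NAMESPACE SECRET_NAME: drop the last-applied annotation
# that `kubectl apply` writes onto the Secret.
strip_last_applied() {
  local namespace="$1" secret_name="$2"
  "${KUBECTL:-kubectl}" annotate secret "$secret_name" \
    --namespace "$namespace" \
    kubectl.kubernetes.io/last-applied-configuration-
}
```

For the Secret in this README, the call would be `strip_last_applied serge serge-secrets`.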
6 changes: 6 additions & 0 deletions deploy/helm/Chart.yaml
@@ -0,0 +1,6 @@
apiVersion: v2
name: serge
version: 0.1.0
type: application
description: serge — GitHub PR reviewer web app (reviewbot-web)
icon: https://huggingface.co/front/assets/huggingface_logo-noborder.svg
77 changes: 77 additions & 0 deletions deploy/helm/env/prod.yaml
@@ -0,0 +1,77 @@
# Values for the serge deployment on the open-source EKS cluster
# (opensource-aws-use1-prod-54), namespace chosen at install time.
# Deploy: deploy/scripts/deploy.sh -n serge -f deploy/helm/env/prod.yaml
#
# Image is public on GHCR (ghcr.io/huggingface/serge) so no imagePullSecret
# is needed. The ACM cert for serge.huggingface.tech already exists in this
# account (infra 05-eks-utils) and is auto-discovered by the ALB controller
# via the ingress host; external-dns creates the DNS record from the ingress.

image:
  # Bump to a newer sha-<commit> (pushed by CI on merge to main) or use latest.
  tag: sha-33be7c5

replicas: 1

# Sensitive env: create this Secret in the target namespace beforehand
# (GITHUB_APP_ID, GITHUB_PRIVATE_KEY, GITHUB_WEBHOOK_SECRET,
# GITHUB_OAUTH_CLIENT_ID, GITHUB_OAUTH_CLIENT_SECRET, WEB_SESSION_SECRET,
# LLM_API_KEY, ...).
existingSecret: serge-secrets

persistence:
  enabled: true
  size: 5Gi
  # "" = cluster default StorageClass (EBS gp3 on this cluster).
  storageClass: ""

ingress:
  enabled: true
  host: serge.huggingface.tech
  path: /
  className: "alb"
  annotations:
    # Internal ALB: reachable from the internal network / VPN only, NOT the
    # public internet. (Note: inbound GitHub App webhooks cannot reach an
    # internal ALB; fine for the OAuth web flow, which redirects in the
    # user's browser.)
    alb.ingress.kubernetes.io/scheme: "internal"
    alb.ingress.kubernetes.io/target-type: "ip"
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS": 443}]'
    alb.ingress.kubernetes.io/ssl-redirect: "443"
    alb.ingress.kubernetes.io/healthcheck-path: "/healthz"
    alb.ingress.kubernetes.io/tags: "Env=prod,Project=serge"

# Non-secret runtime config.
envVars:
  PORT: "8080"
  LOG_LEVEL: "INFO"
  LLM_API_BASE: "https://router.huggingface.co/v1"
  LLM_MODEL: "moonshotai/Kimi-K2.6"
  LLM_STREAM: "1"
  LLM_BILL_TO: "huggingface"
  # Kimi can spend a substantial part of the completion budget on reasoning.
  # Keep enough room for long tool-heavy reviews and the final JSON review.
  LLM_MAX_TOKENS: "49152"
  TOOL_MAX_ITERATIONS: "30"
  TASK_LLM_MAX_TOKENS: "49152"
  TASK_TOOL_MAX_ITERATIONS: "30"
  MENTION_TRIGGER: "@askserge"
  REVIEW_EVENT: "COMMENT"
  TASK_API_ENABLED: "1"
  # Refuse to run PR-author subprocesses unless bubblewrap is available.
  # Requires the node to allow unprivileged user namespaces; if pods fail
  # to run reviews, switch to "auto".
  HELPER_SANDBOX: "require"
  DEV_NO_AUTH: "0"
  WEB_ALLOWED_ORG: "huggingface"
  GITHUB_OAUTH_CALLBACK_URL: "https://serge.huggingface.tech/auth/callback"
  WEB_GITHUB_APP_URL: "https://github.com/apps/sergereview"

resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: "1"
    memory: 1Gi
35 changes: 35 additions & 0 deletions deploy/helm/serge-secrets.example.yaml
@@ -0,0 +1,35 @@
# serge-secrets — sensitive env injected into the pod via envFrom (the chart
# references this by name through `existingSecret: serge-secrets`). The keys
# MUST be the exact env var names serge reads.
#
# Usage: copy to serge-secrets.yaml (git-ignored), fill in the values, then:
# deploy/scripts/deploy.sh -n <namespace> --secret-file deploy/helm/serge-secrets.yaml
# Do NOT commit the filled-in file.
apiVersion: v1
kind: Secret
metadata:
  name: serge-secrets
  # namespace: <namespace> # or pass -n <namespace> to kubectl apply
type: Opaque
stringData:
  # --- GitHub App (required: serge publishes reviews via the App) ---
  GITHUB_APP_ID: "123456"
  # Inline PEM. Paste the full key, indented under the block scalar.
  GITHUB_PRIVATE_KEY: |
    -----BEGIN RSA PRIVATE KEY-----
    REPLACE_WITH_THE_GITHUB_APP_PRIVATE_KEY
    -----END RSA PRIVATE KEY-----

  # --- Web auth (required when DEV_NO_AUTH=0, which prod uses) ---
  GITHUB_OAUTH_CLIENT_ID: "Iv1.xxxxxxxxxxxx"
  GITHUB_OAUTH_CLIENT_SECRET: "replace-me"
  # Strong random secret: openssl rand -hex 32
  WEB_SESSION_SECRET: "replace-me"

  # --- LLM (optional in web mode: per-repo keys can live in the DB; set a
  # default here if you want one) ---
  LLM_API_KEY: "hf_or_sk-..."

  # --- Optional: only needed if you wire up inbound GitHub App webhooks.
  # An internal ALB can't receive them, so this is usually unused here. ---
  # GITHUB_WEBHOOK_SECRET: "replace-me"
14 changes: 14 additions & 0 deletions deploy/helm/templates/_helpers.tpl
@@ -0,0 +1,14 @@
{{- define "name" -}}
{{- default $.Release.Name | trunc 63 | trimSuffix "-" -}}

`default` is called with only one argument here, so it has no effect: the expression always evaluates to the release name. The typical pattern is `default .Chart.Name .Release.Name | trunc 63 | trimSuffix "-"`, which falls back to the chart name when the release name is empty.

Suggested change
- {{- default $.Release.Name | trunc 63 | trimSuffix "-" -}}
+ {{- default .Chart.Name .Release.Name | trunc 63 | trimSuffix "-" -}}
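
To make the intended behavior of the suggested pipeline concrete, here is a rough bash emulation of its three steps. It is an illustration only, not Helm or Sprig code, and the function name `chart_name` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: emulates
#   default .Chart.Name .Release.Name | trunc 63 | trimSuffix "-"

# chart_name RELEASE CHART: print the resource name the pipeline would yield.
chart_name() {
  local release="$1" chart="$2"
  local name="${release:-$chart}"   # default: use the chart name if the release name is empty
  name="${name:0:63}"               # trunc 63: keep at most 63 characters
  printf '%s\n' "${name%-}"         # trimSuffix "-": drop one trailing dash, if present
}
```

The truncation keeps the name within the 63-character limit Kubernetes places on label values and many resource names, and the trim avoids a name that ends in `-` after truncation.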

{{- end -}}

{{- define "app.name" -}}
serge
{{- end -}}

{{- define "labels.standard" -}}
release: {{ $.Release.Name | quote }}
heritage: {{ $.Release.Service | quote }}
chart: "{{ include "name" . }}"
app: "{{ include "app.name" . }}"
{{- end -}}
10 changes: 10 additions & 0 deletions deploy/helm/templates/config.yaml
@@ -0,0 +1,10 @@
apiVersion: v1
kind: ConfigMap
metadata:
  labels: {{ include "labels.standard" . | nindent 4 }}
  name: {{ include "name" . }}
  namespace: {{ .Release.Namespace }}
data:
  {{- range $key, $value := $.Values.envVars }}
  {{ $key }}: {{ $value | quote }}
  {{- end }}