Support running crossplane-diff from a GitHub Actions container job #254
Description
What problem are you facing?
We're running crossplane-diff xr inside a GitHub Actions container job on a self-hosted runner. We use a container job because we have a custom image with pre-installed tooling (kustomize, flux, kubectl, etc.).
The Docker socket is mounted into the job container via --volume /var/run/docker.sock:/var/run/docker.sock and is confirmed working: we can ping the daemon and query the Docker API from inside the container. However, every composition fails when crossplane-diff tries to run function runtimes.
We see two categories of errors:
1. Socket timeout (most compositions):
cannot start Function "environment-configs": cannot create Docker container:
Get "http://%2Fvar%2Frun%2Fdocker.sock/_ping": context deadline exceeded
2. gRPC connection refused (when a function container does start):
rpc error: code = DeadlineExceeded desc = latest balancer error:
connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:32768: connect: connection refused"
Root cause analysis:
We investigated and believe the core issue is Docker network isolation. GitHub Actions places the job container on a managed bridge network (github_network_<id>, e.g., 172.18.0.0/16). When crossplane-diff creates function containers via the Docker socket, those containers are placed on the default bridge network (172.17.0.0/16) because no NetworkingConfig is passed to the Docker ContainerCreate API call.
This causes two problems:
- The function containers bind their gRPC port to 127.0.0.1 on the host, which is unreachable from the job container's network namespace.
- 127.0.0.1 inside the job container refers to its own loopback, not the Docker host.
We confirmed this by inspecting the job container's network settings:
NetworkMode: github_network_da88fbbc52344b8f91abb107cf23435a
Container IP: 172.18.0.2/16
Gateway: 172.18.0.1
We also confirmed that --network host on the container options resolves the connectivity issue in theory, but GitHub Actions rejects it because it conflicts with the managed github_network_* that GHA creates for the job.
Notably, crossplane render (the Crossplane CLI) works in the exact same container setup for our composition schema validation workflow, so we know the Docker socket mount and daemon are healthy.
How could crossplane-diff help solve this problem?
We traced the issue to the upstream RuntimeDocker.createContainer() in crossplane/crossplane (cmd/crank/render/runtime_docker.go), which passes nil for networkingConfig:
rsp, err := cli.ContainerCreate(ctx, cfg, hcfg, nil, nil, r.Name)
A potential solution would be:
- Upstream (crossplane/crossplane): Add a render.crossplane.io/runtime-docker-network annotation that allows specifying which Docker network function containers should join. When set, use the container's IP on that network as the gRPC target instead of the host port binding.
- In crossplane-diff: Read an environment variable (e.g., CROSSPLANE_DIFF_DOCKER_NETWORK) and set the annotation on functions before passing them to render.Render(), similar to how the runtime-docker-name and runtime-docker-cleanup annotations are already set in CachedFunctionProvider.
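The two proposed steps could look roughly like the sketch below. It is only an illustration under assumptions: the annotation and the environment variable are the names proposed in this issue and do not exist today, withDockerNetwork and grpcTarget are hypothetical helpers, and the container IP and ports in main are placeholder values. The real change would also need to pass a networking config to ContainerCreate, which is left out here to keep the example free of the Docker SDK.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

const (
	// Both names are proposals from this issue, not existing APIs.
	envDockerNetwork        = "CROSSPLANE_DIFF_DOCKER_NETWORK"
	annotationDockerNetwork = "render.crossplane.io/runtime-docker-network"
)

// withDockerNetwork returns a copy of annotations with the proposed network
// annotation added when network is non-empty. The input map is not modified.
func withDockerNetwork(annotations map[string]string, network string) map[string]string {
	out := make(map[string]string, len(annotations)+1)
	for k, v := range annotations {
		out[k] = v
	}
	if network != "" {
		out[annotationDockerNetwork] = network
	}
	return out
}

// grpcTarget picks the address to dial for a function container. With a
// network and a container IP it uses the container's address on that network;
// otherwise it falls back to the host port binding on 127.0.0.1.
func grpcTarget(network, containerIP, containerPort, hostPort string) string {
	if network != "" && containerIP != "" {
		return net.JoinHostPort(containerIP, containerPort)
	}
	return net.JoinHostPort("127.0.0.1", hostPort)
}

func main() {
	network := os.Getenv(envDockerNetwork)

	// Placeholder annotation standing in for the ones already set on a function.
	annotations := withDockerNetwork(map[string]string{"example.org/existing": "value"}, network)
	fmt.Println(annotations)

	// Placeholder IP and ports, assumed for illustration only.
	fmt.Println(grpcTarget(network, "172.18.0.3", "9443", "32768"))
}
```

Keeping the fallback to the host port binding means behavior is unchanged whenever the variable is unset.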
This would allow GHA workflows to pass the network:
env:
  CROSSPLANE_DIFF_DOCKER_NETWORK: ${{ job.container.network }}
Questions:
- Is running crossplane-diff from a GHA container job a supported/expected use case?
- Would a PR adding Docker network configuration support be welcome, or is there a preferred alternative approach?
- Should the upstream change (render.crossplane.io/runtime-docker-network annotation) be proposed on crossplane/crossplane first?