🐧️ I have greatly been enjoying my time using Flux!
This workflow includes:
- A Flux container defined in the Dockerfile with a rabbit client.
- A rabbit service defined in the docker-compose.yml to get it running alongside manually specified nodes.
This example differs from the replicas one, which uses deploy->replicas instead of a manually derived listing. We list the nodes manually because with the replicas approach we cannot give the containers reliable host names. With this approach we can!
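To make the difference concrete, here is a minimal Python sketch (not part of this repository) that generates an explicit listing in which every node gets a fixed hostname. The helper name `node_services`, the `build` value, and the exact keys are assumptions for illustration; the real docker-compose.yml likely defines more fields per node.

```python
import json


def node_services(count, prefix="node"):
    """Return one compose-style service entry per node, each with a fixed hostname."""
    services = {}
    for index in range(1, count + 1):
        name = f"{prefix}-{index}"
        services[name] = {
            "build": ".",  # assumption: all nodes share one Dockerfile
            "hostname": name,  # the stable name that deploy->replicas cannot give us
            "container_name": name,
        }
    return services


# Print the listing as JSON (which YAML parsers also accept) for inspection
print(json.dumps({"services": node_services(4)}, indent=2))
```

Calling `node_services(4)` yields entries for node-1 through node-4, which matches the node list that Flux reports later in this walkthrough.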
The rabbit service is provided only to show that you can run a service alongside your cluster. To use it, we install pika and provide a scripts/job.py to submit to Flux that mostly does silly things.
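For readers who have not used pika, the following is a rough sketch of what a script like job.py could look like. It is not the actual scripts/job.py: the host name `rabbit`, the `guest` credentials, the queue name `hello`, and the `RABBIT_*` environment variables are all placeholders assumed for illustration (the real script hard codes its own dummy credentials).

```python
import os


def rabbit_settings(env=None):
    """Collect connection settings, falling back to placeholder defaults."""
    env = os.environ if env is None else env
    return {
        "host": env.get("RABBIT_HOST", "rabbit"),  # assumption: compose service name
        "port": int(env.get("RABBIT_PORT", "5672")),  # AMQP port published by the rabbit service
        "user": env.get("RABBIT_USER", "guest"),  # placeholder credentials
        "password": env.get("RABBIT_PASSWORD", "guest"),
        "queue": env.get("RABBIT_QUEUE", "hello"),  # hypothetical queue name
    }


def send(message="Hello World!", settings=None):
    """Publish one message to the queue and close the connection."""
    import pika  # imported here so rabbit_settings() works without pika installed

    settings = settings or rabbit_settings()
    credentials = pika.PlainCredentials(settings["user"], settings["password"])
    parameters = pika.ConnectionParameters(
        host=settings["host"], port=settings["port"], credentials=credentials
    )
    connection = pika.BlockingConnection(parameters)
    try:
        channel = connection.channel()
        channel.queue_declare(queue=settings["queue"])
        channel.basic_publish(exchange="", routing_key=settings["queue"], body=message)
        print(f"Sent {message!r}")
    finally:
        connection.close()
```

Inside a node container you would call `send()` at the end of the script. Reading the settings from the environment is a design choice of this sketch, so the same file can run unchanged if the service name or credentials differ.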
You can first build the images (used for workers and broker):
docker compose build
Note that this will trigger three builds, so be careful! If you want to build just once, you can build the replicas example first, which will derive the same layers (and they will then be reused).
Then bring them up! You'll see the rabbit image pull if you don't have it already. Since we are starting a cluster, it's recommended to start in detached mode:
docker compose up -d
You can then see containers running:
docker compose ps
NAME           COMMAND                  SERVICE   STATUS    PORTS
basic-node-1   "/bin/sh -c '/bin/ba…"   node      running
basic-node-2   "/bin/sh -c '/bin/ba…"   node      running
basic-node-3   "/bin/sh -c '/bin/ba…"   node      running
basic-node-4   "/bin/sh -c '/bin/ba…"   node      running
rabbitmq       "docker-entrypoint.s…"   rabbit    running   4369/tcp, 5671/tcp, 0.0.0.0:5672->5672/tcp, 15671/tcp, 15691-15692/tcp, 25672/tcp, 0.0.0.0:15672->15672/tcp
Since we have defined the containers separately, to look at logs we can target any individual one:
docker compose logs node-1
docker compose logs node-2
docker compose logs node-3
docker compose logs node-4
Add an -f to keep the log open.
Now you can shell in to interact with your cluster (shelling into the main broker below):
docker exec -it node-1 bash
And flux should be up and running - you can submit jobs, etc.
fluxuser@a414059fd5a8:~$ flux resource list
STATE       NNODES   NCORES   NGPUS   NODELIST
free        4        4        0       node-[1-4]
allocated   0        0        0
down        0        0        0
$ flux overlay status
0 basic-node-1: full
├─ 1 node-2: full
├─ 2 node-3: full
└─ 3 node-4: full
At this point, let's try communicating with rabbit. The dummy credentials are hard coded in our example script (I know, I'm a terrible person):
python3 job.py
👋️ Sent 'Hello World!'
Now try running with Flux:
flux run python3 job.py
flux submit python3 job.py
ƒVL9T1RZ
And get the logs:
flux job attach $(flux job last)
👋️ Sent 'Hello World!'
And that should be enough to get you started with your (much cooler) workflows. Have fun!
Make sure to stop and remove containers!
docker compose stop
docker compose rm
You might want to use the setup here as an example of how to configure a cluster. While some of it is OK, this is an overly simplified version, and you should generally consult the admin guide in the Flux docs. However, there are some tweaks we make here just for docker-compose that you should know about!
- We need to tell the broker.toml to load the noverify plugin under the resource directive. The reason we have to do this is because docker compose provides hostnames as the container identifiers, and Flux checks this against the resources defined. By adding noverify we skip this check.
- We derive and export FLUX_FAKE_HOSTNAME to coincide with the name provided by docker so the hosts register.
This is a fairly new setup, so please let us know if you run into issues.
You generally will want to install your software of choice into the Dockerfile, add any additional services needed, and then shell inside to interact with Flux and test your scripts! If you need to add volumes, that can work too.
To change the number of workers: because some assets that require sudo are generated at build time, the number of workers has to be set before you build. You can change it in the header of the docker-compose.yml:
# Shared number of replicas (workers) for build and runtime
# This does not include the broker
x-shared-workers: &workers
  replicas: 3
Note that we used "deploy -> replicas" to scale the workers. This means we currently don't have nice control of the hostname, so it defaults to "directory-container", which here is basic-node-<number>. docker-compose is a monster and starts counting at 1, so generally basic-node-1 is the broker, and basic-node-N refers to a worker.
Docker compose derives the hostname from the directory, so if you move the directory you'll need to tweak a few places:
- The flux R encode command in the Dockerfile
- The reference to hosts in the broker.toml
- The reference for the mainHost (that populates the entrypoint) in the docker-compose.yml
It also doesn't hurt to do a grep for "basic" if you think you missed one!
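As a quick sanity check before editing those files, this small Python sketch prints the names you should expect after moving the directory. It assumes the "directory-service-number" pattern described above, a service called `node`, and four containers in total; the helper `expected_hostnames` is hypothetical, and its name normalization is a simplification of what compose actually does.

```python
import re


def expected_hostnames(directory, service="node", count=4):
    """Return the container names compose would derive for a scaled service."""
    # Simplified normalization (assumption): lowercase, keep letters, digits, "-" and "_"
    project = re.sub(r"[^a-z0-9_-]", "", directory.lower())
    return [f"{project}-{service}-{index}" for index in range(1, count + 1)]


hosts = expected_hostnames("basic")
print("broker: ", hosts[0])  # counting starts at 1, so the first name is the broker
print("workers:", ", ".join(hosts[1:]))
```

Replace `"basic"` with the new directory name, then make sure the Dockerfile, broker.toml, and docker-compose.yml all use the names it prints.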
The rabbit logs are currently written inside the container, since we don't want to surprise users by writing to their filesystem. Note that you can add volumes if you want to save them locally, and it's recommended to write to the local directory and not a system volume.