This project demonstrates how to use the kafka-slurm-agent distributed streaming engine.
In this example, the engine runs the knot detection software Topoly, which uses the HOMFLY polynomial to detect knots in protein structures downloaded from the AlphaFold database.
Use the standard pip tool to install the dependencies. The recommended way is to use a Python virtual environment:
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Please adjust the config file.
- Modify the configuration of the connection to Apache Kafka. The default configuration assumes that Kafka is running on localhost on the default port (9092) and does not use authentication or SSL.
The comments in the config file contain the parameters needed to connect to Kafka configured with SASL and a plaintext password. If you use this type of connection, also uncomment the line that starts with:
# KAFKA_FAUST_BROKER_CREDENTIALS
- Make sure that PREFIX points to the location of your project.
- Change the names of the topics used for your project to avoid conflicts with other projects sharing the same Kafka instance.
- If you want to use a SLURM cluster, please change the job name suffix (CLUSTER_JOB_NAME_SUFFIX = '_KSA') to avoid conflicts with other projects running on your SLURM cluster. Jobs managed by the cluster-agent are named "JOBID_SUFFIX", where JOBID is the identifier that you assign when submitting a job and SUFFIX is the value of this configuration parameter.
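Before starting the agents, it can help to confirm that the broker address from the config file is reachable. The sketch below is not part of kafka-slurm-agent; the function name `kafka_port_open` is hypothetical, and the default host and port are the localhost:9092 defaults described above. It only checks that a TCP connection can be opened, so it does not validate SASL credentials or topic names.

```python
import socket


def kafka_port_open(host: str = "localhost", port: int = 9092, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: it checks network reachability only, not Kafka
    authentication or protocol-level health.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    status = "reachable" if kafka_port_open() else "not reachable"
    print(f"Kafka broker at localhost:9092 is {status}")
```

If your broker runs elsewhere, pass the same host and port that you set in the config file.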
- Open a new terminal and start the worker-agent (./start_worker_agent).
- Open a new terminal and start the monitor-agent (./start_monitor_agent).
- Submit new jobs using submitter.py.
- While the submitted jobs are running, you can monitor their statuses at http://localhost:6067/mon/stats/ on the host on which you started the monitor-agent.
- The output should be visible on the console of the monitor-agent.
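The statistics page mentioned above can also be read from a script rather than a browser. The following is a minimal sketch using only the Python standard library; the helper name `fetch_stats` is hypothetical, and because the response format of the endpoint is not documented here, the body is returned as raw text without parsing.

```python
import urllib.request

# URL taken from the monitoring step above; adjust the host if the
# monitor-agent runs on a different machine.
STATS_URL = "http://localhost:6067/mon/stats/"


def fetch_stats(url: str = STATS_URL, timeout: float = 5.0):
    """Return the response body as text, or None if the request fails.

    Hypothetical helper: the body is returned unparsed because its
    format is an unknown here.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except OSError:
        # urllib.error.URLError and HTTPError are OSError subclasses, so this
        # also covers refused connections, timeouts, and HTTP error statuses.
        return None


if __name__ == "__main__":
    body = fetch_stats()
    if body is None:
        print("monitor-agent is not reachable at", STATS_URL)
    else:
        print(body)
```

Run it on the host where the monitor-agent was started, or change `STATS_URL` to point at that host.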