RyaxTech/ansible-hpc-cluster

Intro

Commands are shown as issued from the ansible working directory, so run cd ansible before starting. Global configuration lives in ansible.cfg. To start, edit inventory/hosts.yaml to reflect your infrastructure. The user listed in the inventory must have passwordless sudo access for most of the commands to work.
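Before running any playbook, it can help to confirm that the current directory is the right one. The following is a minimal sketch: preflight is a hypothetical helper name, not part of the repository, and it only checks for the two files named above.

```shell
# Hypothetical helper: confirm that the files the intro mentions are present.
# Takes the directory to check as its first argument (defaults to ".").
preflight() {
    dir="${1:-.}"
    for f in ansible.cfg inventory/hosts.yaml; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f" >&2
            return 1
        fi
    done
    echo "ok"
}
```

After cd ansible, calling preflight should print ok. From there, ansible all -m ping is a common way to check that every host is reachable, assuming ansible.cfg points at inventory/hosts.yaml.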

Configuration

  • The configuration expects Ubuntu as the operating system and an existing ubuntu user
  • create an SSH key pair for the ubuntu user to launch the playbooks (the user needs sudo rights)
  • insert your public keys into playbooks/template/authorized_keys
  • change the IPs and hostnames in inventory/hosts.yaml to reflect your cluster
  • change the Slurm node names and hardware characteristics in playbooks/templates/slurm.conf.j2 to reflect your cluster
  • you may change the Slurm version in playbooks/slurm.yaml
  • slurmdbd is configured but not started automatically
  • Follow the detailed instructions below for the playbooks you need to install. The first three playbooks (NFS, user creation, Slurm) are mandatory for Slurm to function correctly. The last two (Singularity, OpenMPI) are optional.
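The mandatory playbooks can be chained so that a failure stops the sequence. This is only a sketch: install_mandatory and the ANSIBLE_PLAYBOOK variable are hypothetical names introduced here, and the order simply follows the sections of this README.

```shell
# Command used to run playbooks; overridable (for example with "echo" for a dry run).
ANSIBLE_PLAYBOOK="${ANSIBLE_PLAYBOOK:-ansible-playbook}"

# Hypothetical wrapper: run the three mandatory playbooks in README order,
# stopping at the first failure. Arguments: username, userid.
install_mandatory() {
    if [ "$#" -ne 2 ]; then
        echo "usage: install_mandatory USERNAME USERID" >&2
        return 2
    fi
    "$ANSIBLE_PLAYBOOK" ./playbooks/nfs.yaml || return 1
    "$ANSIBLE_PLAYBOOK" --extra-vars "username=$1 userid=$2" ./playbooks/add-user.yaml || return 1
    "$ANSIBLE_PLAYBOOK" ./playbooks/slurm.yaml || return 1
}
```

Setting ANSIBLE_PLAYBOOK=echo before calling install_mandatory ryax 1044 prints the three command lines instead of running them, which is a cheap way to review what would be executed.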

Install NFS

Run the NFS playbook:

ansible-playbook ./playbooks/nfs.yaml

Create new user

Create a new user. The playbook creates username with the same userid across all nfsclient nodes, then creates the user's home directory only on the NFS server.

ansible-playbook --extra-vars "username=ryax userid=1044" ./playbooks/add-user.yaml
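Because the same userid is applied on every node, it may be worth checking that the number is not already taken before running the playbook. A minimal sketch follows; uid_in_use is a hypothetical helper, and it only inspects a passwd-format file on the machine where it runs, so each node would need to be checked separately.

```shell
# Hypothetical check: succeed (exit 0) if the numeric uid already appears in a
# passwd-format file. The file defaults to /etc/passwd on the local machine.
uid_in_use() {
    awk -F: -v uid="$1" '$3 == uid { found = 1 } END { exit !found }' "${2:-/etc/passwd}"
}
```

For example, uid_in_use 1044 && echo "1044 is already taken on this host" flags a conflict before the playbook is run.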

Install slurm

ansible-playbook ./playbooks/slurm.yaml

Finalize slurmdbd configuration

Add the slurm user to the database and grant it all permissions on the Slurm accounting database (slurm_acct_db) using the following commands.

sudo mysql
MariaDB [(none)]> CREATE USER 'slurm'@'localhost';
MariaDB [(none)]> GRANT ALL ON slurm_acct_db.* TO 'slurm'@'localhost';
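The same two statements can be sent without opening the interactive prompt. The sketch below makes a few assumptions: create_slurm_db_user and MYSQL_CMD are hypothetical names, and IF NOT EXISTS is an addition (not in the commands above) so that re-running the function does not fail when the user already exists.

```shell
# Client command; assumed to be "sudo mysql" as in the README, but overridable.
MYSQL_CMD="${MYSQL_CMD:-sudo mysql}"

# Hypothetical helper: feed the two SQL statements to the client on stdin.
create_slurm_db_user() {
    # $MYSQL_CMD is left unquoted on purpose so "sudo mysql" splits into two words.
    $MYSQL_CMD <<'SQL'
CREATE USER IF NOT EXISTS 'slurm'@'localhost';
GRANT ALL ON slurm_acct_db.* TO 'slurm'@'localhost';
SQL
}
```

Running it with MYSQL_CMD=cat prints the statements instead of executing them.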

Uncomment the following lines in your slurm.conf file.

AccountingStorageHost=localhost
AccountingStoragePort=6819
AccountingStorageType=accounting_storage/slurmdbd

Copy the slurm.conf file to all compute nodes and restart slurmd on every node. Then start slurmdbd and restart slurmctld.
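Uncommenting the three lines can be scripted. This is a sketch under assumptions: enable_accounting is a hypothetical helper, and the location of slurm.conf is not stated here, so the path is passed in as an argument.

```shell
# Hypothetical helper: uncomment the three AccountingStorage* lines in the
# given slurm.conf, keeping a backup copy next to it with a .bak suffix.
enable_accounting() {
    sed -i.bak -E 's/^#[[:space:]]*(AccountingStorage(Host|Port|Type)=)/\1/' "$1"
}
```

Other commented lines are left untouched. If the services are managed by systemd under their usual unit names (an assumption, since the unit names are not given here), the restart step would look like sudo systemctl restart slurmd on each node, then sudo systemctl start slurmdbd followed by sudo systemctl restart slurmctld on the controller.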

Install singularity

ansible-playbook ./playbooks/singularity.yaml

Install openmpi

ansible-playbook ./playbooks/openmpi.yaml
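Once the playbooks have run, a quick check on a node can confirm that the expected commands are available. The helper below is a hypothetical sketch; which binaries the playbooks actually put on PATH is an assumption, so pass whatever names apply to your setup.

```shell
# Hypothetical helper: report whether each named command is on PATH.
# Returns non-zero if any of them is missing.
check_tools() {
    rc=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            rc=1
        fi
    done
    return "$rc"
}
```

For instance, check_tools sinfo singularity mpirun would list which of those three commands are reachable on the node where it runs.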
