Open
Labels
good first issue, help wanted
Description
Projects it would be nice to have integrations with:
- MPI operator https://github.com/kubeflow/mpi-operator
- Slurm on Kubernetes https://github.com/slinkyproject
  - Expose NUMA affinity mask
  - Expose fabric topology (https://slurm.schedmd.com/topology.html)?
- KubeRay https://github.com/ray-project/kuberay
  - Reference: "[Core] InfiniBand and RDMA support for Ray object store" (ray-project/ray#30094)
  - A "Ray Job YAML Deployment Example" that uses only a few GPUs and NICs and runs in parallel
  - KubeRay #162
- Jobset https://github.com/kubernetes-sigs/jobset
- Kueue https://github.com/kubernetes-sigs/kueue
- GKE https://cloud.google.com/ai-hypercomputer/docs/create/gke-ai-hypercompute-custom
- Flux operator https://github.com/flux-framework/flux-operator @vsoch
- XPK: "[Feature] Integrate DraNet" (AI-Hypercomputer/xpk#482)
- NVIDIA GPU driver https://github.com/NVIDIA/k8s-dra-driver-gpu
- vLLM https://github.com/vllm-project/vllm
- PyTorch example https://docs.pytorch.org/docs/stable/distributed.html
- GKE TPUs Trillium
- Kubeflow