@Eta0 (Collaborator) commented Apr 8, 2024

Build NCCL from source

This change switches the installation of libnccl2 and libnccl-dev from apt-get packages to a source build.
The prebuilt NCCL packages available through apt-get are not updated consistently across all OS version × CUDA version pairs, so building from source lets all of our images pick up new NCCL releases as they come out.

This change also updates NCCL to v2.21.5 on all of our current build targets.

By default, this build supports the following (configurable) list of compute architectures: 7.0, 7.5, 8.0, 8.6, 8.9, 9.0+PTX (matching our ml-containers PyTorch builds). This list is slightly shorter than the default architecture support list, so the resulting binaries are also a bit smaller than the prebuilt distribution.
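
For illustration only, here is a minimal sketch of how an architecture list in this format could be translated into the `-gencode` flags that NCCL's Makefile accepts through its `NVCC_GENCODE` variable. The helper name `gencode_flags` is hypothetical and is not taken from this PR; the actual build script may derive its flags differently. The sketch assumes that a `+PTX` suffix means "also embed PTX for that architecture".

```python
def gencode_flags(arch_list: str) -> str:
    """Convert a list such as "7.0, 8.0, 9.0+PTX" into nvcc -gencode flags.

    Hypothetical helper for illustration; not part of this PR.
    """
    flags = []
    # Accept both comma-separated and space-separated lists.
    for entry in arch_list.replace(",", " ").split():
        want_ptx = entry.endswith("+PTX")
        version = entry[: -len("+PTX")] if want_ptx else entry
        major, minor = version.split(".")
        cc = f"{major}{minor}"  # e.g. "8.6" -> "86"
        # Native machine code (SASS) for this architecture.
        flags.append(f"-gencode=arch=compute_{cc},code=sm_{cc}")
        if want_ptx:
            # Assumption: +PTX additionally embeds PTX, which newer GPUs
            # can JIT-compile at load time.
            flags.append(f"-gencode=arch=compute_{cc},code=compute_{cc}")
    return " ".join(flags)


if __name__ == "__main__":
    print(gencode_flags("7.0, 7.5, 8.0, 8.6, 8.9, 9.0+PTX"))
```

The resulting string could then be passed to an NCCL source build along the lines of `make -j src.build NVCC_GENCODE="<flags>"`.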

@Eta0 added the enhancement (New feature or request) label Apr 8, 2024
@Eta0 requested a review from salanki April 8, 2024 18:52
@Eta0 self-assigned this Apr 8, 2024
3 participants