@Andy-Jost
Contributor

Summary

CUDA does not support the fork() system call. Forked subprocesses exhibit undefined behavior, including failure to initialize CUDA contexts and devices. This PR adds warnings when IPC objects are serialized with the fork multiprocessing start method.

Changes

  • Added _check_multiprocessing_start_method() function in cuda_utils.pyx that checks if the multiprocessing start method is 'fork' and emits a one-time warning
  • Updated multiprocessing reduction functions to call the warning check:
    • _reduce_allocation_handle in _ipc.pyx
    • _deep_reduce_device_memory_resource in _ipc.pyx
    • _reduce_event in _event.pyx
  • The warning message explains that CUDA does not support fork, describes the resulting undefined behavior, and recommends the 'spawn' start method

Test Coverage

Added comprehensive test suite test_multiprocessing_warning.py with 5 tests:

  • Warning emitted for DeviceMemoryResource pickling with fork method
  • Warning emitted for IPCAllocationHandle pickling with fork method
  • Warning emitted for Event pickling with fork method
  • No warning when start method is 'spawn'
  • Warning emitted only once (one-time check)

Tests run in subprocesses to avoid interference from the session fixture in conftest.py, which sets the start method to 'spawn'.

Related Work

Fixes #1136

@Andy-Jost added this to the cuda.core beta 10 milestone on Dec 3, 2025
@Andy-Jost added the enhancement, P0, and cuda.core labels on Dec 3, 2025
@Andy-Jost
Contributor Author

/ok to test d1cce9a

@Andy-Jost
Contributor Author

/ok to test 3cd8f8a

@copy-pr-bot
Contributor

copy-pr-bot bot commented Dec 3, 2025

/ok to test 3cd8f8a

@Andy-Jost, there was an error processing your request: E2

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

@Andy-Jost Andy-Jost force-pushed the warn-fork-multiprocessing branch from d1cce9a to 3cd8f8a Compare December 3, 2025 21:30
@Andy-Jost
Contributor Author

/ok to test fcbb370

@Andy-Jost Andy-Jost force-pushed the warn-fork-multiprocessing branch from 3cd8f8a to fcbb370 Compare December 3, 2025 21:37
@Andy-Jost
Contributor Author

/ok to test 3d3499a

@Andy-Jost Andy-Jost force-pushed the warn-fork-multiprocessing branch from fcbb370 to 1144c8f Compare December 3, 2025 21:40
@copy-pr-bot
Contributor

copy-pr-bot bot commented Dec 3, 2025

/ok to test 3d3499a

@Andy-Jost, there was an error processing your request: E2

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

@Andy-Jost
Contributor Author

/ok to test 1144c8f

@Andy-Jost Andy-Jost force-pushed the warn-fork-multiprocessing branch from 1144c8f to 3d3499a Compare December 3, 2025 23:42
@Andy-Jost
Contributor Author

/ok to test 3d3499a

Comment on lines 297 to 299
global _fork_warning_emitted
if _fork_warning_emitted:
return
Member

Instead of caching it, would it be better to always call multiprocessing.get_start_method() and check it? I worry that it is a global state that we don't own and it could be changed at arbitrary point in time by the user or any package.

Contributor Author

@Andy-Jost Andy-Jost Dec 4, 2025

In the official multiprocessing library docs, the description of multiprocessing.set_start_method() explains that:

  • The start method is a global setting that can only be chosen a single time in a given Python process.
  • Calling set_start_method() again without forcing will raise a RuntimeError indicating that the context has already been set.
  • Recent CPython implementations mention an internal/undocumented way to override this with a force argument, but this is not part of the public, documented API and should be avoided in normal code.

Member

Thanks, Andy. Good to know Python by default checks this. However, I do see the force argument being documented, so it is part of the public API. And it does allow overwriting:

>>> import multiprocessing
>>> multiprocessing.set_start_method("spawn")
>>> multiprocessing.set_start_method("spawn")
Traceback (most recent call last):
  File "<python-input-2>", line 1, in <module>
    multiprocessing.set_start_method("spawn")
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^
  File "/local/home/leof/miniforge3/envs/py313_cu130/lib/python3.13/multiprocessing/context.py", line 247, in set_start_method
    raise RuntimeError('context has already been set')
RuntimeError: context has already been set
>>> multiprocessing.set_start_method("fork", force=True)
>>> multiprocessing.set_start_method("spawn", force=True)
>>> multiprocessing.set_start_method("forkserver", force=True)
>>> 

If this is not on the hot path, I think not caching the result and always checking is safer.

Alternatively, the warning message could mention that calling set_start_method(..., force=True) is dangerous, because we would no longer be able to catch the problem.
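
For comparison, the uncached alternative suggested here could be sketched as below. The helper name `warn_if_fork`, its return value, and the message text are hypothetical; the point is only that the start method is re-read on every call, so a later `set_start_method("fork", force=True)` would still be noticed.

```python
import multiprocessing
import warnings


def warn_if_fork():
    """Re-read the start method on every call; keep no module-level cache."""
    if multiprocessing.get_start_method() == "fork":
        # Whether repeated calls print again is left to the active warning filters.
        warnings.warn(
            "CUDA does not support fork(); use the 'spawn' start method.",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False
```

The trade-off is one `get_start_method()` call per serialization, which is likely negligible unless pickling sits on a hot path.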

Member

One more thing: if we have already warned, the next warn() call is a no-op, so adding the _fork_warning_emitted guard is redundant 😆

import warnings

def f():
    warnings.warn("oh no", UserWarning, stacklevel=3)

f()  # only this call would generate a warning 
f()
f()
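
The deduplication relied on here is keyed on the reporting location, not on the process. A small sketch (assuming the "default" filter action and `stacklevel=2`, both chosen for illustration) shows that a second call site produces a second warning, which is the case a process-wide flag is meant to cover:

```python
import warnings


def f():
    # stacklevel=2 attributes the warning to the line that called f().
    warnings.warn("oh no", UserWarning, stacklevel=2)


def count_warnings():
    """Count how many warnings are reported for calls from two different lines."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("default")  # report once per (message, category, line)
        for _ in range(3):
            f()  # same call site on every iteration
        f()  # a different call site
    return len(caught)
```

Calling `count_warnings()` records one warning for the loop and one more for the final call.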

Member

Ah I see, you want to limit the warning to only once per process lifetime, regardless of which object raises the warning from. NVM.

Contributor

One minor detail: since the start method can only be set once, I don't think we need to check it multiple times. In other words, we can set _fork_warning_emitted = True unconditionally, rather than only when a warning is emitted, so that we don't check again. Renaming the flag to _fork_warning_checked might also make this clearer. Unless I'm missing some case where the start method is set exceptionally late and does need to be checked multiple times.

Contributor Author

@Andy-Jost Andy-Jost Dec 4, 2025

Thanks for the comments. I've made adjustments to the name and conditionality as Mike suggested. I left this as a one-time check because there does not seem to be a strong consensus, but I don't feel strongly about that and wouldn't mind changing it if more discussion goes in that direction.

Incidentally, a separate warning is issued when fork is called in multithreaded programs (like those using CUDA), so if a user somehow sidesteps this warning by setting the start method multiple times, they will still get another nasty message indicating their program is invalid.

CUDA does not support the fork() system call. Forked subprocesses
exhibit undefined behavior, including failure to initialize CUDA
contexts and devices.

Add warning checks in multiprocessing reduction functions for IPC
objects (DeviceMemoryResource, IPCAllocationHandle, Event) that
warn when the start method is 'fork'. The warning is emitted once
per process when IPC objects are serialized.

Fixes NVIDIA#1136
@Andy-Jost
Contributor Author

/ok to test 3d3499a

@Andy-Jost Andy-Jost force-pushed the warn-fork-multiprocessing branch from 3d3499a to 9fd6b19 Compare December 4, 2025 00:32
@Andy-Jost
Contributor Author

/ok to test 9fd6b19

@Andy-Jost
Contributor Author

The previous test methodology of starting a subprocess did not work on the test runners. The latest upload avoids using a subprocess, instead mocking up key methods.

@Andy-Jost Andy-Jost self-assigned this Dec 4, 2025
Change mempool_device to ipc_device fixture for tests that require
IPC-enabled memory resources. The ipc_device fixture properly skips
on Windows where IPC is not supported.
@Andy-Jost
Contributor Author

/ok to test 476459f

…t_method

- Add reset_fork_warning() function for testing purposes
- Rename _check_multiprocessing_start_method to check_multiprocessing_start_method (remove leading underscore)
- Update all tests to use reset_fork_warning() instead of directly accessing internal flag
- Fix trailing whitespace
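
A rough sketch of how a reset hook and mocked start method can fit together in a test, without spawning a subprocess. The function bodies below are stand-ins written for illustration, not the code from the PR; only the names `check_multiprocessing_start_method`, `reset_fork_warning`, and `_fork_warning_checked` come from the discussion, and `collect_warnings` is a hypothetical helper.

```python
import multiprocessing
import warnings
from unittest import mock

_fork_warning_checked = False  # stand-in for the module-level flag


def check_multiprocessing_start_method():
    """Stand-in check: inspect the start method once per process."""
    global _fork_warning_checked
    if _fork_warning_checked:
        return
    _fork_warning_checked = True  # set unconditionally, warned or not
    if multiprocessing.get_start_method() == "fork":
        warnings.warn("fork start method detected", UserWarning, stacklevel=2)


def reset_fork_warning():
    """Testing hook: allow the check to run again."""
    global _fork_warning_checked
    _fork_warning_checked = False


def collect_warnings(start_method, calls=2):
    """Run the check `calls` times with get_start_method() mocked."""
    reset_fork_warning()
    with mock.patch.object(
        multiprocessing, "get_start_method", return_value=start_method
    ) as fake:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")
            for _ in range(calls):
                check_multiprocessing_start_method()
    # Number of warnings recorded, and how often the start method was queried.
    return len(caught), fake.call_count
```

Going through `reset_fork_warning()` keeps tests from reaching into the module's private flag, which matches the motivation given in the commit message.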
@Andy-Jost
Contributor Author

/ok to test fbdd56d

@Andy-Jost Andy-Jost enabled auto-merge (squash) December 4, 2025 14:47
@Andy-Jost
Contributor Author

/ok to test 3271964

@Andy-Jost Andy-Jost merged commit c860f3f into NVIDIA:main Dec 4, 2025
61 checks passed
@Andy-Jost Andy-Jost deleted the warn-fork-multiprocessing branch December 4, 2025 15:48
Development

Successfully merging this pull request may close these issues.

[FEA]: Warn when the multiprocessing start method is "fork"

3 participants