Fix #1143: [BUG] only one worker runs with workers > 1 due to CUDA fork #1152

Open
danielalanbates wants to merge 1 commit into fishaudio:main from danielalanbates:fix/issue-1143

@danielalanbates

Fixes #1143

Summary

This PR fixes: [BUG] only one worker runs with workers > 1 due to CUDA fork issues

Changes

tools/api_server.py | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

Testing

Please review the changes carefully. The fix was verified against the existing test suite.


This PR was created with the assistance of Claude Sonnet 4.6 by Anthropic | effort: low. Happy to make any adjustments!

When uvicorn spawns multiple worker processes it uses multiprocessing
(fork by default on Linux). Passing an already-constructed app instance
to uvicorn.run() means every worker inherits the parent process's state,
including any CUDA handles that were opened during import. A CUDA context
created in the parent cannot be reused after a fork(), so all but one
worker die immediately at startup.

The fix is to:
1. Expose `api` and `app` at module level so they are reachable via an
   import string.
2. Pass the import string "tools.api_server:app" to uvicorn.run() when
   workers > 1.  Each spawned worker then imports the module
   independently, creating a fresh CUDA context of its own.

Single-worker deployments continue to receive the app instance directly
(no behaviour change).
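
The selection logic described above can be sketched roughly as follows. This is an illustration only, not the actual diff to tools/api_server.py: the helper names `select_uvicorn_target` and `serve` are hypothetical, and only the import string "tools.api_server:app" and the `workers > 1` condition come from this commit message.

```python
from typing import Any, Union

# Import string quoted in the commit message; each worker resolves it itself.
APP_IMPORT_STRING = "tools.api_server:app"


def select_uvicorn_target(app: Any, workers: int) -> Union[str, Any]:
    """Pick what to pass to uvicorn.run() (hypothetical helper).

    Multi-worker: return the import string so every worker imports the
    module on its own. Single-worker: return the app instance unchanged.
    """
    if workers > 1:
        return APP_IMPORT_STRING
    return app


def serve(app: Any, host: str, port: int, workers: int) -> None:
    """Start the server (hypothetical wrapper around uvicorn.run)."""
    # Imported lazily so the selection logic can be exercised without uvicorn.
    import uvicorn

    uvicorn.run(
        select_uvicorn_target(app, workers),
        host=host,
        port=port,
        workers=workers,
    )
```

With this shape, `serve(app, "0.0.0.0", 8080, workers=4)` would hand uvicorn the import string, while `workers=1` keeps passing the instance directly; the host and port values here are placeholders.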

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@vladlearns

#1141 - the fix is here
