Contributor

@mcalman mcalman commented Oct 10, 2025

Description

  1. Switches to an alternative method for enabling AIU event profiling when profiling is enabled with VLLM_TORCH_PROFILER_DIR. This resolves the following error that occurs when trying to profile a multi-AIU workload: RuntimeError: Please register PrivateUse1HooksInterface by 'RegisterPrivateUse1HooksInterface' first.
  2. Adds support for the following profiling options that were added to vLLM: https://github.com/vllm-project/vllm/blob/30f78af147cb9eac0a5c643a69882a3b0e45f986/docs/contributing/profiling.md?plain=1#L10-L13
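
For readers unfamiliar with the options referenced in item 2, here is a minimal sketch of how such environment flags might be turned into keyword arguments for `torch.profiler.profile`. The four variable names are recalled from the vLLM profiling docs and should be checked against the linked revision. The helper `profiler_kwargs_from_env`, the "only `1` means on" parsing, and the off-by-default behaviour are hypothetical and are not taken from this PR.

```python
import os

# Assumed mapping from vLLM profiling env vars to torch.profiler.profile
# keyword arguments. Verify the variable names against the linked vLLM docs.
_FLAG_TO_KWARG = {
    "VLLM_TORCH_PROFILER_RECORD_SHAPES": "record_shapes",
    "VLLM_TORCH_PROFILER_WITH_PROFILE_MEMORY": "profile_memory",
    "VLLM_TORCH_PROFILER_WITH_STACK": "with_stack",
    "VLLM_TORCH_PROFILER_WITH_FLOPS": "with_flops",
}


def profiler_kwargs_from_env(env=None):
    """Hypothetical helper: build profiler kwargs from environment flags.

    A flag counts as enabled only when its value is exactly "1"; this
    parsing rule is an assumption made for the sketch.
    """
    env = os.environ if env is None else env
    return {
        kwarg: env.get(name, "0") == "1"
        for name, kwarg in _FLAG_TO_KWARG.items()
    }
```

The resulting dictionary could then be splatted into the profiler constructor, for example `torch.profiler.profile(**profiler_kwargs_from_env())`, alongside whatever activities and trace handler the worker already configures.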

Related Issues

n/a

👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: Make sure that your code passes all the linting checks, otherwise your PR won't be able to be merged. To do so, first install the linting requirements, then run format.sh and commit the changes. This can be done with uv directly:

uv sync --frozen --group lint --active --inexact

Or this can be done with pip:

uv pip compile --group lint > requirements-lint.txt
pip install -r requirements-lint.txt
bash format.sh

Now you are good to go 🚀

autopilot_opt = options.get(
    "autopilot", "1")  # autopilot defaults to 1 if not set

if envs_spyre.VLLM_SPYRE_AUTOPILOT_OFF:
Collaborator

What is the use case for setting VLLM_SPYRE_AUTOPILOT_OFF instead of removing autopilot from DT_OPT?

Contributor Author

This is provided for convenience and to make users aware that the option to turn autopilot off exists and is often needed for informative profiling. It is functionally equivalent to the user setting DT_OPT with the desired autopilot value themselves.

Collaborator

I see. I agree that the DT_OPT environment variable is not very convenient, but I think that creating these interactions between variables could make things more confusing for the user. What if, instead, a warning is printed whenever envs.VLLM_TORCH_PROFILER_DIR is set and the autopilot option in DT_OPT is not? That way the user would still be made aware.

Contributor Author

@maxdebayser thanks for the suggestion. I've made this change.
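
For illustration, the warning behaviour agreed on in this thread could look roughly like the sketch below. It is not the code merged in this PR. It assumes DT_OPT is a comma-separated list of `key=value` pairs, and the helpers `parse_dt_opt`, `autopilot_warning_needed`, and `warn_if_autopilot_unset` are hypothetical names introduced only for this example.

```python
import logging
import os

logger = logging.getLogger(__name__)


def parse_dt_opt(raw):
    """Parse a DT_OPT-style string into a dict.

    Assumes a comma-separated "key=value" format; entries without "="
    are ignored in this sketch.
    """
    options = {}
    for item in raw.split(","):
        key, sep, value = item.strip().partition("=")
        if key and sep:
            options[key] = value
    return options


def autopilot_warning_needed(env=None):
    """Return True when profiling is on but DT_OPT does not set autopilot."""
    env = os.environ if env is None else env
    if not env.get("VLLM_TORCH_PROFILER_DIR"):
        return False  # profiling is not enabled, nothing to warn about
    return "autopilot" not in parse_dt_opt(env.get("DT_OPT", ""))


def warn_if_autopilot_unset(env=None):
    """Log a warning instead of silently overriding the user's DT_OPT."""
    if autopilot_warning_needed(env):
        logger.warning(
            "VLLM_TORCH_PROFILER_DIR is set but DT_OPT does not specify "
            "autopilot; consider setting autopilot=0 in DT_OPT for more "
            "informative profiles.")
```

Keeping the check separate from the logging call makes the decision easy to test, and it leaves DT_OPT as the single place where the autopilot value is controlled, which is the point raised in the review.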
