Conversation

@mcalman (Contributor) commented May 21, 2025

This PR implements the required methods to support profiling vLLM with PyTorch Profiler.

Using PyTorch Profiler with vLLM

Offline profiling

Enable the torch profiler by setting the trace output directory (this can also be set on the command line):

os.environ["VLLM_TORCH_PROFILER_DIR"] = "./vllm_profile"

Start and stop profiling:

llm.start_profile()
outputs = llm.generate(prompts, sampling_params)
llm.stop_profile()
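Putting the two offline steps together, a minimal sketch (the helper name is illustrative; `start_profile()`, `stop_profile()`, and `generate()` are the methods this PR adds/uses):

```python
import os

# Must be set before the LLM is constructed so the profiler is initialized
# (equivalently, export VLLM_TORCH_PROFILER_DIR in the shell).
os.environ["VLLM_TORCH_PROFILER_DIR"] = "./vllm_profile"

def generate_with_profile(llm, prompts, sampling_params):
    """Wrap a single generate() call in start_profile()/stop_profile()
    so the trace covers exactly that call."""
    llm.start_profile()
    try:
        outputs = llm.generate(prompts, sampling_params)
    finally:
        # Stop even on error so the trace file is flushed to disk.
        llm.stop_profile()
    return outputs
```

With vLLM itself this would be called with an `LLM(...)` instance and `SamplingParams`; the trace files then land under `./vllm_profile`.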

Online profiling

Start server with profiling enabled:

VLLM_RPC_TIMEOUT=1800000 VLLM_TORCH_PROFILER_DIR=./vllm_profile python3 -m vllm.entrypoints.openai.api_server --model /models/llama-7b-chat --max-model-len=2048 --block-size=128

Start and stop the profiler from the client side:

import requests

# `client` is any OpenAI-compatible client pointed at the server above
requests.post("http://0.0.0.0:8000/start_profile")
client.completions.create(...)
requests.post("http://0.0.0.0:8000/stop_profile")
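The client-side pattern above can be wrapped in a small helper. This is a sketch; the function name and the injectable `post` argument are illustrative, not part of the vLLM API — only the `/start_profile` and `/stop_profile` endpoints come from the PR:

```python
# Illustrative client-side helper: wrap any request between the
# start/stop profile endpoints so the trace covers exactly that call.
BASE_URL = "http://0.0.0.0:8000"

def with_profiling(make_request, post, base_url=BASE_URL):
    """POST /start_profile, run the request, then POST /stop_profile.

    `post` is any callable taking a URL (e.g. requests.post); it is
    injected so the helper stays transport-agnostic and testable.
    """
    post(f"{base_url}/start_profile")
    try:
        return make_request()
    finally:
        # Stop even if the request fails so the trace is written out.
        post(f"{base_url}/stop_profile")
```

Usage with the snippet above would look like `with_profiling(lambda: client.completions.create(...), requests.post)`.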

👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: make sure your code passes all of the linting checks, otherwise your PR cannot be merged. To do so, first install the linting requirements, then run format.sh and commit the changes. This can be done with uv directly:

uv pip install --group lint

Or this can be done with pip:

uv pip compile --group lint > requirements-lint.txt
pip install -r requirements-lint.txt
bash format.sh

Now you are good to go 🚀

@marceloamaral marceloamaral left a comment


Looks great @mcalman!

Just want to mention that Spyre events will only be collected if PyTorch is using a Kineto build that supports Spyre events.

@marceloamaral

@dpatel-ops could you please have a look at this PR?

@joerunde (Collaborator)

Not gonna lie, I'm not an expert in torch profiling.

With the internal package installed, I spun this up, profiled it, and loaded the profile in Perfetto:
[screenshot]

Then in TensorBoard I see the GPU listed as AIU 🌶️🌶️🌶️
[screenshot]

Reverting the profiler install and running a new profile, I see only CPU devices listed:
[screenshot]

so, LGTM!

@joerunde (Collaborator) left a comment

🚀

@joerunde joerunde enabled auto-merge (squash) May 23, 2025 22:16
@github-actions github-actions bot added the ready label May 23, 2025
@joerunde joerunde merged commit 0d0d611 into vllm-project:main May 23, 2025
22 checks passed
@mcalman mcalman deleted the profiler branch July 28, 2025 19:39