Add tensor parallel tests for online server #59
Conversation
Signed-off-by: Rafael Vasquez <[email protected]>
👋 Hi! Thank you for contributing to vLLM support on Spyre.
Now you are good to go 🚀
Signed-off-by: Rafael Vasquez <[email protected]>
@pytest.mark.parametrize("backend", get_spyre_backend_list())
@pytest.mark.parametrize("tensor_parallel_size", [2])
@pytest.mark.parametrize("vllm_version", ["V0", "V1"])
def test_openai_serving(model, warmup_shape, backend, tensor_parallel_size,
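As an aside, stacked `parametrize` decorators like the ones above expand to the cross-product of their value lists. A self-contained sketch of how pytest generates the cases (the backend list here is a stub, since `get_spyre_backend_list()` lives in the test utilities and its real contents are not shown in this diff):

```python
import itertools

# Stub standing in for get_spyre_backend_list() from tests/spyre_util.py;
# the real backend names may differ.
backends = ["eager", "inductor"]
tensor_parallel_sizes = [2]
vllm_versions = ["V0", "V1"]

# Stacked @pytest.mark.parametrize decorators multiply: the test runs once
# per combination of (backend, tensor_parallel_size, vllm_version).
cases = list(itertools.product(backends, tensor_parallel_sizes, vllm_versions))
# 2 backends x 1 TP size x 2 versions = 4 generated test cases
```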
Nice!
Any thoughts on moving this over to test_spyre_tensor_parallel.py? I kinda think we should try to start organizing these files a bit instead of creating more of them. I can also move my basic online tests over to test_spyre_basic.py.
I think that's cleaner, I just wonder if in the future you could see a situation where online tests would have to be refactored/separated from offline. Up to you
Also, online tests are passing/exiting properly without --forked
> I think that's cleaner, I just wonder if in the future you could see a situation where online tests would have to be refactored/separated from offline. Up to you
Yeah... that could end up being the case as well 🤷
We can leave test organization until later
tests/spyre_util.py
seed: Optional[int] = 0,
auto_port: bool = True,
max_wait_seconds: Optional[float] = None,
tensor_parallel_size: Optional[int] = 1) -> None:
Since this is only used to set args in vllm_serve_args, can we instead call this like:
with RemoteOpenAIServer(model, ["--tensor-parallel-size", "1"], ...)?
I updated the util and the test so that it's called the way you suggested, but still driven by a pytest parameter.
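For illustration, a minimal sketch of what this change amounts to: translating the pytest parameter into the CLI flag list the server helper takes. The helper name `build_server_args` is hypothetical, and the `RemoteOpenAIServer` arg-list interface is assumed from the reviewer's suggestion rather than shown in this diff:

```python
def build_server_args(tensor_parallel_size: int) -> list:
    # Hypothetical helper: turn the pytest parameter into "vllm serve"
    # CLI flags, matching the reviewer's suggested call style.
    return ["--tensor-parallel-size", str(tensor_parallel_size)]

# Intended usage (assumed signature from tests/spyre_util.py):
#   with RemoteOpenAIServer(model, build_server_args(tp_size)) as server:
#       ...
```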
Signed-off-by: Rafael Vasquez <[email protected]>
🌶️
Adds an online tensor-parallel (TP) test case; closes #57.