Merged
10 changes: 8 additions & 2 deletions docs/.nav.yml
@@ -18,7 +18,10 @@ nav:
   - Developer Guide:
     - Contributing: contributing/README.md
     - Continuous Batching:
-      - Testing: contributing/continuous_batching/tests.md
+      - Tests:
+        - Output Tests: contributing/continuous_batching/tests/output_tests.md
+        - Scheduler Steps Tests: contributing/continuous_batching/tests/scheduler_steps_tests.md
+        - Other Tests: contributing/continuous_batching/tests/other_tests.md

   - Getting Started:
     - Installation: getting_started/installation.md
@@ -34,4 +37,7 @@ nav:
   - Developer Guide:
     - Contributing: contributing/README.md
     - Continuous Batching:
-      - Testing: contributing/continuous_batching/tests.md
+      - Tests:
+        - Output Tests: contributing/continuous_batching/tests/output_tests.md
+        - Scheduler Steps Tests: contributing/continuous_batching/tests/scheduler_steps_tests.md
+        - Other Tests: contributing/continuous_batching/tests/other_tests.md
3 changes: 3 additions & 0 deletions docs/contributing/README.md
@@ -55,6 +55,9 @@ For additional features and advanced configurations, refer to the official [MkDo

 ## Testing

+!!! tip
+    When tests fail, you can debug them by setting `DISABLE_ASSERTS = True` in spyre_util.py and rerunning the test with `pytest --capture=no tests/e2e/test_spyre_basic.py`. After debugging, reset `DISABLE_ASSERTS` to `False`.
+
 ### Testing Locally on CPU (No Spyre card)

 Optionally, download the `ibm-ai-platform/micro-g3.3-8b-instruct-1b` model:
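Reviewer note: the tip added to docs/contributing/README.md relies on a module-level flag. The following is a minimal sketch of how such a flag could gate assertions. The helper name `check_equal` and its signature are hypothetical; the actual logic in spyre_util.py is not part of this diff and may differ. The sketch also shows why `--capture=no` matters: with asserts disabled the test passes, and pytest only displays captured output for failing tests, so the printed diagnostics would otherwise be hidden.

```python
# Hypothetical stand-in for the flag defined in spyre_util.py.
DISABLE_ASSERTS = False


def check_equal(name, expected, actual, disable_asserts=None):
    """Compare two values and print a diagnostic on mismatch.

    Raises AssertionError unless asserts are disabled, in which case
    the mismatch is only reported and False is returned.
    """
    if disable_asserts is None:
        disable_asserts = DISABLE_ASSERTS
    if expected == actual:
        return True
    print(f"[mismatch] {name}: expected {expected!r}, got {actual!r}")
    if not disable_asserts:
        raise AssertionError(f"{name} differs")
    return False


# Debug mode: every mismatch is printed and the run continues,
# so all differences can be inspected in a single pass.
results = [
    check_equal("text", "chicken soup", "chicken soup", disable_asserts=True),
    check_equal("token_ids", [1, 2, 3], [1, 2, 4], disable_asserts=True),
]
print(results)
```

Resetting the flag to `False` afterwards restores the strict behavior, which is why the tip asks for it.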
3 changes: 0 additions & 3 deletions docs/contributing/continuous_batching/tests.md

This file was deleted.

26 changes: 26 additions & 0 deletions docs/contributing/continuous_batching/tests/other_tests.md
@@ -0,0 +1,26 @@
+# Other Tests
+
+!!! note
+    Unless otherwise specified, all continuous batching tests run with `max_model_len=256`.
+
+::: tests.e2e.test_spyre_cb
+    options:
+      show_root_heading: true
+
+::: tests.e2e.test_spyre_async_llm
+    options:
+      show_root_heading: true
+      members:
+        - test_abort
+
+::: tests.e2e.test_spyre_max_new_tokens
+    options:
+      show_root_heading: true
+      members:
+        - test_output
+
+::: tests.e2e.test_spyre_online
+    options:
+      show_root_heading: true
+      members:
+        - test_openai_serving_cb
10 changes: 10 additions & 0 deletions docs/contributing/continuous_batching/tests/output_tests.md
@@ -0,0 +1,10 @@
+# Output Tests
+
+!!! note
+    Unless otherwise specified, all continuous batching tests run with `max_model_len=256`.
+
+::: tests.e2e.test_spyre_basic
+    options:
+      members:
+        - test_output
+        - test_batch_handling
9 changes: 9 additions & 0 deletions docs/contributing/continuous_batching/tests/scheduler_steps_tests.md
@@ -0,0 +1,9 @@
+# Scheduler Steps Tests
+
+!!! note
+    Unless otherwise specified, all continuous batching tests run with `max_model_len=256`.
+
+!!! warning
+    Final output correctness is not verified in these tests (TODO: decide whether it should be, at least for some of them).
+
+::: tests.e2e.test_spyre_cb_scheduler_steps
19 changes: 13 additions & 6 deletions tests/e2e/test_spyre_basic.py
@@ -45,11 +45,12 @@ def test_output(
     The same prompts are also input to HF. The generated output
     including text, token ids, and logprobs, is verified to be
     identical for vLLM and HF.
-
-    If errors occur, these can be analyzed/debugged by setting
-    'DISABLE_ASSERTS = True' in spyre_util.py and by rerunning the
-    test using 'pytest --capture=no tests/spyre/test_spyre_basic.py'
-    After debugging, DISABLE_ASSERTS should be reset to 'False'.
+
+    Configuration for CB - parameters are combinatorial:
+    * max_num_seqs: 4
+    * tensor parallelism: 1, 2, 4, 8
+    * number of prompts: 4 (Chicken soup prompts)
+    * max tokens: 20 (same for all the prompts)
     '''

     skip_unsupported_tp_size(tp_size, backend)
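Reviewer note: "parameters are combinatorial" means the test runs once per element of the cross product of the listed values. The sketch below enumerates that cross product with `itertools.product`; the helper name `expand_configurations` and the axis names are illustrative assumptions, not identifiers from the test suite. Stacking several `pytest.mark.parametrize` decorators on one test produces the same kind of cross product.

```python
from itertools import product


def expand_configurations(**axes):
    """Return one dict per combination of the given parameter axes."""
    names = list(axes)
    return [dict(zip(names, combo)) for combo in product(*axes.values())]


# Values taken from the test_output docstring; the axis names are assumptions.
configs = expand_configurations(max_num_seqs=[4], tp_size=[1, 2, 4, 8])
for config in configs:
    print(config)
```

With one value for `max_num_seqs` and four tensor-parallel sizes, this yields four configurations. Any additional parametrized axis (for example, a backend) multiplies that count.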
@@ -156,7 +157,13 @@ def test_batch_handling(model: str, backend: str, cb: int,
                         monkeypatch: pytest.MonkeyPatch):
     """Test that the spyre worker correctly handles
     continuous batches of requests that
-    finish after different numbers of forward passes"""
+    finish after different numbers of forward passes
+
+    Configuration for CB - parameters are combinatorial:
+    * max_num_seqs: 2
+    * number of prompts: 4 (Chicken soup prompts)
+    * max tokens: [5, 20, 10, 5]
+    """

     prompts = get_chicken_soup_prompts(4)

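Reviewer note: the `test_batch_handling` configuration (two batch slots, four prompts, different `max tokens` per prompt) exercises requests that leave the batch at different times. The toy simulation below illustrates the idea under simplifying assumptions that are mine, not the scheduler's: every running request emits exactly one token per step, prefill is ignored, and a freed slot is refilled before the next step. The function name `simulate_continuous_batching` is hypothetical.

```python
from collections import deque


def simulate_continuous_batching(max_tokens, max_num_seqs):
    """Return {request index: step at which the request finishes}.

    Toy model only: one token per running request per step, and free
    slots are refilled from the waiting queue before each step.
    """
    waiting = deque(range(len(max_tokens)))
    running = {}  # request index -> tokens generated so far
    finished_at = {}
    step = 0
    while waiting or running:
        # Admit waiting requests while batch slots are free.
        while waiting and len(running) < max_num_seqs:
            running[waiting.popleft()] = 0
        step += 1
        # Iterate over a snapshot so finished requests can be removed.
        for request in list(running):
            running[request] += 1
            if running[request] == max_tokens[request]:
                finished_at[request] = step
                del running[request]
    return finished_at


# Values taken from the test_batch_handling docstring.
finish_steps = simulate_continuous_batching([5, 20, 10, 5], max_num_seqs=2)
print(finish_steps)
```

In this toy model, the first request frees its slot after 5 steps, the third request takes it and finishes at step 15, and the fourth request then runs alongside the long second request until both finish at step 20.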