vllm-project · sducouedic · Jul 18, 2025 · Jul 17, 2025
@@ -18,7 +18,10 @@ nav:
     - Developer Guide:
       - Contributing: contributing/README.md
       - Continuous Batching:
-        - Testing: contributing/continuous_batching/tests.md
+        - Tests:
+          - Output Tests: contributing/continuous_batching/tests/output_tests.md
+          - Scheduler Steps Tests: contributing/continuous_batching/tests/scheduler_steps_tests.md
+          - Other Tests: contributing/continuous_batching/tests/other_tests.md
 
   - Getting Started:
     - Installation: getting_started/installation.md
@@ -34,4 +37,7 @@ nav:
   - Developer Guide:
     - Contributing: contributing/README.md
     - Continuous Batching:
-      - Testing: contributing/continuous_batching/tests.md
+      - Tests:
+        - Output Tests: contributing/continuous_batching/tests/output_tests.md
+        - Scheduler Steps Tests: contributing/continuous_batching/tests/scheduler_steps_tests.md
+        - Other Tests: contributing/continuous_batching/tests/other_tests.md
@@ -55,6 +55,9 @@ For additional features and advanced configurations, refer to the official [MkDo
 
 ## Testing
 
+!!! tip
+    If a test fails, you can debug it by setting `DISABLE_ASSERTS = True` in `spyre_util.py` and rerunning the test with `pytest --capture=no tests/spyre/test_spyre_basic.py`. Reset `DISABLE_ASSERTS` to `False` once you are done debugging.
+
 ### Testing Locally on CPU (No Spyre card)
 
 Optionally, download the `ibm-ai-platform/micro-g3.3-8b-instruct-1b` model:

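The `DISABLE_ASSERTS` tip above describes a module-level debug switch. The diff does not show how `spyre_util.py` uses the flag, so the following is only a minimal sketch of the general pattern, not the project's implementation. The helper `check_equal` and its `disable_asserts` parameter are hypothetical names introduced for illustration.

```python
# Hypothetical sketch of a debug switch; the real logic in spyre_util.py may differ.
DISABLE_ASSERTS = False  # set to True while debugging, reset to False afterwards


def check_equal(name, actual, expected, disable_asserts=None):
    """Compare two values and either fail fast or only report the mismatch."""
    if disable_asserts is None:
        # Fall back to the module-level flag when no override is given.
        disable_asserts = DISABLE_ASSERTS
    matches = actual == expected
    if not matches:
        # pytest hides this output for passing tests unless run with --capture=no.
        print(f"[mismatch] {name}: actual={actual!r} expected={expected!r}")
    if not disable_asserts:
        assert matches, f"{name} differs"
    return matches
```

With the flag off, the first mismatch aborts the test. With it on, every mismatch is printed and the test keeps going, which is why the tip pairs the flag with `--capture=no`.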
@@ -0,0 +1,26 @@
+# Other Tests
+
+!!! note
+    Unless otherwise specified, all continuous batching tests run with `max_model_len=256`.
+
+::: tests.e2e.test_spyre_cb
+    options:
+        show_root_heading: true
+
+::: tests.e2e.test_spyre_async_llm
+    options:
+        show_root_heading: true
+        members:
+        - test_abort
+
+::: tests.e2e.test_spyre_max_new_tokens
+    options:
+        show_root_heading: true
+        members:
+        - test_output
+
+::: tests.e2e.test_spyre_online
+    options:
+        show_root_heading: true
+        members:
+        - test_openai_serving_cb
@@ -0,0 +1,10 @@
+# Output Tests
+
+!!! note
+    Unless otherwise specified, all continuous batching tests run with `max_model_len=256`.
+
+::: tests.e2e.test_spyre_basic
+    options:
+        members:
+        - test_output
+        - test_batch_handling
@@ -0,0 +1,9 @@
+# Scheduler Steps Tests
+
+!!! note
+    Unless otherwise specified, all continuous batching tests run with `max_model_len=256`.
+
+!!! warning
+    Final output correctness is not verified in these tests (TODO: should it be, at least for some of them?)
+
+::: tests.e2e.test_spyre_cb_scheduler_steps
@@ -45,11 +45,12 @@ def test_output(
     The same prompts are also input to HF. The generated output
     including text, token ids, and logprobs, is verified to be
     identical for vLLM and HF.
-
-    If errors occur, these can be analyzed/debugged by setting
-    'DISABLE_ASSERTS = True' in spyre_util.py and by rerunning the
-    test using 'pytest --capture=no tests/spyre/test_spyre_basic.py'
-    After debugging, DISABLE_ASSERTS should be reset to 'False'.
+
+    Configuration for CB - parameters are combinatorial:
+        * max_num_seqs: 4
+        * tensor parallelism: 1, 2, 4, 8
+        * number of prompts: 4 (Chicken soup prompts)
+        * max tokens: 20 (same for all the prompts)
     '''
 
     skip_unsupported_tp_size(tp_size, backend)
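The `test_output` docstring says its parameters are combinatorial, meaning every value of one parameter is paired with every value of the others. In pytest this is what stacking several `@pytest.mark.parametrize` decorators produces. The test's actual decorators are not part of this diff, so the sketch below only illustrates the expansion with `itertools.product`. The backend names are hypothetical placeholders; the other values come from the docstring.

```python
from itertools import product

# Values taken from the test_output docstring.
MAX_NUM_SEQS = [4]
TP_SIZES = [1, 2, 4, 8]
# Hypothetical backend names, used only to make the cross product visible.
BACKENDS = ["backend_a", "backend_b"]


def expand_configs():
    """Return one dict per test case in the cross product of all parameters."""
    return [
        {"max_num_seqs": m, "tp_size": tp, "backend": b}
        for m, tp, b in product(MAX_NUM_SEQS, TP_SIZES, BACKENDS)
    ]


configs = expand_configs()
print(len(configs))  # 1 * 4 * 2 = 8 test cases
```

Under these assumptions, adding one more value to any list multiplies the number of generated cases, which is worth keeping in mind when extending the parameter sets.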
@@ -156,7 +157,13 @@ def test_batch_handling(model: str, backend: str, cb: int,
                         monkeypatch: pytest.MonkeyPatch):
     """Test that the spyre worker correctly handles
     continuous batches of requests that
-    finish after different numbers of forward passes"""
+    finish after different numbers of forward passes
+
+    Configuration for CB - parameters are combinatorial:
+        * max_num_seqs: 2
+        * number of prompts: 4 (Chicken soup prompts)
+        * max tokens: [5, 20, 10, 5]
+    """
 
     prompts = get_chicken_soup_prompts(4)
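To see why the `test_batch_handling` configuration (`max_num_seqs: 2`, max tokens `[5, 20, 10, 5]`) makes requests finish after different numbers of forward passes, a toy model can help. The sketch below is not the Spyre scheduler: it assumes each running request produces exactly one token per step, ignores prefill, and admits waiting requests in order as soon as a slot frees up. The function name `simulate_finish_steps` is hypothetical.

```python
from collections import deque


def simulate_finish_steps(max_tokens, max_num_seqs):
    """Toy continuous-batching model: return {request index: step it finishes}."""
    waiting = deque(enumerate(max_tokens))
    running = {}  # request index -> tokens still to generate
    finish_step = {}
    step = 0
    while waiting or running:
        step += 1
        # Fill free batch slots with waiting requests, in arrival order.
        while waiting and len(running) < max_num_seqs:
            idx, budget = waiting.popleft()
            running[idx] = budget
        # Every running request generates one token in this step.
        for idx in list(running):
            running[idx] -= 1
            if running[idx] <= 0:
                finish_step[idx] = step
                del running[idx]
    return finish_step


finish = simulate_finish_steps([5, 20, 10, 5], max_num_seqs=2)
print(dict(sorted(finish.items())))  # {0: 5, 1: 20, 2: 15, 3: 20}
```

In this simplified model the first request leaves the batch after 5 steps, the third request takes its slot and finishes at step 15, and the fourth joins at step 16 and ends together with the long-running second request at step 20. The real step counts depend on how the scheduler handles prefill and admission.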