docs/contributing/continuous_batching/overview.md
13 additions & 11 deletions
@@ -39,7 +39,7 @@ For `long_context.py`: the same parameters, but with some differences:
* [Output Tests](tests/output_tests.md): Check the correctness of the end output logits/tokens of sequences run with continuous batching enabled
* [Scheduler Steps Tests](tests/scheduler_steps_tests.md): Check the correctness of the step-by-step execution of continuous batching for different scenarios of prompt lengths and requested tokens
-* [Other Tests](tests/other_tests.md): Other tests verifing the various behaviours of vLLM, when running with continuous batching enabled
+* [Other Tests](tests/other_tests.md): Other tests verifying the various behaviours of vLLM, when running with continuous batching enabled
* **Purpose:** Automated execution to verify that a specific behaviour acts as expected (passing/failing)
@@ -48,18 +48,21 @@ For `long_context.py`: the same parameters, but with some differences:
* `-m "spyre and cb"`: runs only the tests with configurations marked as both "spyre" and "cb"
!!! tip
-    To run a test with a different model than the default `ibm-ai-platform/micro-g3.3-8b-instruct-1b`, you can run the test with `VLLM_SPYRE_TEST_MODEL_LIST` environment variable set to the targer model, for example:
+    To run a test with a different model than the default `ibm-ai-platform/micro-g3.3-8b-instruct-1b`, you can run the test with the `VLLM_SPYRE_TEST_MODEL_LIST` environment variable set to the target model, for example:
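    A minimal sketch of such an invocation, assuming a hypothetical target model and an illustrative `tests` path (both are placeholders, not project defaults); exporting the variable in the shell before calling `pytest` works the same way:

    ```python
    # Illustrative only: the model name and test path are placeholders, not project defaults.
    import os

    import pytest

    # Equivalent to exporting VLLM_SPYRE_TEST_MODEL_LIST in the shell before running pytest.
    os.environ["VLLM_SPYRE_TEST_MODEL_LIST"] = "ibm-granite/granite-3.3-8b-instruct"  # hypothetical target model

    # Run only the configurations marked as both "spyre" and "cb", as described above.
    raise SystemExit(pytest.main(["tests", "-v", "-m", "spyre and cb"]))
    ```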
@@ -97,7 +100,6 @@ Output tests checks the correctness of the output of CB on a set of prompts. For
This applies to the sendnn backend; on CPU, the tokens additionally need to be exactly the same for the test to pass
* The test passes if: the logprobs from HF on CPU and from vLLM (on Spyre or CPU, depending on the backend) are compared pairwise, and all of the relative differences must be below a threshold: `math.isclose(hf_logprob, vllm_logprob, rel_tol=0.35)`. Otherwise the test fails. There is no logic that accounts for the tokens becoming different at some point, which would make the logits diverge.
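A minimal sketch of this pass/fail rule, assuming hypothetical token and logprob sequences already collected from HF and vLLM (the function and parameter names are illustrative, not from the test suite):

```python
import math


def outputs_match(hf_tokens, vllm_tokens, hf_logprobs, vllm_logprobs, backend="sendnn"):
    """Illustrative pass/fail rule for the output tests described above."""
    # On the CPU backend the generated tokens must additionally match exactly.
    if backend == "cpu" and hf_tokens != vllm_tokens:
        return False
    # Every pairwise relative difference must stay within the 0.35 tolerance.
    return all(
        math.isclose(hf_lp, vllm_lp, rel_tol=0.35)
        for hf_lp, vllm_lp in zip(hf_logprobs, vllm_logprobs)
    )
```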
#### Scheduler Steps Tests
See [Scheduler Steps Tests](tests/scheduler_steps_tests.md)