Failed to run benchmark scripts against the endpoint #783

Open
Jeffwan opened this issue Mar 3, 2025 · 1 comment
Labels
area/gateway, kind/bug, priority/critical-urgent

Comments

@Jeffwan
Collaborator

Jeffwan commented Mar 3, 2025

🐛 Describe the bug

python3 benchmark_serving.py --backend vllm  --model deepseek-ai/deepseek-r1 --trust-remote-code --served-model-name deepseek-r1-671b --base-url http://localhost:8888 --endpoint /v1/completions --num-prompts 100 --request-rate 2 --metric_percentiles '50,90,95,99' --goodput ttft:1000 tpot:100 --max-concurrency 200 --random-input-len 2048 --random-output-len 200 --dataset-name random --ignore-eos 
Starting initial single prompt test run...
RequestFuncOutput(generated_text='', success=False, latency=0.0, output_tokens=0, ttft=0.0, itl=[], tpot=0.0, prompt_len=2048, error='Bad Request')
Traceback (most recent call last):
  File "/Users/bytedance/workspace/vllm/benchmarks/benchmark_serving.py", line 1315, in <module>
    main(args)
  File "/Users/bytedance/workspace/vllm/benchmarks/benchmark_serving.py", line 951, in main
    benchmark_result = asyncio.run(
  File "/Users/bytedance/.pyenv/versions/3.10.10/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/Users/bytedance/.pyenv/versions/3.10.10/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/Users/bytedance/workspace/vllm/benchmarks/benchmark_serving.py", line 602, in benchmark
    raise ValueError(
ValueError: Initial test run failed - Please make sure benchmark arguments are correctly specified. Error: Bad Request

gateway logs

I0303 00:53:49.475583       1 gateway.go:221]
I0303 00:53:49.475604       1 gateway.go:222] "-- In RequestHeaders processing ..." requestID="4cb7758a-aa7c-49e5-a6d0-8243aba62a19"
I0303 00:53:49.475949       1 gateway.go:287] "-- In RequestBody processing ..." requestID="4cb7758a-aa7c-49e5-a6d0-8243aba62a19"
I0303 00:53:49.476224       1 gateway.go:388] "request start" requestID="4cb7758a-aa7c-49e5-a6d0-8243aba62a19" model="deepseek-r1-671b" routingStrategy="random" targetPodIP="192.168.0.74:8000"
I0303 00:53:49.477602       1 gateway.go:407] "-- In ResponseHeaders processing ..." requestID="4cb7758a-aa7c-49e5-a6d0-8243aba62a19"
I0303 00:53:49.477827       1 gateway.go:440] "-- In ResponseBody processing ..." requestID="4cb7758a-aa7c-49e5-a6d0-8243aba62a19" endOfSteam=false
I0303 00:53:49.477858       1 gateway.go:440] "-- In ResponseBody processing ..." requestID="4cb7758a-aa7c-49e5-a6d0-8243aba62a19" endOfSteam=false
I0303 00:53:49.477869       1 gateway.go:440] "-- In ResponseBody processing ..." requestID="4cb7758a-aa7c-49e5-a6d0-8243aba62a19" endOfSteam=true
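One way to read the log excerpt above is to measure how long the gateway waited between routing the request and seeing response headers. The sketch below assumes the standard klog line layout shown in the excerpt; the helper names `stage_times` and `elapsed_us` are hypothetical, not part of the gateway. On the two relevant lines it yields about 1.4 ms, which seems short for a real completion and would be consistent with the request being rejected before the engine handles it.

```python
import re
from datetime import datetime, timedelta

# Assumed klog layout: severity+MMDD, HH:MM:SS.micros, thread id, file:line], "message"
KLOG = re.compile(r'^[IWEF]\d{4} (\d{2}:\d{2}:\d{2}\.\d{6})\s+\d+ \S+:\d+\] "([^"]*)"')

# Two lines copied from the gateway logs in this issue.
SAMPLE = [
    'I0303 00:53:49.476224       1 gateway.go:388] "request start" '
    'requestID="4cb7758a-aa7c-49e5-a6d0-8243aba62a19" model="deepseek-r1-671b" '
    'routingStrategy="random" targetPodIP="192.168.0.74:8000"',
    'I0303 00:53:49.477602       1 gateway.go:407] "-- In ResponseHeaders processing ..." '
    'requestID="4cb7758a-aa7c-49e5-a6d0-8243aba62a19"',
]


def stage_times(lines):
    """Return (message, timestamp) pairs for every line that matches the klog layout."""
    stages = []
    for line in lines:
        match = KLOG.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S.%f")
            stages.append((match.group(2), stamp))
    return stages


def elapsed_us(stages, start_msg, end_msg):
    """Microseconds between the first occurrence of start_msg and of end_msg."""
    start = next(t for msg, t in stages if msg == start_msg)
    end = next(t for msg, t in stages if msg == end_msg)
    return (end - start) // timedelta(microseconds=1)


gap = elapsed_us(stage_times(SAMPLE), "request start", "-- In ResponseHeaders processing ...")
print(f"request start -> response headers: {gap} us")
```

The same helpers can be pointed at a full log dump filtered by requestID to compare a failing request against a working one.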

192.168.0.74 is the head pod, but no request is reaching the engine side. Could this be a streaming issue?
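To test the streaming hypothesis, it may help to bypass the benchmark script and send one small request with and without streaming, printing the full response body rather than only "Bad Request". This is a minimal sketch using only the Python standard library. The base URL, endpoint, and model name come from the command above; the payload is a reduced set of OpenAI-style completion fields and is an assumption, not the exact body that benchmark_serving.py sends. `build_payload` and `probe` are hypothetical helper names.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://localhost:8888"  # --base-url from the benchmark command
ENDPOINT = "/v1/completions"        # --endpoint from the benchmark command
MODEL = "deepseek-r1-671b"          # --served-model-name from the benchmark command


def build_payload(stream, prompt="Hello", max_tokens=8):
    """Build a small completion request (assumed minimal field set)."""
    return {"model": MODEL, "prompt": prompt, "max_tokens": max_tokens, "stream": stream}


def probe(stream, timeout=30.0):
    """POST one request and return (status code, response body as text)."""
    data = json.dumps(build_payload(stream)).encode("utf-8")
    request = urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a body that usually explains the rejection.
        return err.code, err.read().decode("utf-8", errors="replace")
```

Calling `probe(False)` and then `probe(True)` and comparing the status codes and bodies should show whether only the streaming path returns 400. If both fail the same way, the cause is probably not streaming-specific.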

Steps to Reproduce

deepseek-r1.yaml

Expected behavior

The benchmark should run to completion against the endpoint.

Environment

0.2.0

@gau-nernst

This might be related to #757. I discovered that issue when using SGLang's bench_serving, which should be quite similar to vLLM's benchmark_serving.

@Jeffwan added the area/gateway, kind/bug, and priority/critical-urgent labels on Mar 4, 2025