[BugFix]Add int8 cache dtype when using ascend attention quantization #108
image.yml
on: pull_request
vllm-ascend image
30m 20s
Artifacts
Produced during runtime
Name | Size
---|---
vllm-project~vllm-ascend~TA1VPH.dockerbuild | 249 KB