[BugFix] Add int8 cache dtype when using attention quantization #126
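The fix itself is not shown on this run page; as a rough illustration of the idea in the title only, here is a minimal sketch of selecting an int8 KV-cache dtype when attention quantization is enabled. All names below are hypothetical and are not the actual vllm-ascend API.

```python
# Hypothetical sketch: fall back to an int8 KV-cache dtype when attention
# (KV-cache) quantization is enabled; otherwise keep the configured dtype.
# None of these names come from vllm-ascend; they are illustrative only.
from dataclasses import dataclass

import torch


@dataclass
class CacheConfig:
    kv_cache_quant: bool = False            # attention/KV-cache quantization switch
    cache_dtype: torch.dtype = torch.float16


def resolve_cache_dtype(cfg: CacheConfig) -> torch.dtype:
    """Return torch.int8 for the KV cache when quantization is on."""
    return torch.int8 if cfg.kv_cache_quant else cfg.cache_dtype


print(resolve_cache_dtype(CacheConfig(kv_cache_quant=True)))   # torch.int8
print(resolve_cache_dtype(CacheConfig(kv_cache_quant=False)))  # torch.float16
```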
Workflow: image.yml (triggered on: pull_request)
Job: vllm-ascend image (31m 58s)

Artifacts (produced during runtime)
Name | Size
---|---
vllm-project~vllm-ascend~9UYGZ2.dockerbuild | 125 KB