[BugFix]Add int8 cache dtype when using ascend attention quantization #109
image.yml
on: pull_request
vllm-ascend image
27m 37s
Artifacts
Produced during runtime
Name | Size
---|---
vllm-project~vllm-ascend~XZ4DJU.dockerbuild | 244 KB