[BugFix] Add int8 cache dtype when using attention quantization #112
Workflow: image.yml (on: pull_request)
Job: vllm-ascend image, 36m 7s
Artifacts
Produced during runtime
Name | Size
---|---
vllm-project~vllm-ascend~W1VCGO.dockerbuild | 175 KB
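
For context, a minimal sketch of what the title describes, not the actual vllm-ascend implementation: when attention (KV-cache) quantization is enabled, the cache dtype should resolve to int8 instead of inheriting the model dtype. The helper name `resolve_kv_cache_dtype` and the `attention_quantized` flag on the quantization config are illustrative assumptions.

```python
import torch


def resolve_kv_cache_dtype(model_dtype: torch.dtype, quant_config) -> torch.dtype:
    """Pick the KV-cache dtype (illustrative sketch).

    If attention quantization is enabled, allocate the cache as int8
    rather than following the model dtype. `quant_config` is a
    hypothetical object with an `attention_quantized` flag standing in
    for whatever quantization config the engine actually passes around.
    """
    if quant_config is not None and getattr(quant_config, "attention_quantized", False):
        return torch.int8
    return model_dtype


# Tiny usage check with a dummy config object.
class _DummyQuantConfig:
    attention_quantized = True


assert resolve_kv_cache_dtype(torch.float16, _DummyQuantConfig()) is torch.int8
assert resolve_kv_cache_dtype(torch.float16, None) is torch.float16
```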