CUDA_VISIBLE_DEVICES=0 python tigerbot_infer.py ${MODEL_DIR} --wbits 4 --groupsize 128 --load ${MODEL_DIR}/tigerbot-7b-4bit-128g.pt
System info: torch 1.13.1+cu117, CUDA 11.4, triton 2.0.0.post1, Python 3.10.9
GPU: GP104GL Tesla P4
Is it possible to run inference with the quantized model on CPU? How should the command above be modified?
Hi, our command does not currently support CPU inference. If you want to run inference on CPU, you can try changing the `DEV` variable in `tigerbot_infer.py` to `DEV = torch.device('cpu')`.
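A minimal sketch of the change suggested above, assuming the script routes tensors and the model through a module-level `DEV` variable (the variable name comes from the thread; the surrounding usage is illustrative):

```python
import torch

# In tigerbot_infer.py, replace the original CUDA device assignment,
# e.g. DEV = torch.device('cuda:0'), with:
DEV = torch.device('cpu')

# Any place that called .cuda() must instead move data with .to(DEV),
# otherwise tensors will still be pushed to the GPU and error out on CPU-only runs:
x = torch.randn(2, 3).to(DEV)   # instead of torch.randn(2, 3).cuda()
print(x.device.type)            # prints "cpu"
```

As the follow-up comment notes, changing `DEV` alone is not enough: every hard-coded `.cuda()` call in the script also has to be replaced for CPU inference to work.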
I changed `DEV` and every place that calls `.cuda()`, but it still errors out. Also, what is the cause of the earlier MMA error during GPU inference?