
Quantized model single-GPU inference error #47

Open
hanyc0914 opened this issue Jun 12, 2023 · 2 comments

hanyc0914 commented Jun 12, 2023

CUDA_VISIBLE_DEVICES=0 python tigerbot_infer.py ${MODEL_DIR} --wbits 4 --groupsize 128 --load ${MODEL_DIR}/tigerbot-7b-4bit-128g.pt
(screenshot of the error message)

System information:
torch 1.13.1+cu117
cuda 11.4
triton 2.0.0.post1
Python 3.10.9

GPU model:
GP104GL Tesla P4

Is it possible to run inference with the quantized model on the CPU? How should the command be modified?

@Vivicai1005
Contributor

Hi, our command does not currently support CPU inference. If you want to run inference on the CPU, you can try changing the DEV variable in the tigerbot_infer.py code to DEV = torch.device('cpu').
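A minimal sketch of the suggested change, assuming DEV is the module-level device handle used throughout the script (the fallback line below is illustrative, not the actual tigerbot_infer.py source):

```python
import torch

# Suggested change: point the script's device handle at the CPU.
DEV = torch.device('cpu')

# Optional (illustrative): fall back to GPU automatically when one is available.
# DEV = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
```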

hanyc0914 (Author) commented Jun 15, 2023

> Hi, our command does not currently support CPU inference. If you want to run inference on the CPU, you can try changing the DEV variable in the tigerbot_infer.py code to DEV = torch.device('cpu').

I changed DEV and every place where .cuda() appears, but it still reports an error:
(screenshot of the error message)
Also, what is the cause of the first error, the MMA error reported when running inference on the GPU?
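For reference, a hedged sketch of the substitution described above; the tensor is a placeholder and the loading line only illustrates the pattern, it is not the actual tigerbot_infer.py code:

```python
import torch

DEV = torch.device('cpu')

# Hypothetical tensor standing in for the model inputs; in the script the
# same pattern applies to the model and every tensor that used .cuda().
x = torch.randn(2, 3)
x = x.to(DEV)  # replaces x = x.cuda()

# Checkpoints saved from a GPU also need to be remapped when loading on CPU,
# otherwise torch.load tries to place the weights on a CUDA device.
state_dict = torch.load('tigerbot-7b-4bit-128g.pt', map_location=DEV)
```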
