Commit

update readme
yvonwin committed May 8, 2024
1 parent 4ebdf55 commit 0ddd87b
Showing 2 changed files with 2 additions and 6 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -63,6 +63,7 @@ The original model (`-i <model_name_or_path>`) can be a HuggingFace model name o
* Qwen1.5-72B: `Qwen/Qwen1.5-72B-Chat`
* Qwen1.5-MoeA2.7B: `Qwen/Qwen1.5-MoE-A2.7B-Chat`
* Llama-3-8B-Instruct: `meta-llama/Meta-Llama-3-8B-Instruct`
* Llama3-8B-Chinese-Chat: `shenzhi-wang/Llama3-8B-Chinese-Chat`

You are free to try any of the below quantization types by specifying `-t <type>`:
* `q4_0`: 4-bit integer quantization with fp16 scales.
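The `q4_0` type listed above can be illustrated numerically: each block of weights is stored as small integers plus one fp16 scale. The sketch below is a simplified assumption-based illustration of that idea, not the exact ggml `q4_0` memory layout or rounding rule.

```python
import numpy as np

def quantize_q4_0(block: np.ndarray):
    """Quantize a block of floats to 4-bit integers with one fp16 scale.

    Simplified sketch: maps the largest-magnitude value into the
    int4 range [-8, 7]; real ggml q4_0 packs two values per byte.
    """
    amax = np.max(np.abs(block))
    scale = np.float16(amax / 7.0) if amax > 0 else np.float16(1.0)
    q = np.clip(np.round(block / np.float32(scale)), -8, 7).astype(np.int8)
    return scale, q

def dequantize_q4_0(scale, q):
    # Reconstruction is just integer * scale.
    return q.astype(np.float32) * np.float32(scale)

block = np.linspace(-1.0, 1.0, 32, dtype=np.float32)
scale, q = quantize_q4_0(block)
recon = dequantize_q4_0(scale, q)
```

The round trip loses at most half a quantization step per weight, which is why 4-bit models stay close to fp16 quality at a quarter of the size.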
7 changes: 1 addition & 6 deletions README_zh.md
@@ -61,7 +61,7 @@ python3 qwen_cpp/convert.py -i Qwen/Qwen1.5-1.8B-Chat -t q4_0 -o qwen2_1.8b-ggml
* Qwen1.5-72B: `Qwen/Qwen1.5-72B-Chat`
* Qwen1.5-MoeA2.7B: `Qwen/Qwen1.5-MoE-A2.7B-Chat`
* Llama-3-8B-Instruct: `meta-llama/Meta-Llama-3-8B-Instruct`
*Llama3-8B-Chinese-Chat : `shenzhi-wang/Llama3-8B-Chinese-Chat`
* Llama3-8B-Chinese-Chat: `shenzhi-wang/Llama3-8B-Chinese-Chat`

You are free to try any of the below quantization types by specifying `-t <type>`:
* `q4_0`: 4-bit integer quantization with fp16 scales.
@@ -118,11 +118,6 @@ llama3-chinese example
```
In interactive mode, your chat history is carried over as context for the next round of conversation.
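To illustrate how the interactive mode carries history forward: each new turn is generated from a prompt that folds in the previous exchanges. The function and template below are hypothetical illustrations, not qwen.cpp's actual chat template (real templates such as Llama 3's use special tokens rather than plain labels).

```python
def build_prompt(history, user_msg):
    """Fold prior (user, assistant) turns plus the new message into one prompt.

    Illustrative sketch only; the "User:"/"Assistant:" labels are assumptions.
    """
    parts = [f"User: {u}\nAssistant: {a}" for u, a in history]
    parts.append(f"User: {user_msg}\nAssistant:")
    return "\n".join(parts)

history = [("Hello", "Hi! How can I help?")]
prompt = build_prompt(history, "Tell me a joke")
```

Because the whole history is re-fed every round, long conversations eventually hit the model's context window, at which point older turns must be truncated.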

llama3 chinese example

```
```

Run `./build/bin/main -h` to explore more options!

