-
Notifications
You must be signed in to change notification settings - Fork 171
[Question]: GLM-5-W8A8如何配置tool-call-parser和reasoning_parser #1004
Description
❓ Describe the question
启动脚本
`
[root@localhost home]# cat ./start_glm5.sh
#!/bin/bash
BATCH_SIZE=256
XLLM_PATH="./xllm_v0.8.0/xllm/build/xllm/core/server/xllm"
MODEL_PATH=/home/GLM-5-W8A8/
DRAFT_MODEL_PATH=/home/GLM-5-MTP/
MASTER_NODE_ADDR="0.0.0.0:10015"
LOCAL_HOST="0.0.0.0"
START_PORT=18994
START_DEVICE=0
LOG_DIR="logs"
NNODES=16
for (( i=0; i<$NNODES;i++))
do
PORT=$((START_PORT + i))
DEVICE=$((START_DEVICE + i))
LOG_FILE="$LOG_DIR/node_$i.log"
nohup numactl -C $((DEVICE24))-$((DEVICE24+23)) $XLLM_PATH
--model $MODEL_PATH
--port $PORT
--devices="npu:$DEVICE"
--master_node_addr=$MASTER_NODE_ADDR
--nnodes=$NNODES
--node_rank=$i
--max_memory_utilization=0.85
--max_tokens_per_batch=8192
--max_seqs_per_batch=64
--block_size=128
--enable_prefix_cache=false
--enable_chunked_prefill=true
--communication_backend="hccl"
--enable_schedule_overlap=true
--enable_graph=true
--enable_graph_no_padding=true
--enable_mla=true
--draft_model=$DRAFT_MODEL_PATH
--draft_devices="npu:$DEVICE"
--num_speculative_tokens=1
--ep_size=16
--dp_size=1
--tool-call-parser=glm47
--reasoning_parser=glm45
> $LOG_FILE 2>&1 &
done
配置完后,能通过postman发个hello没有问题,但是让claude code读取源码分析项目卡死:
你好
● The user is asking about the codebase structure. I'll use the Read tool to read the file /root/CLAUDE.md file.I'll use the Read tool to read the file /root/CLAUDE.md file.The Read tool to read the file /root/CLAUDE.md file. /
I can read the file /root/CLAUDE.md file using the Read tool.Read tool (file_path="/root/CLAUDE.md")file_path: /root/CLAUDE.md
limit: 1000
offset: 0
multiline: false
output_mode: "content"pattern: "CLAUDE.md"pattern: "CLAUDE.md"glob:: "/.md"glob: {"pattern": "/*.md", "path": "/root"}limit: 1000
offset: 0
multiline: false
output_mode: "content", path: "/root"}
请读取当前BMC代码,告诉我项目架构设计
✢ Crafting… (esc to interrupt · 15m 16s · ↓ 0 tokens)
`