Open
Description
In the rollout() function, why are the completions returned by tokenizer.batch_decode token IDs rather than a list of strings?
sequence_ids = model.generate(**model_inputs, generation_config=generation_config) # [8, 1024]
completions = tokenizer.batch_decode(
sequence_ids[:, input_ids.shape[1]:], skip_special_tokens=True
)
After running overnight, the decoded output is still a list of tokens like the one below, not strings:
'<|SPEECH_GENERATION_START|><|s_64943|><|s_33339|><|s_63590|>,...'
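For what it's worth, each decoded element above is still a Python `str`; its content is just the textual names of speech tokens rather than natural language. A minimal sketch for checking this, assuming the `<|s_N|>` naming seen in the output above (the helper `extract_speech_ids` is hypothetical and not part of the repository):

```python
import re

# Matches speech-token markers as they appear in the decoded output above.
SPEECH_TOKEN = re.compile(r"<\|s_(\d+)\|>")

def extract_speech_ids(completion: str) -> list[int]:
    """Return the integer ids embedded in `<|s_N|>` markers (hypothetical helper)."""
    return [int(m) for m in SPEECH_TOKEN.findall(completion)]

# Shortened copy of the decoded completion shown above.
sample = "<|SPEECH_GENERATION_START|><|s_64943|><|s_33339|><|s_63590|>"

print(type(sample).__name__)       # str
print(extract_speech_ids(sample))  # [64943, 33339, 63590]
```

If this returns a non-empty list for most completions, the model is probably emitting speech tokens rather than text. In that case the problem may lie in the model or prompt format rather than in `batch_decode` itself.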
However, the reward computation that follows matches against strings, so answer_match is always None and the reward is always 0:
answer_match = re.search(
r"(.*?)",
completion,
flags=re.DOTALL,
)
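The mismatch can be reproduced in isolation with the sketch below. It assumes the reward looks for an answer wrapped in `<answer>...</answer>` tags. That tag name is an assumption for illustration only, because the pattern in the snippet above appears without delimiters. A bare `(.*?)` would always match the empty string and never return None. The function `answer_reward` is likewise hypothetical.

```python
import re

# Assumed pattern: the tag name is hypothetical, not taken from the snippet above.
ANSWER_PATTERN = re.compile(r"<answer>(.*?)</answer>", flags=re.DOTALL)

def answer_reward(completion: str, oracle_answer: str) -> float:
    """Toy reward: 1.0 if the tagged answer equals the oracle answer, else 0.0."""
    match = ANSWER_PATTERN.search(completion)
    if match is None:
        # No tagged answer in the completion, so no reward.
        return 0.0
    return 1.0 if match.group(1).strip() == oracle_answer else 0.0

speech_completion = "<|SPEECH_GENERATION_START|><|s_64943|><|s_33339|>"
text_completion = "<think>2 + 2</think><answer>4</answer>"

print(answer_reward(speech_completion, "4"))  # 0.0
print(answer_reward(text_completion, "4"))    # 1.0
```

Under this assumption, a completion made up only of speech-token markers can never contain the expected delimiters, which would explain a reward that stays at 0 however long training runs.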
Even after training for a long time, returns stay at 0.
What could be the cause?
