Replies: 1 comment
I think this is what you meant to ask:
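import torch.nn as nn

# Custom Critic model (value network), inheriting from the base language model MiniMindLM.
# Its role is to estimate the value V(s) of the current generation state.
class CriticModel(MiniMindForCausalLM):
    def __init__(self, params):
        super().__init__(params)
        # Replace the original language-model output head (lm_head, which outputs
        # vocabulary-sized logits) with a linear layer.
        # This layer maps each hidden state to a single scalar (the value of that state).
        self.value_head = nn.Linear(params.hidden_size, 1)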
Why does hidden_states = self.model.norm(outputs[0]) take outputs[0] as its input rather than outputs itself?
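One likely explanation, assuming the backbone's forward pass returns a tuple (hidden states first, followed by extras such as a cache or an auxiliary loss) rather than a bare tensor: the norm layer can only work on the hidden-state tensor, so the code has to pick element 0 out of that container. The sketch below illustrates the idea with plain-Python stand-ins so it runs without any dependencies. `toy_backbone`, `rms_norm`, and `value_head` are hypothetical names for illustration, not MiniMind's actual code.

```python
import math
from typing import List, Optional, Tuple

Vector = List[float]


def toy_backbone(token_ids: List[int]) -> Tuple[List[Vector], Optional[list], float]:
    """Hypothetical backbone: returns (hidden_states, cache, aux_loss), not a bare tensor."""
    hidden = [[float(t), float(t) + 1.0] for t in token_ids]
    return hidden, None, 0.0


def rms_norm(hidden: List[Vector], eps: float = 1e-6) -> List[Vector]:
    """Scale each position's hidden vector by its root mean square."""
    normed = []
    for vec in hidden:
        rms = math.sqrt(sum(x * x for x in vec) / len(vec) + eps)
        normed.append([x / rms for x in vec])
    return normed


def value_head(hidden: List[Vector], weights: Vector, bias: float = 0.0) -> List[float]:
    """Linear map from each hidden vector to one scalar, like nn.Linear(hidden_size, 1)."""
    return [sum(w * x for w, x in zip(weights, vec)) + bias for vec in hidden]


outputs = toy_backbone([1, 2, 3])        # a tuple, not the hidden states themselves
hidden_states = rms_norm(outputs[0])     # element 0 is the hidden-state "tensor"
values = value_head(hidden_states, weights=[0.5, -0.5])  # one value per position

try:
    rms_norm(outputs)                    # passing the whole tuple breaks the norm
    norm_accepts_tuple = True
except TypeError:
    norm_accepts_tuple = False
```

If the real backbone follows the same convention, `outputs` is just the container and `outputs[0]` is the per-token hidden states that the norm and the value head expect. Printing `type(outputs)` inside the critic's forward pass would confirm this for the actual model.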