Replies: 3 comments
- When calling stream_infer, just set both sequence_start and sequence_end to True.
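A minimal sketch of the stateless call described in this reply, assuming the older TurboMind Python API in which stream_infer accepts session_id, sequence_start and sequence_end; the model path, prompt, tokenizer choice and the remaining argument names are placeholders and may differ between lmdeploy versions:

```python
# Sketch: open and close the session in a single stream_infer call by setting
# sequence_start=True and sequence_end=True, so no session state survives the
# request. Paths, the prompt and most argument names are placeholders.
from lmdeploy import turbomind as tm
from transformers import AutoTokenizer

model_path = '/path/to/turbomind/model'          # placeholder path
tokenizer = AutoTokenizer.from_pretrained(model_path)

tm_model = tm.TurboMind(model_path)              # constructor args vary by version
generator = tm_model.create_instance()

input_ids = tokenizer.encode('Hello, who are you?')

for outputs in generator.stream_infer(session_id=0,
                                      input_ids=[input_ids],
                                      request_output_len=256,
                                      sequence_start=True,
                                      sequence_end=True):
    # the structure of `outputs` differs across lmdeploy releases; in older
    # ones it is a list of (output_token_ids, token_count) tuples
    print(outputs)
```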
- If sequence_end is set to True, multi-turn conversation is effectively turned off. How can I disable the kv cache while keeping multi-turn conversation enabled? When loading a huggingface model, the kv cache can be toggled via the use_cache option. Does lmdeploy have a similar configuration item? @lvhan028
- sequence_end does not mean that multi-turn conversation is disabled. It means the caller (the user) is responsible for concatenating the historical prompts with the current prompt.
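In other words, with sequence_start=True and sequence_end=True on every request, multi-turn chat still works, but the history lives on the caller's side. A hypothetical sketch of that caller-side concatenation (the chat template, helper names and messages below are illustrative, not lmdeploy API):

```python
# Caller-side history management when each request is stateless
# (sequence_start=True, sequence_end=True). The template below is a
# simplified placeholder; real models require their own prompt format.
history = []  # list of (user_msg, assistant_reply) pairs kept by the caller


def build_prompt(history, user_msg):
    """Concatenate all previous turns with the current message into one prompt."""
    parts = []
    for user, assistant in history:
        parts.append(f'User: {user}\nAssistant: {assistant}\n')
    parts.append(f'User: {user_msg}\nAssistant:')
    return ''.join(parts)


# Each turn re-sends the whole conversation, so the engine recomputes
# attention over the full history instead of reusing a cached session.
user_msg = 'What is the capital of France?'
prompt = build_prompt(history, user_msg)
# input_ids = tokenizer.encode(prompt)
# ... generator.stream_infer(..., sequence_start=True, sequence_end=True)
# history.append((user_msg, decoded_reply))  # update for the next turn
```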
- As in the title.