[REQUEST] Imporve compatibility with reasoning template And/Or Implement API response field "reasoning_content" like R1 API/llama.cpp

### Problem

The chat templates of R1/QwQ auto append a <think> tag in front of generation to force thinking but this breaks some UI, eg. https://github.com/ggml-org/llama.cpp/issues/11861

### Solution

Detect the thinking template and return "<think>" in advance of generated content to client.
For reference:
https://github.com/ggml-org/llama.cpp/pull/11607
Solve #294 #299 too.

### Alternatives

Hmm.. None?

### Acknowledgements

- [x] I have looked for similar requests before submitting this one.
- [x] I understand that the developers have lives and my issue will be answered when possible.
- [x] I understand the developers of this program are human, and I will make my requests politely.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

[REQUEST] Imporve compatibility with reasoning template And/Or Implement API response field "reasoning_content" like R1 API/llama.cpp #309

Problem

Solution

Alternatives

Acknowledgements

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

[REQUEST] Imporve compatibility with reasoning template And/Or Implement API response field "reasoning_content" like R1 API/llama.cpp #309

Description

Problem

Solution

Alternatives

Acknowledgements

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions