Structured outputs via llguidance#302
Open
pminervini wants to merge 12 commits into
Open
Conversation
Ensure ds4, ds4-server, ds4-bench, ds4-eval, ds4-agent, and cpu targets depend on libllguidance.a when LLGUIDANCE=1, so that `cargo build` runs before linking. Previously only ds4-server triggered the build via ds4_llguidance.o, causing other binaries to fail linking against a nonexistent library.
Introduce a distclean target that runs clean and then removes the .deps directory (cloned llguidance source + Rust build artifacts). This avoids forcing a re-clone on every `make clean` while giving users an explicit way to fully reset when needed.
Expose regex, Lark, and llguidance structured-output formats through the existing Chat Completions and Responses structured-output surfaces, reusing the current llguidance constrained decoder.
Bring the branch up to date with antirez/ds4 main while preserving llguidance structured-output support and the updated build wiring.
Author
|
Quick update: structured outputs no longer disable thinking: ds4 now lets the model finish and then applies the llguidance constraint to the final answer. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This should address #210.
Here is another stab at structured outputs, but instead of the custom JSON-schema decoder from #247, this uses llguidance behind an optional make LLGUIDANCE=1 build. So the implementation is closer to how llama.cpp does it, and it supports the OpenAI JSON modes plus regex, Lark, and raw llguidance grammars for both chat completions and Responses.
Compared to #247, this should be less ds4-specific code to maintain and it covers more formats. Current caveat: structured outputs disable thinking for that turn (looking into this) and are not combined with tools.
I tested the llguidance build, ds4_test, ds4-server, and a local structured-output stress-sweep against ds4-server. Poking @fry69 and @nhwaani since they were interested.