-
Notifications
You must be signed in to change notification settings - Fork 680
fix: pass custom_instructions to session summaries #775
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
Oxygen56
wants to merge
10
commits into
plastic-labs:main
Choose a base branch
from
Oxygen56:fix/issue-748-summarizer-custom-instructions
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+171
−18
Open
Changes from 4 commits
Commits
Show all changes
10 commits
Select commit
Hold shift + click to select a range
f76591a
fix: pass custom_instructions to session summaries
Oxygen56 e961335
docs: add missing docstrings to achieve 80% coverage threshold
Oxygen56 6525a6b
fix: address review feedback — add custom_instructions to SummaryConf…
Oxygen56 b6178cc
fix: address review feedback — add custom_instructions to server conf…
Oxygen56 3cc787e
fix: forward custom_instructions in estimate_long; switch to lru_cache
Oxygen56 edc7daf
docs: clarify SummaryConfiguration uses deriver token cap
Oxygen56 055d8f1
style: wrap long function signatures to comply with 88-char line limit
Oxygen56 5e601ab
fix: preserve summary custom instruction clears
Oxygen56 e31a9ee
fix: allow message summary configuration overrides
Oxygen56 93e3f29
test: cover summary custom instruction clears
Oxygen56 File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -14,6 +14,8 @@ | |
| from src.config import ConfiguredModelSettings, settings | ||
| from src.crud.session import session_cache_key | ||
| from src.dependencies import tracked_db | ||
| # TODO: move _custom_instructions_section to shared utility | ||
| from src.deriver.prompts import _custom_instructions_section | ||
| from src.exceptions import ResourceNotFoundException | ||
| from src.llm import HonchoLLMCallResponse, honcho_llm_call | ||
| from src.llm.types import LLMTelemetryContext | ||
|
|
@@ -57,6 +59,7 @@ class Summary(TypedDict): | |
|
|
||
|
|
||
| def to_schema_summary(s: Summary) -> schemas.Summary: | ||
| """Convert a Summary TypedDict to a Pydantic Summary schema object.""" | ||
| return schemas.Summary( | ||
| content=s["content"], | ||
| message_id=s["message_id"], | ||
|
|
@@ -81,6 +84,7 @@ def to_schema_summary(s: Summary) -> schemas.Summary: | |
|
|
||
|
|
||
| def _get_summary_model_config() -> ConfiguredModelSettings: | ||
| """Return the configured model settings for summary generation.""" | ||
| return settings.SUMMARY.MODEL_CONFIG | ||
|
|
||
|
|
||
|
|
@@ -101,8 +105,10 @@ def short_summary_prompt( | |
| formatted_messages: str, | ||
| output_words: int, | ||
| previous_summary_text: str, | ||
| custom_instructions: str | None = None, | ||
| ) -> str: | ||
| """Generate the short summary prompt.""" | ||
| custom_instructions_section = _custom_instructions_section(custom_instructions) | ||
| return c(f""" | ||
| You are a system that summarizes parts of a conversation to create a concise and accurate summary. Focus on capturing: | ||
|
|
||
|
|
@@ -117,6 +123,7 @@ def short_summary_prompt( | |
|
|
||
| Return only the summary without any explanation or meta-commentary. | ||
|
|
||
| {custom_instructions_section} | ||
| <previous_summary> | ||
| {previous_summary_text} | ||
| </previous_summary> | ||
|
|
@@ -133,8 +140,10 @@ def long_summary_prompt( | |
| formatted_messages: str, | ||
| output_words: int, | ||
| previous_summary_text: str, | ||
| custom_instructions: str | None = None, | ||
| ) -> str: | ||
| """Generate the long summary prompt.""" | ||
| custom_instructions_section = _custom_instructions_section(custom_instructions) | ||
| return c(f""" | ||
| You are a system that creates thorough, comprehensive summaries of conversations. Focus on capturing: | ||
|
|
||
|
|
@@ -151,6 +160,7 @@ def long_summary_prompt( | |
|
|
||
| Return only the summary without any explanation or meta-commentary. | ||
|
|
||
| {custom_instructions_section} | ||
| <previous_summary> | ||
| {previous_summary_text} | ||
| </previous_summary> | ||
|
|
@@ -164,45 +174,60 @@ def long_summary_prompt( | |
|
|
||
|
|
||
| @cache | ||
| def estimate_short_summary_prompt_tokens() -> int: | ||
| """Estimate tokens for the short summary prompt (without messages/previous_summary).""" | ||
| def estimate_short_summary_prompt_tokens(custom_instructions: str | None = None) -> int: | ||
|
coderabbitai[bot] marked this conversation as resolved.
Outdated
|
||
| """Estimate tokens for the short summary prompt, optionally including custom instructions.""" | ||
| try: | ||
| return estimate_tokens( | ||
| short_summary_prompt( | ||
| formatted_messages="", | ||
| output_words=0, | ||
| previous_summary_text="", | ||
| custom_instructions=custom_instructions, | ||
| ) | ||
| ) | ||
| except Exception: | ||
| # Return a rough estimate if estimation fails | ||
| return 200 | ||
|
|
||
|
|
||
| @cache | ||
| def estimate_long_summary_prompt_tokens() -> int: | ||
| """Estimate tokens for the long summary prompt (without messages/previous_summary).""" | ||
| def estimate_long_summary_prompt_tokens(custom_instructions: str | None = None) -> int: | ||
| """Estimate tokens for the long summary prompt, optionally including custom instructions.""" | ||
| try: | ||
| return estimate_tokens( | ||
| long_summary_prompt( | ||
| formatted_messages="", | ||
| output_words=0, | ||
| previous_summary_text="", | ||
| custom_instructions=None, | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. should be |
||
| ) | ||
| ) | ||
| except Exception: | ||
| # Return a rough estimate if estimation fails | ||
| return 200 | ||
|
|
||
|
|
||
|
|
||
| @conditional_observe(name="Create Short Summary") | ||
| async def create_short_summary( | ||
| formatted_messages: str, | ||
| input_tokens: int, | ||
| previous_summary: str | None = None, | ||
| custom_instructions: str | None = None, | ||
| *, | ||
| workspace_name: str | None = None, | ||
| ) -> HonchoLLMCallResponse[str]: | ||
| """ | ||
| Generate a short summary via an LLM call. | ||
|
|
||
| Args: | ||
| formatted_messages: Pre-formatted conversation messages. | ||
| input_tokens: Token count of the input (messages + previous summary). | ||
| previous_summary: Previous summary text for continuity, if any. | ||
| custom_instructions: Optional custom instructions from configuration. | ||
| workspace_name: Workspace name for telemetry attribution. | ||
|
|
||
| Returns: | ||
| The LLM response containing the short summary text and token counts. | ||
| """ | ||
| # input_tokens indicates how many tokens the message list + previous summary take up | ||
| # we want to optimize short summaries to be smaller than the actual content being summarized | ||
| # so we ask the agent to produce a word count roughly equal to either the input, or the max | ||
|
|
@@ -216,7 +241,10 @@ async def create_short_summary( | |
| previous_summary_text = "There is no previous summary -- the messages are the beginning of the conversation." | ||
|
|
||
| prompt = short_summary_prompt( | ||
| formatted_messages, output_words, previous_summary_text | ||
| formatted_messages, | ||
| output_words, | ||
| previous_summary_text, | ||
| custom_instructions=custom_instructions, | ||
| ) | ||
|
|
||
| return await honcho_llm_call( | ||
|
|
@@ -235,9 +263,22 @@ async def create_short_summary( | |
| async def create_long_summary( | ||
| formatted_messages: str, | ||
| previous_summary: str | None = None, | ||
| custom_instructions: str | None = None, | ||
| *, | ||
| workspace_name: str | None = None, | ||
| ) -> HonchoLLMCallResponse[str]: | ||
| """ | ||
| Generate a comprehensive long summary via an LLM call. | ||
|
|
||
| Args: | ||
| formatted_messages: Pre-formatted conversation messages. | ||
| previous_summary: Previous summary text for continuity, if any. | ||
| custom_instructions: Optional custom instructions from configuration. | ||
| workspace_name: Workspace name for telemetry attribution. | ||
|
|
||
| Returns: | ||
| The LLM response containing the long summary text and token counts. | ||
| """ | ||
| # the word/token ratio is roughly 4:3 so we multiply by 0.75. | ||
| # LLMs *seem* to respond better to getting asked for a word count but should workshop this. | ||
| output_words = int(settings.SUMMARY.MAX_TOKENS_LONG * 0.75) | ||
|
|
@@ -248,7 +289,10 @@ async def create_long_summary( | |
| previous_summary_text = "There is no previous summary -- the messages are the beginning of the conversation." | ||
|
|
||
| prompt = long_summary_prompt( | ||
| formatted_messages, output_words, previous_summary_text | ||
| formatted_messages, | ||
| output_words, | ||
| previous_summary_text, | ||
| custom_instructions=custom_instructions, | ||
| ) | ||
|
|
||
| return await honcho_llm_call( | ||
|
|
@@ -439,6 +483,13 @@ async def _create_and_save_summary( | |
| previous_summary_tokens = latest_summary["token_count"] if latest_summary else 0 | ||
| input_tokens = messages_tokens + previous_summary_tokens | ||
|
|
||
| # Extract custom_instructions from the summarizer's own configuration. | ||
| # This is separate from reasoning custom_instructions — workspace | ||
| # operators may want summaries in a different style than deriver output. | ||
| custom_instructions: str | None = None | ||
| if configuration.summary and configuration.summary.custom_instructions is not None: | ||
| custom_instructions = configuration.summary.custom_instructions | ||
|
|
||
| ( | ||
| new_summary, | ||
| is_fallback, | ||
|
|
@@ -453,16 +504,21 @@ async def _create_and_save_summary( | |
| last_message_id=last_message_id, | ||
| last_message_content_preview=last_message_content_preview, | ||
| message_count=message_count, | ||
| custom_instructions=custom_instructions, | ||
| workspace_name=workspace_name, | ||
| ) | ||
|
|
||
| # Compute scaffold tokens up front (cheap + idempotent) so both the | ||
| # save-summary path and the telemetry emit below can use it | ||
| # without basedpyright tripping on a possibly-unbound name. | ||
| if summary_type == SummaryType.SHORT: | ||
| prompt_tokens = estimate_short_summary_prompt_tokens() | ||
| prompt_tokens = estimate_short_summary_prompt_tokens( | ||
| custom_instructions | ||
| ) | ||
| else: | ||
| prompt_tokens = estimate_long_summary_prompt_tokens() | ||
| prompt_tokens = estimate_long_summary_prompt_tokens( | ||
| custom_instructions | ||
| ) | ||
|
|
||
| # Step 3: Save to database with new transaction | ||
| if not is_fallback: | ||
|
|
@@ -552,6 +608,7 @@ async def _create_summary( | |
| last_message_id: int, | ||
| last_message_content_preview: str, | ||
| message_count: int, | ||
| custom_instructions: str | None = None, | ||
| *, | ||
| workspace_name: str | None = None, | ||
| ) -> tuple[Summary, bool, int, int]: | ||
|
|
@@ -567,6 +624,8 @@ async def _create_summary( | |
| last_message_id: ID of the last message | ||
| last_message_content_preview: Preview of last message content for fallback | ||
| message_count: Number of messages for fallback | ||
| custom_instructions: Optional workspace-level custom instructions for prompt | ||
| workspace_name: Optional workspace name for telemetry | ||
|
|
||
| Returns: | ||
| A tuple of (Summary, is_fallback, llm_input_tokens, llm_output_tokens) | ||
|
|
@@ -585,12 +644,14 @@ async def _create_summary( | |
| formatted_messages, | ||
| input_tokens, | ||
| previous_summary_text, | ||
| custom_instructions=custom_instructions, | ||
| workspace_name=workspace_name, | ||
| ) | ||
| else: | ||
| response = await create_long_summary( | ||
| formatted_messages, | ||
| previous_summary_text, | ||
| custom_instructions=custom_instructions, | ||
| workspace_name=workspace_name, | ||
| ) | ||
|
|
||
|
|
||
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is validating against the deriver's generic
MAX_CUSTOM_INSTRUCTIONS_TOKENScap, not a summarizer-specific one. i think that's fine, but make that clear in the description