Fix #839 and #843: add explicit timeout to AsyncOpenAI and emit struc…#853
Fix #839 and #843: add explicit timeout to AsyncOpenAI and emit struc…#853gmrnlg1971 wants to merge 1 commit into
Conversation
…yncOpenAI and emit structured logs on unparseable LLM output
WalkthroughStructured-output failures now raise ChangesStructured output failure handling and OpenAI timeouts
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/deriver/deriver.py`:
- Around line 167-178: The error logging in deriver.py is too permissive because
Deriver.parse failure currently includes raw_response_excerpt, which may expose
user or inferred PII. Update the logger.error call in the deriver parsing
failure path to avoid emitting raw LLM text by default; instead log
parse_failure_reason plus safe metadata such as excerpt length and a stable
hash/fingerprint of the response. Keep any raw excerpt behind an explicit
diagnostic or PII-approved flag, and ensure the change is applied where e.reason
and e.raw_content are handled.
In `@src/llm/structured_output.py`:
- Around line 13-16: Move the StructuredOutputError custom exception out of
structured_output.py into src/exceptions.py so it becomes the canonical
exception definition, then import and use that class from
src/llm/structured_output.py and any other callers/handlers that reference it.
Keep the expanded __init__ contract (message, raw_content, reason) on the moved
exception class, and update all references to point at the centralized exception
module rather than defining the type locally.
- Around line 50-56: The new raise statements in the structured output error
handling exceed the Black 88-character limit. Reformat the two
StructuredOutputError raises in the exception handlers inside the JSON
repair/validation flow so they wrap cleanly within the line-length rule while
preserving the same message, raw_content, and reason arguments; use the existing
symbols StructuredOutputError, ValidationError, and the final/response_model
validation path to locate the statements.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 25a16b6a-661c-414b-8d4c-935cca4bb767
📒 Files selected for processing (5)
src/deriver/deriver.pysrc/embedding_client.pysrc/llm/backends/openai.pysrc/llm/registry.pysrc/llm/structured_output.py
| logger.error( | ||
| "Deriver parse failure", | ||
| extra={ | ||
| "work_unit_id": f"{observed}_{latest_message.session_name}", | ||
| "peer_id": observed, | ||
| "session_id": latest_message.session_name, | ||
| "transport": model_config.transport, | ||
| "model": model_config.model, | ||
| "structured_output_mode": model_config.structured_output_mode, | ||
| "parse_failure_reason": e.reason, | ||
| "raw_response_excerpt": e.raw_content[:500] if e.raw_content else "", | ||
| } |
There was a problem hiding this comment.
🔒 Security & Privacy | 🟠 Major | ⚡ Quick win
Avoid logging raw LLM output by default.
raw_response_excerpt can contain user message content or inferred PII from the deriver prompt. Prefer logging length/hash/reason by default, and only include raw excerpts behind an explicit diagnostic/PII-approved setting.
Proposed safer logging shape
"structured_output_mode": model_config.structured_output_mode,
"parse_failure_reason": e.reason,
- "raw_response_excerpt": e.raw_content[:500] if e.raw_content else "",
+ "raw_response_length": len(e.raw_content or ""),
+ "raw_response_hash": (
+ hashlib.sha256(e.raw_content.encode()).hexdigest()
+ if e.raw_content
+ else ""
+ ),This also requires adding:
+import hashlib📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| logger.error( | |
| "Deriver parse failure", | |
| extra={ | |
| "work_unit_id": f"{observed}_{latest_message.session_name}", | |
| "peer_id": observed, | |
| "session_id": latest_message.session_name, | |
| "transport": model_config.transport, | |
| "model": model_config.model, | |
| "structured_output_mode": model_config.structured_output_mode, | |
| "parse_failure_reason": e.reason, | |
| "raw_response_excerpt": e.raw_content[:500] if e.raw_content else "", | |
| } | |
| logger.error( | |
| "Deriver parse failure", | |
| extra={ | |
| "work_unit_id": f"{observed}_{latest_message.session_name}", | |
| "peer_id": observed, | |
| "session_id": latest_message.session_name, | |
| "transport": model_config.transport, | |
| "model": model_config.model, | |
| "structured_output_mode": model_config.structured_output_mode, | |
| "parse_failure_reason": e.reason, | |
| "raw_response_length": len(e.raw_content or ""), | |
| "raw_response_hash": ( | |
| hashlib.sha256(e.raw_content.encode()).hexdigest() | |
| if e.raw_content | |
| else "" | |
| ), | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/deriver/deriver.py` around lines 167 - 178, The error logging in
deriver.py is too permissive because Deriver.parse failure currently includes
raw_response_excerpt, which may expose user or inferred PII. Update the
logger.error call in the deriver parsing failure path to avoid emitting raw LLM
text by default; instead log parse_failure_reason plus safe metadata such as
excerpt length and a stable hash/fingerprint of the response. Keep any raw
excerpt behind an explicit diagnostic or PII-approved flag, and ensure the
change is applied where e.reason and e.raw_content are handled.
| def __init__(self, message: str, raw_content: str = "", reason: str = ""): | ||
| super().__init__(message) | ||
| self.raw_content = raw_content | ||
| self.reason = reason |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win
Keep custom exceptions in src/exceptions.py.
This change expands the StructuredOutputError contract, but the custom exception still lives in src/llm/structured_output.py. Move the exception type to src/exceptions.py and import it from there so downstream handlers use the canonical exception module. As per coding guidelines, "Define custom exception types in src/exceptions.py and use them throughout the codebase".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/llm/structured_output.py` around lines 13 - 16, Move the
StructuredOutputError custom exception out of structured_output.py into
src/exceptions.py so it becomes the canonical exception definition, then import
and use that class from src/llm/structured_output.py and any other
callers/handlers that reference it. Keep the expanded __init__ contract
(message, raw_content, reason) on the moved exception class, and update all
references to point at the centralized exception module rather than defining the
type locally.
Source: Coding guidelines
| except (json.JSONDecodeError, KeyError, TypeError, ValueError) as e: | ||
| raise StructuredOutputError(f"Failed to repair JSON: {e}", raw_content=raw_content, reason="empty_after_repair") from e | ||
|
|
||
| try: | ||
| return response_model.model_validate_json(final) | ||
| except ValidationError: | ||
| if response_model is PromptRepresentation: | ||
| return PromptRepresentation(explicit=[]) | ||
| raise | ||
| except ValidationError as e: | ||
| raise StructuredOutputError(f"Validation failed after repair: {e}", raw_content=raw_content, reason="validation_error") from e |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
Wrap the new raise statements to satisfy the 88-character limit.
Lines 51 and 56 exceed the project’s Black-compatible line length.
Proposed formatting fix
except (json.JSONDecodeError, KeyError, TypeError, ValueError) as e:
- raise StructuredOutputError(f"Failed to repair JSON: {e}", raw_content=raw_content, reason="empty_after_repair") from e
+ raise StructuredOutputError(
+ f"Failed to repair JSON: {e}",
+ raw_content=raw_content,
+ reason="empty_after_repair",
+ ) from e
try:
return response_model.model_validate_json(final)
except ValidationError as e:
- raise StructuredOutputError(f"Validation failed after repair: {e}", raw_content=raw_content, reason="validation_error") from e
+ raise StructuredOutputError(
+ f"Validation failed after repair: {e}",
+ raw_content=raw_content,
+ reason="validation_error",
+ ) from eAs per coding guidelines, "Enforce 88 character line length (Black compatible) in Python code".
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| except (json.JSONDecodeError, KeyError, TypeError, ValueError) as e: | |
| raise StructuredOutputError(f"Failed to repair JSON: {e}", raw_content=raw_content, reason="empty_after_repair") from e | |
| try: | |
| return response_model.model_validate_json(final) | |
| except ValidationError: | |
| if response_model is PromptRepresentation: | |
| return PromptRepresentation(explicit=[]) | |
| raise | |
| except ValidationError as e: | |
| raise StructuredOutputError(f"Validation failed after repair: {e}", raw_content=raw_content, reason="validation_error") from e | |
| except (json.JSONDecodeError, KeyError, TypeError, ValueError) as e: | |
| raise StructuredOutputError( | |
| f"Failed to repair JSON: {e}", | |
| raw_content=raw_content, | |
| reason="empty_after_repair", | |
| ) from e | |
| try: | |
| return response_model.model_validate_json(final) | |
| except ValidationError as e: | |
| raise StructuredOutputError( | |
| f"Validation failed after repair: {e}", | |
| raw_content=raw_content, | |
| reason="validation_error", | |
| ) from e |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/llm/structured_output.py` around lines 50 - 56, The new raise statements
in the structured output error handling exceed the Black 88-character limit.
Reformat the two StructuredOutputError raises in the exception handlers inside
the JSON repair/validation flow so they wrap cleanly within the line-length rule
while preserving the same message, raw_content, and reason arguments; use the
existing symbols StructuredOutputError, ValidationError, and the
final/response_model validation path to locate the statements.
Source: Coding guidelines
…tured logs on unparseable LLM output
Summary by CodeRabbit