Skip to content

fix(client): Harden upload validation and conversion flow#989

Open
Gujiassh wants to merge 6 commits intobytedance:mainfrom
Gujiassh:fix/upload-conversion-guardrails
Open

fix(client): Harden upload validation and conversion flow#989
Gujiassh wants to merge 6 commits intobytedance:mainfrom
Gujiassh:fix/upload-conversion-guardrails

Conversation

@Gujiassh
Copy link

@Gujiassh Gujiassh commented Mar 6, 2026

Summary

  • reject directory paths before upload copying so the request stays all-or-nothing
  • reuse a single conversion worker when uploads are converted inside an active event loop
  • document the updated upload behavior in the backend README and CLAUDE guide

Testing

  • uv run pytest tests/test_client.py -q

@CLAassistant
Copy link

CLAassistant commented Mar 6, 2026

CLA assistant check
All committers have signed the CLA.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the embedded DeerFlowClient upload flow to better match Gateway expectations by validating inputs up-front and improving the document-conversion execution strategy when invoked under an active asyncio event loop, with corresponding documentation updates.

Changes:

  • Reject non-file inputs (e.g., directory paths) before copying any uploads to keep uploads all-or-nothing.
  • Reuse a single conversion worker (thread) per request when conversions must run from within an active event loop.
  • Update backend docs (README + CLAUDE guide) and extend client tests for the new upload validation behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
backend/tests/test_client.py Adds coverage for directory-path rejection in upload_files and reformats several test blocks.
backend/src/client.py Adds upfront is_file() validation, reworks conversion execution to reuse a single executor under a running event loop, and switches list_uploads to os.scandir.
backend/README.md Documents that uploads reject directory paths.
backend/CLAUDE.md Documents new upload validation + conversion worker reuse behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 731 to +734
if not p.exists():
raise FileNotFoundError(f"File not found: {f}")
if not p.is_file():
raise ValueError(f"Path is not a file: {f}")
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

upload_files() now raises ValueError when an input path exists but is not a regular file (e.g., a directory). The docstring’s Raises: section only documents FileNotFoundError; please document the ValueError case as well (and any other newly introduced exceptions) so callers know what to expect.

Copilot uses AI. Check for mistakes.
Comment on lines +742 to +753
conversion_pool = None
if has_convertible_file:
try:
asyncio.get_running_loop()
except RuntimeError:
conversion_pool = None
else:
import concurrent.futures

dest = uploads_dir / src_path.name
shutil.copy2(src_path, dest)
# Reuse one worker when already inside an event loop to avoid
# creating a new ThreadPoolExecutor per converted file.
conversion_pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new branch that creates/reuses a single ThreadPoolExecutor when upload_files() is called from inside an active event loop is not covered by tests. Please add an async test (e.g., pytest.mark.asyncio) that calls upload_files() with multiple convertible files while an event loop is running and asserts only one executor is created (and conversion still succeeds/fails gracefully).

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants