Skip to content

fix: normalize WebSocket URI path to prevent /task/service duplicationFix/websocket uri 755#778

Open
koushik1359 wants to merge 1 commit intorocketride-org:developfrom
koushik1359:fix/websocket-uri-755
Open

fix: normalize WebSocket URI path to prevent /task/service duplicationFix/websocket uri 755#778
koushik1359 wants to merge 1 commit intorocketride-org:developfrom
koushik1359:fix/websocket-uri-755

Conversation

@koushik1359
Copy link
Copy Markdown

@koushik1359 koushik1359 commented May 7, 2026

Problem

_get_websocket_uri built the final URL via string concatenation:

return f'{ws_uri.geturl()}/task/service'

This produced malformed URLs in two cases:

  1. If the caller passed a URI that already contained /task/service
    (e.g. after reconnect), the suffix was appended again, yielding
    .../task/service/task/service.
  2. A trailing slash on the input produced a double slash:
    ws://localhost:5565//task/service.

Fix

Strip the trailing slash, guard against an existing /task/service
suffix, and construct the final URL with urllib.parse.urlunparse
instead of string formatting:

path = parsed.path.rstrip('/')
if not path.endswith('/task/service'):
    path = path + '/task/service'
return urllib.parse.urlunparse((ws_scheme, parsed.netloc, path, '', '', ''))

Tests

Added two parametrized cases to the existing
test_get_websocket_uri_normalization suite covering both failure modes.

Resolves #755

Summary by CodeRabbit

  • Bug Fixes
    • Improved WebSocket connection URL handling to properly normalize paths and prevent duplicate segments or malformed URLs when base URLs contain trailing slashes or existing path segments.

@github-actions github-actions Bot added module:client-python Python SDK and MCP client module:ai AI/ML modules labels May 7, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 7, 2026

Review Change Stack
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 44364460-b025-461d-afda-ca197e7c17b2

📥 Commits

Reviewing files that changed from the base of the PR and between aebed80 and 5f6dec4.

📒 Files selected for processing (2)
  • packages/client-python/src/rocketride/mixins/connection.py
  • packages/client-python/tests/RocketRideClient_test.py

📝 Walkthrough

Walkthrough

The PR refactors WebSocket URI construction in the Python client to use urllib.parse.urlunparse() with proper path normalization. The changes ensure that /task/service is appended only once and trailing slashes do not create malformed URLs. Updated tests validate these normalization edge cases.

Changes

WebSocket URI Normalization

Layer / File(s) Summary
URI Construction
packages/client-python/src/rocketride/mixins/connection.py
_get_websocket_uri() refactored to parse input URI, determine ws/wss scheme, strip trailing slashes from path, conditionally append /task/service, and construct final address via urllib.parse.urlunparse().
Normalization Tests
packages/client-python/tests/RocketRideClient_test.py
test_get_websocket_uri_normalization parameterized cases extended to verify /task/service is not duplicated when already present and trailing slashes do not produce double slashes.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related issues

  • #737: The changes to ConnectionMixin._get_websocket_uri directly address the same URI-normalization bug (trailing-slash and duplicate /task/service handling) reported in this issue.

Possibly related PRs

  • rocketride-org/rocketride-server#326: Both PRs modify ConnectionMixin._get_websocket_uri in packages/client-python/src/rocketride/mixins/connection.py—this one normalizes and constructs the path (/task/service) while PR #326 adjusts scheme mapping (wss preservation).

Suggested reviewers

  • jmaionchi
  • Rod-Christensen
  • stepmikhaylov

Poem

🐰 A path must be clean, no doubles allowed,
urlunparse speaks where old schemes bowed,
With /task/service placed just right,
And trailing slashes trimmed from sight,
WebSocket URIs now shine bright! 🌟

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly summarizes the main change: normalizing WebSocket URI path to prevent /task/service duplication, which directly reflects the core fix implemented in the changeset.
Linked Issues check ✅ Passed The PR implementation fully addresses issue #755 objectives: strips trailing slashes, prevents /task/service duplication, handles input URL variations correctly, and adds parametrized test cases covering both failure modes (already-present suffix and trailing-slash scenarios).
Out of Scope Changes check ✅ Passed All changes are scoped to the linked issue #755: modifications to _get_websocket_uri() method and test updates are directly related to WebSocket URI normalization without introducing unrelated functionality.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 7, 2026

No description provided.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/ai/src/ai/common/store.py (1)

157-157: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Pre-existing critical bug: score update replaces Doc with a plain number.

Line 157 writes tableDocs[tableKey] = doc.score, which overwrites the stored Doc object with a float. Any subsequent access — including line 189's tableDocs[tableKey].page_content += ... — will raise AttributeError: 'float' object has no attribute 'page_content'. The intent is clearly to update the score field on the existing entry.

🐛 Proposed fix
-            if doc.score > tableDocs[tableKey].score:
-                tableDocs[tableKey] = doc.score
+            if doc.score > tableDocs[tableKey].score:
+                tableDocs[tableKey].score = doc.score
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/ai/src/ai/common/store.py` at line 157, The bug is that
tableDocs[tableKey] = doc.score replaces the stored Doc with a float; instead,
update the score attribute on the stored Doc object (e.g.,
tableDocs[tableKey].score = doc.score) and only replace the whole entry if no
Doc exists yet (i.e., if tableDocs.get(tableKey) is None then set
tableDocs[tableKey] = doc). Update the logic around tableDocs, tableKey, and doc
so later accesses like tableDocs[tableKey].page_content remain valid.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/ai/src/ai/common/store.py`:
- Around line 184-189: The code clears chunk.page_content before appending, but
when tableDocs[tableKey] is set to chunk they reference the same object so the
original text is lost; fix by capturing the chunk's original content into a
local variable (e.g., original = chunk.page_content) before clearing or
assigning into tableDocs, then use that captured original when doing
tableDocs[tableKey].page_content += original; ensure references to tableDocs,
chunk, tableKey and page_content are used so the initialization branch preserves
the chunk text instead of appending an already-cleared string.

---

Outside diff comments:
In `@packages/ai/src/ai/common/store.py`:
- Line 157: The bug is that tableDocs[tableKey] = doc.score replaces the stored
Doc with a float; instead, update the score attribute on the stored Doc object
(e.g., tableDocs[tableKey].score = doc.score) and only replace the whole entry
if no Doc exists yet (i.e., if tableDocs.get(tableKey) is None then set
tableDocs[tableKey] = doc). Update the logic around tableDocs, tableKey, and doc
so later accesses like tableDocs[tableKey].page_content remain valid.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 324aba12-d814-4fbf-ad50-ab690b2efb57

📥 Commits

Reviewing files that changed from the base of the PR and between d7b29aa and aebed80.

📒 Files selected for processing (3)
  • packages/ai/src/ai/common/store.py
  • packages/client-python/src/rocketride/mixins/connection.py
  • packages/client-python/tests/RocketRideClient_test.py

Comment on lines 184 to 189
if tableKey not in tableDocs:
# Add it to the list
# TODO: Fix this
tableDocs[tableKey] = Doc(objectId=objectId, chunk=doc.metadata.chunkId, score=chunk.score)
chunk.page_content = ''
tableDocs[tableKey] = chunk

# Append the text
tableDocs[tableKey].page_content += chunk.page_content
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Self-reference causes first chunk's content to be silently dropped.

After lines 185–186, tableDocs[tableKey] and chunk are the same object. Line 189 then evaluates as:

chunk.page_content += chunk.page_content   # '' += '' → ''

Because chunk.page_content was just cleared on line 185, both sides of += are '' and the original text of this chunk is lost. The correct pattern is to either append before clearing, use a continue to skip the unconditional +=, or restructure so the accumulation handles the initialisation case separately:

🐛 Proposed fix
             if tableKey not in tableDocs:
-                chunk.page_content = ''
                 tableDocs[tableKey] = chunk
+                tableDocs[tableKey].page_content = ''

             # Append the text
             tableDocs[tableKey].page_content += chunk.page_content

Moving the clear onto tableDocs[tableKey] after the assignment is the same object, but by separating the steps the intent is unambiguous. Alternatively, capture the content first:

             if tableKey not in tableDocs:
+                original = chunk.page_content
                 chunk.page_content = ''
                 tableDocs[tableKey] = chunk
+                tableDocs[tableKey].page_content += original
+                continue

             # Append the text
             tableDocs[tableKey].page_content += chunk.page_content

Note: Because Phase 1 inserts every (objectId, tableId) present in documents into tableDocs before Phase 2 runs, this if tableKey not in tableDocs branch is currently unreachable in normal flow. The bug is latent, but the defensive guard should still be correct.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/ai/src/ai/common/store.py` around lines 184 - 189, The code clears
chunk.page_content before appending, but when tableDocs[tableKey] is set to
chunk they reference the same object so the original text is lost; fix by
capturing the chunk's original content into a local variable (e.g., original =
chunk.page_content) before clearing or assigning into tableDocs, then use that
captured original when doing tableDocs[tableKey].page_content += original;
ensure references to tableDocs, chunk, tableKey and page_content are used so the
initialization branch preserves the chunk text instead of appending an
already-cleared string.

String concatenation in _get_websocket_uri produced malformed URLs in
two cases:
- If the URI already contained /task/service, it was appended again
- A trailing slash produced a double slash before /task/service

Fix: strip trailing slash, check for existing /task/service suffix,
then build the final URL with urlunparse instead of string formatting.
Added two parametrized test cases covering both failure modes.

Resolves rocketride-org#755
@koushik1359 koushik1359 force-pushed the fix/websocket-uri-755 branch from aebed80 to 5f6dec4 Compare May 7, 2026 12:13
@github-actions github-actions Bot removed the module:ai AI/ML modules label May 7, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

module:client-python Python SDK and MCP client

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant