Skip to content

fix: cherry-pick PR #334 ws lifecycle bridge#1

Open
bhintrapreneur wants to merge 6 commits intomainfrom
mm/p0-a1-pr334-bridge
Open

fix: cherry-pick PR #334 ws lifecycle bridge#1
bhintrapreneur wants to merge 6 commits intomainfrom
mm/p0-a1-pr334-bridge

Conversation

@bhintrapreneur
Copy link
Copy Markdown
Owner

@bhintrapreneur bhintrapreneur commented May 7, 2026

Summary

MonkeyMoney bridge context

This branch is a bounded P0-A1 leak-stop bridge. It is not the long-term Polymarket Rust client destination. Do not extend this V1 fork beyond the Gate A scope.

Gate A candidate:

Verification

Local:

  • cargo +nightly-2025-11-24 fmt --all -- --check
  • cargo +1.88 clippy --all-targets --all-features -- -D warnings
  • cargo +1.88 clippy --all-targets -- -D warnings
  • cargo +1.88 test --features "clob ws" close_tests -- --nocapture
  • cargo +1.88 test --all-features
  • cargo +1.88 test
  • cargo +1.88 build --all-targets --all-features
  • git diff --check

Fork CI:

  • cargo-sort: pass
  • fmt-lint: pass
  • build-test (ubuntu) with coverage: pass
  • build-test (macos): pass
  • build-test (windows): pass
  • PR title validation: pass

Gate A review notes

skyc1e and others added 4 commits May 7, 2026 22:47
`SubscriptionManager::start_reconnection_handler` (both
`clob::ws::subscription` and `rtds::subscription`) spawns a detached
`tokio` task that holds a strong `Arc<SubscriptionManager>` clone. The
task also owns a cloned `ConnectionManager`, whose `watch::Sender` is
what the task awaits on for state changes. Because the `Sender` only
closes when every clone of it drops, and the strong `Arc` clone inside
the spawned task prevents the owning manager (and therefore that
`ConnectionManager` clone) from ever dropping, the task can never exit
on its own — a reference cycle that leaked the entire channel (task,
WebSocket, subscription manager) for the lifetime of the process every
time a user dropped a `WsClient`.

`start_reconnection_handler` now returns its `JoinHandle<()>`. A new
`ws::task::AbortOnDrop` wrapper (thin `JoinHandle` + `Drop` that calls
`abort()`) is stored next to the `Arc<SubscriptionManager>` inside
`clob::ws::ChannelResources` and `rtds::ClientInner`. When the owning
client drops, the wrapper drops, the task is aborted, its stack locals
(including the strong `Arc` clone) are released, and the whole graph
can drop normally.

Public API is unchanged. A regression test in `clob::ws::subscription`
constructs a `SubscriptionManager`, drops it together with the
`AbortOnDrop`, and asserts the `Weak` downgrade cannot upgrade — i.e.
the strong count reached zero. Unit tests for `AbortOnDrop` cover the
cancel-pending and already-finished paths.

Closes Polymarket#325
Codecov flagged the reconnect-handle plumbing on the RTDS side as
uncovered — `Client::new`, `Client::authenticate`,
`Client::deauthenticate`, and `SubscriptionManager::start_reconnection_handler`
each had zero patch coverage, so the regression fence for issue Polymarket#325
only caught the clob::ws path.

Add focused regression tests without widening the public surface:

* `rtds::subscription::reconnect_handler_tests` mirrors the clob::ws
  test: constructs an `Arc<SubscriptionManager>`, calls
  `start_reconnection_handler`, wraps in `AbortOnDrop`, drops, and
  asserts the `Weak` downgrade cannot upgrade. Proves the same Arc
  cycle is broken on the RTDS code path.
* `rtds::client::teardown_tests` covers the three `reconnect_handle`
  field sites in `Client::{new, authenticate, deauthenticate}` via
  `tokio::test`s that poll a `Weak<SubscriptionManager>` after drop.
  `authenticate` and `deauthenticate` tests also assert the manager
  is _still_ reachable mid-round-trip, ensuring the `AbortOnDrop` is
  forwarded into the new `ClientInner` rather than dropped early.
* A `#[cfg(test)]`-only `inner_subscriptions_for_test` accessor on
  `Client<S>` provides the weak-ref probe without exposing the
  private `ClientInner.subscriptions` field. Gated behind
  `#[expect(clippy::multiple_inherent_impl, ...)]` with a reason, per
  crate style.

These tests exercise the same `AbortOnDrop::new` + `Drop` paths the
existing `ws::task::tests` already cover, transitively raising
`ws/task.rs` coverage as well.
…olymarket#334)

cargo fmt --all produced no working-tree changes under the current stable toolchain. This commit records Phase 1.3 and triggers fork CI on a MonkeyMoney-controlled branch without modifying the cherry-picked upstream commits.
@bhintrapreneur
Copy link
Copy Markdown
Owner Author

Reopening to trigger fork CI after Actions were enabled for this new fork.

Move the test-only accessor before the teardown test module and remove the now-unfulfilled lint expectation. This is a CI hygiene patch only; it does not change runtime behavior or extend the V1 fork beyond the P0-A1 bridge scope.
Per revised plan §SDK Function Signatures: add Client::close(timeout) half-closed socket handling plus unsubscribe_last_trade_price and unsubscribe_best_bid_ask aliases. Uses a narrow internal close control message because the connection task owns the WebSocket stream; no Issue Polymarket#327 reconnect behavior or long-term V1 hardening is included.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9a7667a2dd

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/ws/connection.rs
match message {
Ok(Message::Close(_)) => return Ok(()),
Ok(_) => {}
Err(e) => return Err(Error::from(e)),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Treat normal close-handshake completion as success

close_stream converts every read-side tungstenite error into a hard failure, but a graceful WebSocket shutdown can surface as ConnectionClosed once both close frames are exchanged. In that common case, Client::close can return an error even though the peer completed the close handshake correctly, which makes shutdown reporting unreliable for healthy connections that close promptly.

Useful? React with 👍 / 👎.

Comment thread src/clob/ws/client.rs
Comment on lines +626 to +633
let channels: Vec<(
ChannelType,
ConnectionManager<WsMessage, Arc<InterestTracker>>,
)> = self
.channels
.iter()
.map(|entry| (*entry.key(), entry.value().connection.clone()))
.collect();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Prevent channels from being created during close

ClientInner::close takes a one-time snapshot of self.channels and only closes/removes those entries. Because subscription APIs can concurrently call get_or_create_channel, a new channel inserted after this snapshot is never visited, so close() may return success while leaving a live WebSocket channel running. This appears in multi-task usage where one task shuts down while another subscribes.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants