sv2-tp: reconnect to bitcoin core IPC after disconnect #101
Open
enirox001 wants to merge 5 commits into stratum-mining:master
Client proxy destruction can race with connection teardown. When a remote disconnect begins, a proxy may still hold a non-null connection pointer before synchronous cleanup callbacks have cleared client state. In that window, calling the generated destroy() method can raise an IPC disconnect exception during ordinary proxy destruction. Add an explicit connection teardown flag and skip best-effort remote destroy calls once disconnect handling has started.
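The guard described in this commit can be sketched roughly as follows. This is a minimal illustration, not the libmultiprocess code: `Connection`, `ProxyClient`, `disconnecting`, and `RemoteDestroy` are hypothetical stand-ins, and the real flag name and location may differ.

```cpp
#include <atomic>
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for an IPC connection (not the libmultiprocess type).
struct Connection {
    // Set as soon as disconnect handling starts, before cleanup callbacks run.
    std::atomic<bool> disconnecting{false};
    int remote_destroy_calls{0};

    // Simulates the generated remote destroy() call, which fails once the
    // peer is going away.
    void RemoteDestroy() {
        if (disconnecting.load()) {
            throw std::runtime_error("IPC call after disconnect");
        }
        ++remote_destroy_calls;
    }
};

// Hypothetical client proxy that owns a remote object.
class ProxyClient {
public:
    explicit ProxyClient(Connection* connection) : m_connection(connection) {
        assert(connection != nullptr);
    }

    ~ProxyClient() {
        // The pointer can still be non-null while teardown is in progress,
        // so a null check alone is not enough. Destructors are noexcept by
        // default, so a throwing remote call here would terminate the process.
        if (m_connection != nullptr && !m_connection->disconnecting.load()) {
            m_connection->RemoteDestroy(); // best-effort remote cleanup
        }
    }

private:
    Connection* m_connection;
};
```

The point of the sketch is that the flag flips at the start of teardown, which closes the window in which the pointer is still set but remote calls already fail.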
Add a reconnect loop around the initial Bitcoin Core IPC setup. When the IPC connection cannot be established, retry with exponential backoff instead of exiting immediately. This provides the basis for recovering sv2-tp after backend loss.
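A retry loop with exponential backoff might look like the sketch below. The names `NextBackoff` and `ConnectWithBackoff`, the delay cap, and the attempt limit are assumptions made for illustration; the commit does not state the actual delays or whether retries are bounded. The sleep function is injected so the loop can be exercised without waiting.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>
#include <vector>

using std::chrono::milliseconds;

// Double the delay, but never exceed the cap.
milliseconds NextBackoff(milliseconds current, milliseconds cap) {
    assert(current.count() > 0 && cap >= current);
    return std::min(current * 2, cap);
}

// Call try_connect until it succeeds or max_attempts is reached, waiting
// between failed attempts. Returns true once a connection is established.
bool ConnectWithBackoff(const std::function<bool()>& try_connect,
                        const std::function<void(milliseconds)>& sleep_for,
                        milliseconds initial, milliseconds cap,
                        int max_attempts) {
    milliseconds delay{initial};
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        if (try_connect()) return true;
        sleep_for(delay);                // wait before the next attempt
        delay = NextBackoff(delay, cap); // grow the delay for next time
    }
    return false;
}
```

A production loop would presumably also check a shutdown signal between attempts so that an operator interrupt is not delayed by a long backoff.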
Decouple the template provider lifetime from the Bitcoin Core IPC backend. Keep the Stratum v2 listener and connected clients alive when the backend disconnects, wait for a replacement backend, and resume serving templates once a new IPC connection is installed.
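One way to picture the decoupling is a provider that owns its client state independently of a swappable backend pointer. The sketch below is an assumption-laden illustration: `Backend`, `FixedBackend`, `Provider`, `InstallBackend`, and `OnBackendLost` are hypothetical names, not the sv2-tp API.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for the Bitcoin Core IPC backend.
struct Backend {
    virtual ~Backend() = default;
    virtual std::string NextTemplate() = 0;
};

// Simple backend double that always returns the same template label.
struct FixedBackend : Backend {
    explicit FixedBackend(std::string label) : m_label(std::move(label)) {}
    std::string NextTemplate() override { return m_label; }
    std::string m_label;
};

// Hypothetical provider: client state outlives any single backend.
class Provider {
public:
    void AddClient() {
        std::lock_guard<std::mutex> lock(m_mutex);
        ++m_clients;
    }
    int ClientCount() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_clients;
    }
    // Install a fresh backend after (re)connecting.
    void InstallBackend(std::unique_ptr<Backend> backend) {
        assert(backend != nullptr);
        std::lock_guard<std::mutex> lock(m_mutex);
        m_backend = std::move(backend);
    }
    // Drop the backend on disconnect; clients are left untouched.
    void OnBackendLost() {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_backend.reset();
    }
    // Returns no value while there is no backend to serve templates.
    std::optional<std::string> NextTemplate() {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (!m_backend) return std::nullopt;
        return m_backend->NextTemplate();
    }

private:
    mutable std::mutex m_mutex;
    std::unique_ptr<Backend> m_backend;
    int m_clients{0};
};
```

Because the client count and the backend are separate members, losing the backend only pauses template delivery; connected clients stay registered until a replacement is installed.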
Adapt the sv2 template provider tests to the reconnect lifecycle. Construct the provider without a fixed Mining reference and install the backend through the new reconnect path so the test harness matches the runtime behavior.
Simplify the reconnect implementation now that disconnected proxy teardown is handled in the IPC layer. Remove the local teardown workarounds, restore ordinary backend ownership, and harden the remaining shutdown path so reconnect and operator shutdown both complete cleanly.
This PR makes sv2-tp recover from a lost IPC connection without exiting the process.
The first commit fixes a libmultiprocess teardown race during client proxy destruction. A proxy can still hold a non-null Connection* after disconnect handling has started, and calling the generated destroy() method in that window can raise an IPC disconnect exception during ordinary destruction. Mark the connection as disconnected as soon as teardown begins and skip best-effort remote destroy calls once that state is reached.
With that in place, sv2-tp can reconnect cleanly in-process. Keep the template provider and pool-facing connection manager alive across backend loss, reconnect to Bitcoin Core with backoff, install a fresh IPC backend, and resume serving templates on the existing pool connection.
The final cleanup commit removes the reconnect-time workarounds that were only needed before the IPC teardown fix. Backend ownership returns to normal unique_ptr use, and shutdown returns to the normal process path.
It also adds explicit exception handling around IPC method calls (like waitNext and submitSolution) so that the template provider traps disconnect errors instead of crashing.
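The exception handling around IPC calls could take a shape like the wrapper below. It is a sketch under assumptions: `IpcDisconnected` and `CallBackend` are hypothetical names, and the exception type actually raised by the IPC layer on disconnect is not stated in this PR description.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <stdexcept>

// Hypothetical exception type standing in for an IPC disconnect error.
struct IpcDisconnected : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Run one backend call. On disconnect, record the loss and return no value
// instead of letting the exception escape and end the process.
template <typename T>
std::optional<T> CallBackend(const std::function<T()>& call, bool& backend_lost) {
    assert(call != nullptr);
    try {
        return call();
    } catch (const IpcDisconnected&) {
        backend_lost = true; // signals the caller to start reconnecting
        return std::nullopt;
    }
}
```

A caller would wrap each backend call (for example the ones corresponding to waitNext and submitSolution) and switch to the reconnect path when `backend_lost` becomes true. Only the disconnect error is caught here, so unrelated failures still surface.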