Conversation

@benjamin-stacks
Contributor

Description

Applicable issues

Additional info (benefits, drawbacks, caveats)

Checklist

  • Test coverage for new or modified code paths
  • Changelog is updated
  • Required documentation changes (e.g., docs/rpc/openapi.yaml and rpc-endpoints.md for v2 endpoints, event-dispatcher.md for new events)
  • New clarity functions have corresponding PR in clarity-benchmarking repo

@@ -0,0 +1,322 @@
use std::path::PathBuf;
Member

Hey! Please add the copyright header to each source file. Thanks!

Contributor Author

Ah, good call, will do. Is "Stacks Open Internet Foundation" still the correct copyright holder?

Also, a whole bunch of files lack that header, so I'm wondering if there's a good way to automate this.

@codecov

codecov bot commented Dec 16, 2025

Codecov Report

❌ Patch coverage is 89.25477% with 62 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.07%. Comparing base (4f535d8) to head (b856441).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| stacks-node/src/event_dispatcher.rs | 74.68% | 20 Missing ⚠️ |
| stacks-node/src/event_dispatcher/worker.rs | 90.85% | 15 Missing ⚠️ |
| stacks-node/src/main.rs | 0.00% | 8 Missing ⚠️ |
| stacks-node/src/tests/neon_integrations.rs | 56.25% | 7 Missing ⚠️ |
| stacks-node/src/run_loop/nakamoto.rs | 25.00% | 6 Missing ⚠️ |
| stacks-node/src/event_dispatcher/db.rs | 97.66% | 5 Missing ⚠️ |
| stacks-node/src/run_loop/boot_nakamoto.rs | 50.00% | 1 Missing ⚠️ |

❌ Your project check has failed because the head coverage (78.07%) is below the target coverage (80.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #6762      +/-   ##
===========================================
+ Coverage    77.77%   78.07%   +0.29%     
===========================================
  Files          585      586       +1     
  Lines       361916   362294     +378     
===========================================
+ Hits        281498   282855    +1357     
+ Misses       80418    79439     -979     
| Files with missing lines | Coverage Δ |
|---|---|
| stacks-node/src/event_dispatcher/tests.rs | 95.83% <100.00%> (-2.88%) ⬇️ |
| stacks-node/src/node.rs | 86.40% <ø> (-0.02%) ⬇️ |
| stacks-node/src/run_loop/mod.rs | 89.65% <100.00%> (ø) |
| stacks-node/src/run_loop/neon.rs | 82.35% <ø> (-1.27%) ⬇️ |
| stacks-node/src/run_loop/boot_nakamoto.rs | 77.55% <50.00%> (-2.59%) ⬇️ |
| stacks-node/src/event_dispatcher/db.rs | 97.31% <97.66%> (+4.09%) ⬆️ |
| stacks-node/src/run_loop/nakamoto.rs | 82.51% <25.00%> (-1.69%) ⬇️ |
| stacks-node/src/tests/neon_integrations.rs | 28.29% <56.25%> (+12.08%) ⬆️ |
| stacks-node/src/main.rs | 0.00% <0.00%> (ø) |
| stacks-node/src/event_dispatcher/worker.rs | 90.85% <90.85%> (ø) |
... and 1 more

... and 87 files with indirect coverage changes


Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data

... instead of using unnamed tuples and long parameter lists.
This will allow us to output warnings if the (non-blocking) delivery
gets too far behind, because we can tell how long it took between
enqueuing the event and actually sending it.
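
As a rough illustration of the idea above, here is a minimal sketch of a named struct that carries its enqueue time, so the sender can warn when delivery lags. The struct name, its fields, and the `delivery_lag_warning` helper are all assumptions made for this example, not the types used in this PR; a persisted queue would likely store a wall-clock timestamp rather than an `Instant`.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a queued event (names are assumptions).
struct PendingEvent {
    id: i64,
    url: String,
    payload: String,
    enqueued_at: Instant,
}

/// Returns a warning message if the event waited longer than `threshold`
/// between being enqueued and `now` (the moment it is about to be sent).
fn delivery_lag_warning(
    event: &PendingEvent,
    now: Instant,
    threshold: Duration,
) -> Option<String> {
    let waited = now.saturating_duration_since(event.enqueued_at);
    if waited > threshold {
        Some(format!(
            "event {} ({} bytes) for {} waited {:?} before delivery",
            event.id,
            event.payload.len(),
            event.url,
            waited
        ))
    } else {
        None
    }
}
```

With an unnamed tuple, adding the timestamp would mean touching every call site that destructures it; with a struct, only the code that cares about `enqueued_at` changes.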

This commit adds another migration to said database, so I slightly
refactored the migration code.
@benjamin-stacks benjamin-stacks force-pushed the feat/non-blocking-event-delivery branch from 98b3395 to 0d62bd4 Compare December 29, 2025 14:34
This commit is the main implementation work for stacks-network#6543. It moves event
dispatcher HTTP requests to a separate thread. That way, a slow event
observer doesn't block the node from continuing its work.

The node only starts blocking again if your event observers are so slow
that it continuously produces events faster than they can be delivered,
because the queue of pending requests is bounded (at 1,000 right now, but
I picked that number out of a hat, happy to change it if anyone has
thoughts).

Each new event payload is stored in the event observer DB, and its
ID is then sent to the subthread, which will make the request and then
delete the DB entry.

That way, if a node is shut down while there are pending requests,
they're in the DB ready to be retried after restart via
`process_pending_payloads()` (which blocks until completion). So that's
exactly as before (except that previously there couldn't have been more
than one or two pending payloads).
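
The flow described above (persist the payload, hand its ID to a worker through a bounded queue, delete the entry after delivery) can be sketched roughly as follows. This is not the code from this PR: the in-memory `PendingDb`, the `spawn_worker` and `enqueue` helpers, and the `deliver` callback standing in for the HTTP request are all assumptions for illustration.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc::{sync_channel, SyncSender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// In-memory stand-in for the event observer DB (assumption: the real
/// one is persisted on disk so entries survive a restart).
type PendingDb = Arc<Mutex<BTreeMap<u64, String>>>;

/// Spawns the delivery worker. `deliver` stands in for the HTTP request.
fn spawn_worker<F>(db: PendingDb, bound: usize, deliver: F) -> (SyncSender<u64>, JoinHandle<()>)
where
    F: Fn(&str) + Send + 'static,
{
    let (tx, rx) = sync_channel::<u64>(bound);
    let handle = thread::spawn(move || {
        // The loop ends once every sender has been dropped.
        for id in rx {
            let payload = db.lock().unwrap().get(&id).cloned();
            if let Some(payload) = payload {
                deliver(&payload);
                // Delete only after delivery, so an interrupted run
                // leaves the entry behind for a retry.
                db.lock().unwrap().remove(&id);
            }
        }
    });
    (tx, handle)
}

/// Persists the payload first, then hands its ID to the worker.
/// `send` blocks only while `bound` IDs are already waiting.
fn enqueue(db: &PendingDb, tx: &SyncSender<u64>, id: u64, payload: &str) {
    db.lock().unwrap().insert(id, payload.to_string());
    tx.send(id).expect("worker thread has exited");
}
```

The bounded `sync_channel` is what produces the back-pressure mentioned above: the caller is unaffected by a slow observer until the queue is full, at which point `send` blocks until the worker catches up.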

This fixes [this integration test failure](https://github.com/stacks-network/stacks-core/actions/runs/20749024845/job/59577684952?pr=6762),
caused by the fact that event delivery wasn't complete by the time the
assertions were made.

Doing this work in the RunLoop implementations' startup code is *almost*
the same thing, but not quite, since the nakamoto run loop might be
started later (after an epoch 3 transition), at which point the event DB
may already have new items from the current run of the application,
which should *not* be touched by `process_pending_payloads`.

This used to not be a problem, but now that that DB is used for the
actual queue of the (concurrently running) EventDispatcherWorker, it has
become one.

This is like 72437b2, but it works for
all the tests instead of only the one.

While only that one test very obviously failed, the issue exists for
pretty much all of the integration tests, because they rely on the
test_observer to capture all relevant data.

Things are fast enough, and therefore we've only seen one blatant
failure, but

1) it's going to be flaky (I can create a whole lot of test failures
   by adding a small artificial delay to event delivery), and
2) it might actually be *hiding* test failures (in some cases, like e.g.
   neon_integrations::deep_contract, we're asserting that certain things
   are *not* in the data, and if the data is incomplete to begin with,
   those assertions are moot).
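
One way a test could wait for delivery to finish before asserting is sketched below. It is only an illustration of the general technique (a counter of in-flight payloads plus a condition variable); the `InFlight` type and its methods are hypothetical and are not how this PR or `test_observer` implements the wait.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::time::Duration;

/// Hypothetical tracker of payloads that are enqueued but not yet delivered.
#[derive(Clone, Default)]
struct InFlight {
    inner: Arc<(Mutex<usize>, Condvar)>,
}

impl InFlight {
    /// Called by the producer when a payload is queued.
    fn enqueued(&self) {
        *self.inner.0.lock().unwrap() += 1;
    }

    /// Called by the worker after a payload has been delivered.
    fn delivered(&self) {
        let mut count = self.inner.0.lock().unwrap();
        *count -= 1;
        if *count == 0 {
            self.inner.1.notify_all();
        }
    }

    /// Blocks until nothing is in flight; returns false on timeout.
    fn wait_until_idle(&self, timeout: Duration) -> bool {
        let guard = self.inner.0.lock().unwrap();
        let (_guard, result) = self
            .inner
            .1
            .wait_timeout_while(guard, timeout, |count| *count > 0)
            .unwrap();
        !result.timed_out()
    }
}
```

A test would call `wait_until_idle` right before its assertions, so that both "X is present" and "X is absent" checks run against complete data.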
@benjamin-stacks benjamin-stacks force-pushed the feat/non-blocking-event-delivery branch from 478efa3 to d5fa2fc Compare January 8, 2026 17:29
When switching runloops at the epoch 2/3 transition, this ensures that
the same event dispatcher worker thread is handling delivery, which in
turn ensures that all payloads are delivered in order.
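
To illustrate why a single shared worker preserves ordering across the transition, here is a small sketch in which two consecutive run loops hold handles to the same queue. The `RunLoop` struct, its `emit` method, and the `deliver_across_transition` function are hypothetical and much simpler than the real run loops.

```rust
use std::sync::mpsc::{sync_channel, SyncSender};
use std::thread;

/// Hypothetical run loop that only holds a handle to the shared worker queue.
struct RunLoop {
    name: &'static str,
    events: SyncSender<String>,
}

impl RunLoop {
    fn emit(&self, n: u32) {
        self.events
            .send(format!("{}-{}", self.name, n))
            .expect("worker thread has exited");
    }
}

/// Runs two run loops back to back against one worker and returns
/// the order in which the worker saw the events.
fn deliver_across_transition() -> Vec<String> {
    let (tx, rx) = sync_channel::<String>(8);
    // Single consumer: it receives events in the order the sends completed.
    let worker = thread::spawn(move || rx.into_iter().collect::<Vec<String>>());

    let neon = RunLoop { name: "neon", events: tx.clone() };
    neon.emit(1);
    neon.emit(2);
    drop(neon); // epoch 2/3 transition: the old run loop goes away

    let nakamoto = RunLoop { name: "nakamoto", events: tx };
    nakamoto.emit(1);
    drop(nakamoto); // last sender dropped, so the worker finishes

    worker.join().unwrap()
}
```

If each run loop instead started its own worker, the second worker could begin delivering before the first had drained its queue, and an observer could see events out of order.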

As per this thread: stacks-network#6795 (review)

I used the same table/column name and semantics that we use elsewhere
for the same purposes.

Also fixed a comment typo.

Not sure why this wasn't caught in the pre-commit hook; I'd have assumed
the checks are the same.
2 participants