Table union porting: change merge credit tracking strategy and return leftovers #554

dcoutts · 2025-01-28T16:25:18Z

More porting of changes for table union from the prototype into the implementation: when supplying credits to a MergingRun we want to return the leftover/unused credits. This is needed because supplying of credits to a union merge will require that all credits get spent, but union merges are made up of many ongoing merging runs.

This turns out to be trickier than it first appears. The existing strategy for tracking credits (unspent, spent and merge steps performed) made it very difficult to figure out how to account for leftover credits.

So the major change in this PR is to change the MergingRun credit tracking strategy.

The intention is to simplify things and make them more obviously correct in the presence of concurrency. This also makes it easier to reliably determine leftover/excess supplied credits.

The approach is to change the counters from three independent counters, to just two, which are modified together as a pair atomically. Previously we tracked the credits spent and unspent, and the steps performed. We did not explicitly keep track of credits that were in the process of being spent.

Now we track spent and unspent (and not steps performed), but the spent credits include those that are in the process of being spent. We keep these together in a single atomic variable, and so all operations on the pair are atomic. This makes the concurrency story much simpler because all credit tracking changes are atomic. We avoid having to track steps performed by accounting differently for the difference between credits used for merging and steps performed: we simply borrow more credits from the unspent pot, allowing the pot to become negative.
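To make the two-counter scheme concrete, here is a minimal sketch of what "a pair modified atomically" could look like. It is not the code from this PR: `CreditsPair`, `CreditsVar`, `supplyCredits`, `borrowExcess` and `readCredits` are hypothetical names chosen for illustration, and an `IORef` updated with `atomicModifyIORef'` stands in for whatever representation the implementation actually uses.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)

-- | Hypothetical pair of counters (illustrative only).
data CreditsPair = CreditsPair
  { spentCredits   :: !Int  -- ^ includes credits in the process of being spent
  , unspentCredits :: !Int  -- ^ may become negative when credits are borrowed
  } deriving (Eq, Show)

-- | Both counters live in one variable, so every update sees and changes
-- them together.
newtype CreditsVar = CreditsVar (IORef CreditsPair)

newCreditsVar :: IO CreditsVar
newCreditsVar = CreditsVar <$> newIORef (CreditsPair 0 0)

readCredits :: CreditsVar -> IO CreditsPair
readCredits (CreditsVar ref) = readIORef ref

-- | Add @n@ credits to the unspent pot. If the pot reaches the threshold,
-- move the whole pot to "spent" in the same atomic step and return the batch
-- size; the caller is then the one thread responsible for doing that work.
supplyCredits :: CreditsVar -> Int -> Int -> IO (Maybe Int)
supplyCredits (CreditsVar ref) threshold n =
  atomicModifyIORef' ref $ \(CreditsPair spent unspent) ->
    let unspent' = unspent + n
    in if unspent' >= threshold
         then (CreditsPair (spent + unspent') 0, Just unspent')
         else (CreditsPair spent unspent', Nothing)

-- | A batch performed more merge steps than the credits it took: borrow the
-- excess from the unspent pot, which is allowed to go negative.
borrowExcess :: CreditsVar -> Int -> IO ()
borrowExcess (CreditsVar ref) excess =
  atomicModifyIORef' ref $ \(CreditsPair spent unspent) ->
    (CreditsPair (spent + excess) (unspent - excess), ())
```

In this sketch the sum of the two counters always equals the total credits supplied, because every operation either adds to the unspent pot or moves an amount from one counter to the other. For example, supplying 4 and then 7 credits with a threshold of 10 and then borrowing 3 leaves 14 spent and -3 unspent, which still sums to the 11 credits supplied.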

dcoutts · 2025-01-28T16:51:02Z

Apparently I've broken something in snapshots. I'll have to narrow that down and not break it!

mheinzel · 2025-01-28T20:15:26Z

src/Database/LSMTree/Internal/Snapshot.hs

+    -- We need to know how many credits were spent and yet unspent so we can
+    -- restore merge work on snapshot load. No need to snapshot the contents
+    -- of totalStepsVar here, since we still start counting from 0 again when
+    -- loading the snapshot.
+    unspentCredits <- MR.readUnspentCredits mergeUnspentCreditsVar
+    spentCredits   <- MR.readSpentCredits mergeSpentCreditsVar


Since you are simplifying/refactoring this code, I am wondering why we need two separate fields in the snapshot at all. The important bit is how many credits have been supplied, we don't care how many of those are spent/unspent. And in fact, when loading the snapshot, we immediately sum those two.
I might have asked this already at some point @jorisdral

Fair point.

@mheinzel I'm not sure if we discussed this before, but yeah might be nice to reduce the number of fields in the snapshot structure

Note: there is an alternative to the current snapshot approach, which would be "nice to have", though we won't have time to implement it. In the alternative approach, we do not only snapshot runs, but also MergingRuns and related structures so that we do not have to restore merge progress on openSnapshot. If we ever implement this, we'd probably have to separate out most of these fields again.

That's true. We can leave them separate.

mheinzel · 2025-01-28T20:18:36Z

src/Database/LSMTree/Internal/MergingRun.hs

+  Compared to the credit tracking in the prototype: firstly we split credits
+  supplied into credits supplied and spent vs credits supplied but as yet
+  unspent. This is because we want to perform merging work in batches. So we
+  accumulate unspent credits until they reach a threshold at which point we do
+  a batch of merging work.


We also do this in the prototype, see MergeDebt.

mheinzel · 2025-01-28T21:22:04Z

src/Database/LSMTree/Internal/MergingRun.hs

          bracketOnError
-            (tryTakeUnspentCredits mergeUnspentCredits creditsThresh (Credits unspentCredits'))
-            (mapM_ (putBackUnspentCredits mergeUnspentCredits)) $ \case
+            (tryTakeUnspentCredits mergeUnspentCreditsVar creditsThresh (Credits unspentCredits'))
+            (mapM_ (putBackUnspentCredits mergeUnspentCreditsVar)) $ \case
              Nothing -> pure False
-              Just c' -> stepMerge mergeState mergeStepsPerformed c'
+              Just c' -> stepMerge mergeSpentCreditsVar mergeStepsPerformedVar
+                                   mergeState c'


We will also have to define the remaining debt of a merge, but this gets quite tricky with concurrency if we want similar properties to hold as in the prototype.

For example, it's reasonable to expect that the debt never increases and that if one supplies N credits, the debt decreases by at least N. However, `stepsPerformed + unspentCredits'` doesn't just increase, it can also temporarily decrease when a thread takes some unspent credits but needs some time before increasing the steps performed (or fails and puts back the unspent credits). So if we decide that the debt should be `numEntries - (stepsPerformed + unspentCredits)`, that can increase. And if we just look at `numEntries - stepsPerformed` (where `stepsPerformed` will never decrease), supplying credits might not reduce that at all, they could just get accumulated into `unspentCredits`.
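A small pure model may help to see the problem. This is only a sketch under simplifying assumptions: `Counters`, `debtWithUnspent`, `debtStepsOnly` and the transition functions are hypothetical names, the debt is taken as the number of entries minus the sum of steps performed and unspent credits, and the interleaving of threads is modelled as a sequence of pure state changes.

```haskell
-- | Hypothetical model of the independent counters (illustrative only).
data Counters = Counters
  { stepsPerformed :: Int
  , unspentCredits :: Int
  } deriving (Eq, Show)

-- | Candidate 1: count unspent credits towards paying off the debt.
debtWithUnspent :: Int -> Counters -> Int
debtWithUnspent numEntries c =
  numEntries - (stepsPerformed c + unspentCredits c)

-- | Candidate 2: only count merge steps that were actually performed.
debtStepsOnly :: Int -> Counters -> Int
debtStepsOnly numEntries c = numEntries - stepsPerformed c

-- | A thread supplies credits; they accumulate as unspent.
supply :: Int -> Counters -> Counters
supply n c = c { unspentCredits = unspentCredits c + n }

-- | A thread takes the whole unspent pot but has not yet done the work.
takeAllUnspent :: Counters -> (Int, Counters)
takeAllUnspent c = (unspentCredits c, c { unspentCredits = 0 })

-- | The same thread later records the merge steps it performed.
finishBatch :: Int -> Counters -> Counters
finishBatch n c = c { stepsPerformed = stepsPerformed c + n }
```

With 100 entries, 20 steps performed and 10 unspent credits, the first candidate reports a debt of 70. Once a thread takes the 10 unspent credits, the debt reads 80 until `finishBatch 10` brings it back to 70, so an observer in between sees the debt go up. The second candidate reports 80 before and after `supply 5`, so supplying credits does not visibly reduce it.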

So we might need to either

- keep track of a separate monotone counter of supplied credits that doesn't go down when a thread decides to do some work.
- only take out unspent credits exactly when we add to steps performed. This would require using another mechanism to make sure only one thread thinks it should do a batch of merging work, i.e. some kind of lock. This requires a bigger change, but we were already considering something like that in the past, so it might have other benefits as well.

If a less lock-free design makes these types of use cases simpler, we should probably investigate that. Adding another counter sounds like it might be more complicated, though I'm also not opposed to it. If we decide to rely more on locks and less on atomic variables, then I think the important thing to preserve is that threads don't wait on the same lock to do merging work. Presumably, threads should only try to take a lock, and if they don't succeed, they continue supplying credits to other levels.
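The "try to take a lock, otherwise move on" idea can be sketched with an `MVar ()` and `tryTakeMVar` from `base`. This is an illustration of the pattern only, not something proposed in this PR: `newLock` and `tryWithLock` are hypothetical helpers, and the action passed in stands for a batch of merging work.

```haskell
import Control.Concurrent.MVar (MVar, newMVar, putMVar, tryTakeMVar)
import Control.Exception (mask, onException)

-- | A full 'MVar' means the lock is free; an empty one means it is held.
newLock :: IO (MVar ())
newLock = newMVar ()

-- | Run the action only if the lock is free right now. A thread that finds
-- the lock taken gets 'Nothing' immediately instead of blocking, so it can
-- carry on with other work (e.g. supplying credits to other levels).
tryWithLock :: MVar () -> IO a -> IO (Maybe a)
tryWithLock lock action = mask $ \restore -> do
  acquired <- tryTakeMVar lock
  case acquired of
    Nothing -> pure Nothing
    Just () -> do
      -- Release the lock if the action throws, then again on success.
      result <- restore action `onException` putMVar lock ()
      putMVar lock ()
      pure (Just result)
```

The design choice here is that contention never turns into waiting: at most one thread performs the work guarded by a given lock, and every other thread learns straight away that it should not.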

Yeah this is really tricky and bound up with the concurrency issues.

dcoutts · 2025-01-31T10:41:35Z

Ok, I've updated this to just include the refactoring but not the actual change to return the leftover credits. That can be a follow-up.

mheinzel

Looks good!

Stale now that I've merged in a major change to this PR. Need re-review.

dcoutts · 2025-02-04T15:47:54Z

I've now merged #561 into this PR, and squashed out the unnecessary intermediate changes (which were all to the old scheme).

mheinzel

Nothing really blocking, but a few suggestions.

mheinzel

Looks good to me, just tiny remarks.

The MergingRun contains the other two credit tracking vars. It makes more sense for all three to live together in one place. And there's no need for one of the vars to disappear when we move from the OngoingMerge state to the CompletedMerge state.

The intention is to simplify things and make them more obviously correct in the presence of concurrency. This should have the bonus of making it easier to reliably determine leftover/excess supplied credits (which is something we want for supplying credits to trees of merging runs). The approach is to change the counters from three independent counters, to just two, which are modified together as a pair atomically. Previously we tracked the credits spent and unspent, and the steps performed. We did not explicitly keep track of credits that were in the process of being spent. Now we track spent and unspent (and not steps performed), but the spent credits include those that are in the process of being spent. We keep these together in a single atomic variable, and so all operations on the pair are atomic. This makes the concurrency story much simpler because all credit tracking changes are atomic. We avoid having to track steps performed by accounting differently for the difference between credits used for merging and steps performed: we simply borrow more credits from the unspent pot, allowing the pot to become negative.

jorisdral

LGTM! Just some minor comments and clarifications

dcoutts · 2025-02-13T11:48:36Z

> LGTM! Just some minor comments and clarifications

Addressed in #572

Follow-up from review on PR #554

dcoutts requested review from jorisdral, mheinzel, recursion-ninja and wenkokke as code owners January 28, 2025 16:25

jorisdral reviewed Jan 28, 2025

mheinzel reviewed Jan 28, 2025

dcoutts force-pushed the dcoutts/merging-tree branch 2 times, most recently from 0e041d4 to 492094a Compare January 31, 2025 10:37

dcoutts changed the title ~~Table union porting: return leftovers when supplying credits~~ Table union porting: refactorin prior to return leftovers when supplying credits Jan 31, 2025

dcoutts changed the title ~~Table union porting: refactorin prior to return leftovers when supplying credits~~ Table union porting: refactoring prior to return leftovers when supplying credits Jan 31, 2025

mheinzel previously approved these changes Feb 3, 2025

dcoutts force-pushed the dcoutts/merging-tree branch from 78116b6 to 7d32a39 Compare February 4, 2025 15:41

dcoutts changed the title ~~Table union porting: refactoring prior to return leftovers when supplying credits~~ Table union porting: change merge credit tracking strategy and return leftovers Feb 4, 2025

dcoutts requested a review from mheinzel February 4, 2025 15:47

mheinzel approved these changes Feb 4, 2025

dcoutts force-pushed the dcoutts/merging-tree branch 2 times, most recently from c1be2c8 to f74e017 Compare February 7, 2025 10:51

mheinzel approved these changes Feb 7, 2025

dcoutts force-pushed the dcoutts/merging-tree branch from f74e017 to 48f47ad Compare February 7, 2025 14:28

dcoutts enabled auto-merge February 7, 2025 14:28

dcoutts added this pull request to the merge queue Feb 7, 2025

mheinzel reviewed Feb 7, 2025

dcoutts removed this pull request from the merge queue due to a manual request Feb 7, 2025

dcoutts force-pushed the dcoutts/merging-tree branch from 48f47ad to ded9fc5 Compare February 7, 2025 15:29

dcoutts enabled auto-merge February 7, 2025 15:30

dcoutts added 5 commits February 10, 2025 15:56

Improve documentation of credit tracking in MergingRun

bf00821

Export existing useful commentary as haddock docs, and extend them slightly. Also rearrange the export lists to improve docs. And reorder some declarations to keep credit things together.

Better encapsulate MergingRun internals

819f1d3

Provide a couple accessor functions and hide the rest (except the constructor needed for deriving Generic in the tests).

Return leftover credits from supplyCredits

be07f94

This is now easy because it's reported as part of the credit accounting in a reliable way.

Add tests for min/max bounds on (un)spent credits

74d9048

And check max run size (2^40-1, or about 1 trillion)

dcoutts force-pushed the dcoutts/merging-tree branch from ded9fc5 to 74d9048 Compare February 10, 2025 16:01

mheinzel mentioned this pull request Feb 10, 2025

The credit scaling in the impl was backwards from the prototype #565

Merged

dcoutts added this pull request to the merge queue Feb 10, 2025

Merged via the queue into main with commit 50da856 Feb 10, 2025
27 checks passed

dcoutts deleted the dcoutts/merging-tree branch February 10, 2025 23:37

jorisdral reviewed Feb 12, 2025

dcoutts mentioned this pull request Feb 13, 2025

Follow-up from review on PR #554 #572

Merged

github-merge-queue bot pushed a commit that referenced this pull request Feb 13, 2025

Merge pull request #572 from IntersectMBO/dcoutts/pr-554-review

bd366da

Follow-up from review on PR #554
