Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Table union porting: change merge credit tracking strategy and return leftovers #554

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

dcoutts
Copy link
Collaborator

@dcoutts dcoutts commented Jan 28, 2025

More porting of changes for table union from the prototype into the implementation: when supplying credits to a MergingRun we want to return the leftover/unused credits. This is needed because supplying of credits to a union merge will require that all credits get spent, but union merges are made up of many ongoing merging runs.

This turns out to be more tricky than at first appearance. The existing strategy for tracking credits (unspent, spent and merge steps performed) made it very difficult to figure out how to account leftover credits.

So the major change in this PR is to change the MergingRun credit tracking strategy

The intention is to simplify things and make them more obviously correct in the presence of concurrency. This also make it easier to reliably determine leftover/excess supplied credits.

The approach is to change the counters from three independent counters, to just two, which are modified together as a pair atomically. Previously we tracked the credits spent and unspent, and the steps performed. We did not explicitly keep track of credits that were in the process of being spent.

Now we track spent and unspent (and not steps performed), but the spent credits includes those that are in the process of being spent. We keep these together in a single atomic variable, and so all operations on the pair are atomic. This makes the concurrency story much simpler because all credit tracking changes are atomic. We avoid having to track steps performed by accounting differently for the difference between credits used for merging and steps performed: we simply borrow more credits from the unspent pot, allowing the pot to become negative.

@dcoutts
Copy link
Collaborator Author

dcoutts commented Jan 28, 2025

Apparently I've broken something in snapshots. I'll have to narrow that down and not break it!

Comment on lines 199 to 207
-- We need to know how many credits were spend and yet unspent so we can
-- restore merge work on snapshot load. No need to snapshot the contents
-- of totalStepsVar here, since we still start counting from 0 again when
-- loading the snapshot.
unspentCredits <- MR.readUnspentCredits mergeUnspentCreditsVar
spentCredits <- MR.readSpentCredits mergeSpentCreditsVar
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you are simplifying/refactoring this code, I am wondering why we need two separate fields in the snapshot at all. The important bit is how many credits have been supplied, we don't care how many of those are spent/unspent. And in fact, when loading the snapshot, we immediately sum those two.
I might have asked this already at some point @jorisdral

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mheinzel I'm not sure if we discussed this before, but yeah might be nice to reduce the number of fields in the snapshot structure

Note: there is an alternative to the current snapshot approach, which would be "nice to have", though we won't have time to implement it. In the alternative approach, we do not only snapshot runs, but also MergingRuns and related structures so that we do not have to restore merge progress on openSnapshot. If we ever implement this, we'd probably have to separate out most these fields again

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true. We can leave them separate.

Comment on lines 235 to 239
Compared to the credit tracking in the prototype: firstly we split credits
supplied into credits supplied and spent vs credits supplied but as yet
unspent. This is because we want to perform merging work in batches. So we
accumulate unspent credits until they reach a threshold at which point we do
a batch of merging work.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also do this in the prototype, see MergeDebt.

Comment on lines 392 to 405
bracketOnError
(tryTakeUnspentCredits mergeUnspentCredits creditsThresh (Credits unspentCredits'))
(mapM_ (putBackUnspentCredits mergeUnspentCredits)) $ \case
(tryTakeUnspentCredits mergeUnspentCreditsVar creditsThresh (Credits unspentCredits'))
(mapM_ (putBackUnspentCredits mergeUnspentCreditsVar)) $ \case
Nothing -> pure False
Just c' -> stepMerge mergeState mergeStepsPerformed c'
Just c' -> stepMerge mergeSpentCreditsVar mergeStepsPerformedVar
mergeState c'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will also have to define the remaining debt of a merge, but this gets quite tricky with concurrency if we want similar properties to hold as in the prototype.

For example, it's reasonable to expect that the debt never increases and that if one supplies N credits, the debt decreases by at least N. However, stepsPerformed + unspentCredits' doesn't just increase, it can also temporarily decrease when a thread takes some unspent credits but needs some time before increasing the steps performed (or fails and puts back the unspent credits). So if we decide that the debt should be numEntries - (stepsPerformed - unspentCredits), that can increase. And if we just look at numEntries - stepsPerformed(which will never decrease), supplying credits might not reduce that at all, they could just get accumulated intounspentCredits`.

So we might need to either

  1. keep track of a separate monotone counter of supplied credits that doesn't go down when a thread decides to do some work.
  2. only take out unspent credits exactly when we add to steps performed. This would require using another mechanism to make sure only one thread thinks it should do a batch of merging work, i.e. some kind of lock. This requires a bigger change, but we were already considering something like that in the past, so it might have other benefits as well.

Copy link
Collaborator

@jorisdral jorisdral Jan 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a less lock-free design makes these types of use cases simpler, we should probably investigate that. Adding another counter sounds like it might be more complicated, though I'm also not opposed to it. If we decide to rely more on locks and less on atomic variables, then I think the important thing to preserve is that threads don't wait on the same lock to do merging work. Presumably, threads should only try to take a lock, and if they don't succeed, they continue supplying credits to other levels

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is really tricky and bound up with the concurrency issues.

@dcoutts dcoutts force-pushed the dcoutts/merging-tree branch 2 times, most recently from 0e041d4 to 492094a Compare January 31, 2025 10:37
@dcoutts dcoutts changed the title Table union porting: return leftovers when supplying credits Table union porting: refactorin prior to return leftovers when supplying credits Jan 31, 2025
@dcoutts dcoutts changed the title Table union porting: refactorin prior to return leftovers when supplying credits Table union porting: refactoring prior to return leftovers when supplying credits Jan 31, 2025
@dcoutts
Copy link
Collaborator Author

dcoutts commented Jan 31, 2025

Ok, I've updated this to just include the refactoring but not the actual change to return the leftover credits. That can be a follow-up.

mheinzel
mheinzel previously approved these changes Feb 3, 2025
Copy link
Collaborator

@mheinzel mheinzel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@dcoutts dcoutts force-pushed the dcoutts/merging-tree branch from 78116b6 to 7d32a39 Compare February 4, 2025 15:41
@dcoutts dcoutts changed the title Table union porting: refactoring prior to return leftovers when supplying credits Table union porting: change merge credit tracking strategy and return leftovers Feb 4, 2025
@dcoutts dcoutts dismissed mheinzel’s stale review February 4, 2025 15:46

Stale now that I've merged in a major change to this PR. Need re-review.

@dcoutts dcoutts requested a review from mheinzel February 4, 2025 15:47
@dcoutts
Copy link
Collaborator Author

dcoutts commented Feb 4, 2025

I've now merged #561 into this PR, and squashed out the unnecessary intermediate changes (which were all to the old scheme).

Copy link
Collaborator

@mheinzel mheinzel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing really blocking, but a few suggestions.

src/Database/LSMTree/Internal/MergingRun.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/MergingRun.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/MergingRun.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/MergingRun.hs Show resolved Hide resolved
src/Database/LSMTree/Internal/MergingRun.hs Show resolved Hide resolved
src/Database/LSMTree/Internal/MergingRun.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/MergingRun.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/MergingRun.hs Show resolved Hide resolved
src/Database/LSMTree/Internal/MergingRun.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/MergingRun.hs Outdated Show resolved Hide resolved
@dcoutts dcoutts force-pushed the dcoutts/merging-tree branch from 95ed45d to c1be2c8 Compare February 6, 2025 17:12
The MergingRun contains the other two credit tracking vars. It makes
more sense for all three to live together in once place. And there's no
need for one of the vars to disappear when we move from the OngoingMerge
state to the CompletedMerge state.
Export existing useful commentary as haddock docs, and extend them
slightly. Also rearrange the export lists to improve docs. And reorder
some declarations to keep credit things together.
@dcoutts dcoutts force-pushed the dcoutts/merging-tree branch from c1be2c8 to f74e017 Compare February 7, 2025 10:51
Copy link
Collaborator

@mheinzel mheinzel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, just tiny remarks.

test/Test/Database/LSMTree/Internal/MergingRun.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/MergingRun.hs Outdated Show resolved Hide resolved
@dcoutts dcoutts force-pushed the dcoutts/merging-tree branch from f74e017 to 48f47ad Compare February 7, 2025 14:28
@dcoutts dcoutts enabled auto-merge February 7, 2025 14:28
@dcoutts dcoutts added this pull request to the merge queue Feb 7, 2025
@dcoutts dcoutts removed this pull request from the merge queue due to a manual request Feb 7, 2025
The intention is to simplify things and make them more obviously correct
in the presence of concurrency. This should have the bonus of making it
easier to reliably determine leftover/excess supplied credits (which is
something we want for supplying credits to trees of merging runs).

The approach is to change the counters from three independent counters,
to just two, which are modified together as a pair atomically.
Previously we tracked the credits spent and unspent, and the steps
performed. We did not explicitly keep track of credits that were in the
process of being spent.

Now we track spent and unspent (and not steps performed), but the spent
credits includes those that are in the process of being spent. We keep
these together in a single atomic variable, and so all operations on the
pair are atomic. This makes the concurrency story much simpler because
all credit tracking changes are atomic. We avoid having to track steps
performed by accounting differently for the difference between credits
used for merging and steps performed: we simply borrow more credits from
the unspent pot, allowing the pot to become negative.
Provide a couple accessor functions and hide the rest (except the
constructor needed for deriving Generic in the tests).
This is now easy because it's reported as part of the credit accounting
in a reliable way.
And check max run size (2^40-1, or about 1 trillion)
@dcoutts dcoutts force-pushed the dcoutts/merging-tree branch from 48f47ad to ded9fc5 Compare February 7, 2025 15:29
@dcoutts dcoutts enabled auto-merge February 7, 2025 15:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants