-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Table union porting: change merge credit tracking strategy and return leftovers #554
base: main
Are you sure you want to change the base?
Conversation
Apparently I've broken something in snapshots. I'll have to narrow that down and not break it! |
-- We need to know how many credits were spend and yet unspent so we can | ||
-- restore merge work on snapshot load. No need to snapshot the contents | ||
-- of totalStepsVar here, since we still start counting from 0 again when | ||
-- loading the snapshot. | ||
unspentCredits <- MR.readUnspentCredits mergeUnspentCreditsVar | ||
spentCredits <- MR.readSpentCredits mergeSpentCreditsVar |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you are simplifying/refactoring this code, I am wondering why we need two separate fields in the snapshot at all. The important bit is how many credits have been supplied, we don't care how many of those are spent/unspent. And in fact, when loading the snapshot, we immediately sum those two.
I might have asked this already at some point @jorisdral
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mheinzel I'm not sure if we discussed this before, but yeah might be nice to reduce the number of fields in the snapshot structure
Note: there is an alternative to the current snapshot approach, which would be "nice to have", though we won't have time to implement it. In the alternative approach, we do not only snapshot runs, but also MergingRun
s and related structures so that we do not have to restore merge progress on openSnapshot
. If we ever implement this, we'd probably have to separate out most these fields again
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true. We can leave them separate.
Compared to the credit tracking in the prototype: firstly we split credits | ||
supplied into credits supplied and spent vs credits supplied but as yet | ||
unspent. This is because we want to perform merging work in batches. So we | ||
accumulate unspent credits until they reach a threshold at which point we do | ||
a batch of merging work. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also do this in the prototype, see MergeDebt
.
bracketOnError | ||
(tryTakeUnspentCredits mergeUnspentCredits creditsThresh (Credits unspentCredits')) | ||
(mapM_ (putBackUnspentCredits mergeUnspentCredits)) $ \case | ||
(tryTakeUnspentCredits mergeUnspentCreditsVar creditsThresh (Credits unspentCredits')) | ||
(mapM_ (putBackUnspentCredits mergeUnspentCreditsVar)) $ \case | ||
Nothing -> pure False | ||
Just c' -> stepMerge mergeState mergeStepsPerformed c' | ||
Just c' -> stepMerge mergeSpentCreditsVar mergeStepsPerformedVar | ||
mergeState c' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will also have to define the remaining debt of a merge, but this gets quite tricky with concurrency if we want similar properties to hold as in the prototype.
For example, it's reasonable to expect that the debt never increases and that if one supplies N credits, the debt decreases by at least N. However, stepsPerformed + unspentCredits' doesn't just increase, it can also temporarily decrease when a thread takes some unspent credits but needs some time before increasing the steps performed (or fails and puts back the unspent credits). So if we decide that the debt should be
numEntries - (stepsPerformed - unspentCredits), that can increase. And if we just look at
numEntries - stepsPerformed(which will never decrease), supplying credits might not reduce that at all, they could just get accumulated into
unspentCredits`.
So we might need to either
- keep track of a separate monotone counter of supplied credits that doesn't go down when a thread decides to do some work.
- only take out unspent credits exactly when we add to steps performed. This would require using another mechanism to make sure only one thread thinks it should do a batch of merging work, i.e. some kind of lock. This requires a bigger change, but we were already considering something like that in the past, so it might have other benefits as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a less lock-free design makes these types of use cases simpler, we should probably investigate that. Adding another counter sounds like it might be more complicated, though I'm also not opposed to it. If we decide to rely more on locks and less on atomic variables, then I think the important thing to preserve is that threads don't wait on the same lock to do merging work. Presumably, threads should only try to take a lock, and if they don't succeed, they continue supplying credits to other levels
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this is really tricky and bound up with the concurrency issues.
0e041d4
to
492094a
Compare
Ok, I've updated this to just include the refactoring but not the actual change to return the leftover credits. That can be a follow-up. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
78116b6
to
7d32a39
Compare
Stale now that I've merged in a major change to this PR. Need re-review.
I've now merged #561 into this PR, and squashed out the unnecessary intermediate changes (which were all to the old scheme). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing really blocking, but a few suggestions.
95ed45d
to
c1be2c8
Compare
The MergingRun contains the other two credit tracking vars. It makes more sense for all three to live together in once place. And there's no need for one of the vars to disappear when we move from the OngoingMerge state to the CompletedMerge state.
Export existing useful commentary as haddock docs, and extend them slightly. Also rearrange the export lists to improve docs. And reorder some declarations to keep credit things together.
c1be2c8
to
f74e017
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, just tiny remarks.
f74e017
to
48f47ad
Compare
The intention is to simplify things and make them more obviously correct in the presence of concurrency. This should have the bonus of making it easier to reliably determine leftover/excess supplied credits (which is something we want for supplying credits to trees of merging runs). The approach is to change the counters from three independent counters, to just two, which are modified together as a pair atomically. Previously we tracked the credits spent and unspent, and the steps performed. We did not explicitly keep track of credits that were in the process of being spent. Now we track spent and unspent (and not steps performed), but the spent credits includes those that are in the process of being spent. We keep these together in a single atomic variable, and so all operations on the pair are atomic. This makes the concurrency story much simpler because all credit tracking changes are atomic. We avoid having to track steps performed by accounting differently for the difference between credits used for merging and steps performed: we simply borrow more credits from the unspent pot, allowing the pot to become negative.
Provide a couple accessor functions and hide the rest (except the constructor needed for deriving Generic in the tests).
This is now easy because it's reported as part of the credit accounting in a reliable way.
And check max run size (2^40-1, or about 1 trillion)
48f47ad
to
ded9fc5
Compare
More porting of changes for table union from the prototype into the implementation: when supplying credits to a MergingRun we want to return the leftover/unused credits. This is needed because supplying of credits to a union merge will require that all credits get spent, but union merges are made up of many ongoing merging runs.
This turns out to be more tricky than at first appearance. The existing strategy for tracking credits (unspent, spent and merge steps performed) made it very difficult to figure out how to account leftover credits.
So the major change in this PR is to change the MergingRun credit tracking strategy
The intention is to simplify things and make them more obviously correct in the presence of concurrency. This also make it easier to reliably determine leftover/excess supplied credits.
The approach is to change the counters from three independent counters, to just two, which are modified together as a pair atomically. Previously we tracked the credits spent and unspent, and the steps performed. We did not explicitly keep track of credits that were in the process of being spent.
Now we track spent and unspent (and not steps performed), but the spent credits includes those that are in the process of being spent. We keep these together in a single atomic variable, and so all operations on the pair are atomic. This makes the concurrency story much simpler because all credit tracking changes are atomic. We avoid having to track steps performed by accounting differently for the difference between credits used for merging and steps performed: we simply borrow more credits from the unspent pot, allowing the pot to become negative.