-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Track the total number of compaction sorted runs inside BackgroundCallCompaction #13320
base: main
Are you sure you want to change the base?
Track the total number of compaction sorted runs inside BackgroundCallCompaction #13320
Conversation
5e67c21
to
6a71c06
Compare
@@ -2963,6 +2966,10 @@ class DBImpl : public DB { | |||
// stores the number of compactions are currently running | |||
int num_running_compactions_; | |||
|
|||
// stores the number of input iterators required for currently running | |||
// compactions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uint64_t
or size_t
makes more sense to me since num_running_compaction_input_iterators_
should not be negative. However, I wanted to follow the convention here since everything else is using int
db/db_impl/db_impl.h
Outdated
@@ -2963,6 +2966,10 @@ class DBImpl : public DB { | |||
// stores the number of compactions are currently running | |||
int num_running_compactions_; | |||
|
|||
// stores the number of input iterators required for currently running | |||
// compactions | |||
int num_running_compaction_input_iterators_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see we have InstrumentedMutexLock l(&mutex_);
protecting num_running_compactions_
as well as num_running_flushes_
.
I guess that means we do not need std::atomic<int>
for num_running_compaction_input_iterators_
@archang19 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@archang19 has updated the pull request. You must reimport the pull request before landing. |
@archang19 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
db/db_impl/db_impl.h
Outdated
@@ -957,6 +957,13 @@ class DBImpl : public DB { | |||
return num_running_compactions_; | |||
} | |||
|
|||
// Returns the number of input iterators for currently running compactions. | |||
// REQUIREMENT: mutex_ must be held when calling this function. | |||
int num_running_compaction_input_iterators() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed num_running_compactions
did not follow the example of num_running_flushes
, where db->num_running_flushes()
gets called inside HandleNumRunningFlushes
. HandleNumRunningCompactions
was just accessing num_running_compactions_
directly.
@archang19 has updated the pull request. You must reimport the pull request before landing. |
@archang19 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@cbi42 I added you since this touches compaction code. Perhaps you know a better way of doing this or even another way to obtain the same information |
@@ -61,6 +61,18 @@ bool DBImpl::EnoughRoomForCompaction( | |||
return enough_room; | |||
} | |||
|
|||
size_t DBImpl::GetNumberCompactionInputIterators(Compaction* c) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should check for and skip trivial moves and deletion compactions since they won't have any input iterators
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, good catch. TIL about "deletion compactions"
db/internal_stats.cc
Outdated
@@ -1261,7 +1269,13 @@ bool InternalStats::HandleCompactionPending(uint64_t* value, DBImpl* /*db*/, | |||
|
|||
bool InternalStats::HandleNumRunningCompactions(uint64_t* value, DBImpl* db, | |||
Version* /*version*/) { | |||
*value = db->num_running_compactions_; | |||
*value = db->num_running_compactions(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't num_running_compactions()
assert that DB mutex is held?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it does. I think (?) that is what we want. num_running_flushes()
also checks mutex_.AssertHeld();
and is called from HandleNumRunningFlushes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I saw your other comment. I will update this PR to just read off the values.
db/internal_stats.cc
Outdated
|
||
bool InternalStats::HandleNumRunningCompactionInputIterators( | ||
uint64_t* value, DBImpl* db, Version* /*version*/) { | ||
*value = db->num_running_compaction_input_iterators(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're not holding the mutex here. Since these are stats, it should be ok to directly read the counter even if its not 100% accurate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I also get rid of num_running_flushes()
, which asserts the mutex is held?
db/db_impl/db_impl.h
Outdated
Status BackgroundCompaction(bool* madeProgress, JobContext* job_context, | ||
LogBuffer* log_buffer, | ||
Status BackgroundCompaction(bool* madeProgress, | ||
int& num_compaction_iterators_added, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
C++ style guide suggests that output parameters go at the end
I think the name "input iterator" can be confusing, especially since this will be a public API. How about using "sorted runs" instead? I think subcompactions can affect this since each subcompaction creates its own iterator. |
@archang19 has updated the pull request. You must reimport the pull request before landing. |
I think that name works. I will update this PR
@cbi42 do you mean that the actual number of iterators (sorted runs) would be more than the number of L0 files + the number of non-L0 levels when there are subcompactions? |
@archang19 has updated the pull request. You must reimport the pull request before landing. |
1 similar comment
@archang19 has updated the pull request. You must reimport the pull request before landing. |
1231ef7
to
21fa2c2
Compare
@archang19 has updated the pull request. You must reimport the pull request before landing. |
@archang19 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@archang19 has updated the pull request. You must reimport the pull request before landing. |
@archang19 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Yes, each subcompaction creates input iterators separately. |
@archang19 has updated the pull request. You must reimport the pull request before landing. |
@anand1976 do you think this subcompaction exception will be problematic for our purposes? I searched around and found a |
@archang19 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@@ -3300,11 +3300,14 @@ TEST_P(DBCompactionTestWithParam, CompressLevelCompaction) { | |||
GenerateNewFile(&rnd, &key_idx); | |||
ASSERT_EQ("1,4,8", FilesPerLevel(0)); | |||
|
|||
ASSERT_EQ(matches, 12); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cbi42 the check for IsTrivialMove
inside my GetNumberCompactionSortedRuns
method messes up the count for InputCompressionMatchesOutput
. I did not dig into why the original test values were the way they are, but I do not think my PR would break this functionality. For those who do understand, I updated the expected value to be the original expected value plus what I determined was added by GetNumberCompactionSortedRuns
.
Not sure if you had to work with this test case before. Let me know if this is problematic or the comments can be reworded
Would it be easier to maintain this count inside each compaction job / subcompaction? Similar to how |
My guess is it would be problematic. To be on the safe side, like @cbi42 suggested we could maintain this count inside each compaction job/subcompaction. Or updating it in |
thanks for the comments @cbi42 @anand1976. I am following your suggestion and am working on another PR #13325 that updates the stats inside |
Going to try and opt for #13325 instead