Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new db property to track number of compaction sorted runs #13367

Conversation

archang19
Copy link
Contributor

@archang19 archang19 commented Feb 3, 2025

Summary

Yes, this is another attempt at recording the number of sorted runs. Hopefully this one is the last...

This PR adds a new statistic to track the total number of sorted runs for running compactions.

Context: I am currently working on a separate project, where I am trying to tune the read request sizes made by FilePrefetchBuffer to the storage backend. In this particular case, FilePrefetchBuffer will issue larger reads and have to buffer larger read responses. This means we expect to see higher memory utilization. At least for the initial rollout, we only want to enable this optimization for compaction reads.

I want some way to get a sense of what the memory usage impact will be if the prefetch read request size is increased from (for instance) 8MB to 64MB.

If I know the number of files that compactions are actively reading from (i.e. the number of sorted runs / "input iterators"), I can determine how much the memory usage will increase if I bump up the readahead size inside FilePrefetchBuffer. For instance, if there are 16 sorted runs at any given point in time and I bump up the readahead size by 64MB, I can project an increase of 16 * 64 MB.

In most cases, the number of sorted runs processed per compaction is the number of L0 files plus the number of non-L0 levels. However, we need to be aware of exceptions like trivial compactions, deletion compactions, and subcompactions. This is a major reason why this PR chooses to implement the stats counting inside CompactionMergingIterator, since by the time we get down to that part of the stack, we know the "true" values for the number of input iterators / sorted runs.

Alternatives considered:

Test Plan

  • I updated one unit test to confirm that num_running_compaction_sorted_runs starts and ends at 0. This checks that all the additions and subtractions cancel out. I also made sure the statistic got incremented at least once.
  • When I added fprintf manually, I confirmed that my statistics updating code was being exercised numerous times inside db_compaction_test. I printed out the results before and after the increments/decrements, and the numbers looked good.
  • We will monitor the generated statistics after this PR is merged.
  • There are assertion checks before each decrement.

@archang19 archang19 changed the title Compaction sorted runs stat notification Add new db property to track number of compaction sorted runs Feb 3, 2025
@archang19 archang19 force-pushed the compaction-sorted-runs-stat-notification branch from c2b4d60 to 06b71e2 Compare February 3, 2025 23:33
@archang19 archang19 marked this pull request as ready for review February 3, 2025 23:37
@facebook-github-bot
Copy link
Contributor

@archang19 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@archang19 archang19 requested a review from anand1976 February 3, 2025 23:39
@facebook-github-bot
Copy link
Contributor

@archang19 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@archang19 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@archang19 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@archang19 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Comment on lines +2237 to +2252
size_t DBStatsCallback::GetNumberCompactionSortedRuns(const Compaction* c) {
assert(c);
if (c->IsTrivialMove() || c->deletion_compaction()) {
return 0;
}
// Return the number of L0 files + number of non-L0 levels
if (c->start_level() == 0) {
assert(0 < c->num_input_levels());
assert(c->level(0) == 0);
size_t num_l0_files = c->num_input_files(0);
size_t num_non_l0_levels = c->num_input_levels() - 1;
return num_l0_files + num_non_l0_levels;
}
return c->num_input_levels();
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I briefly remembered you mentioned counting the input L0 files and non-L0 level doesn't work for c->IsTrivialMove() || c->deletion_compaction(). Did you decided to not counting for trivial move and deletion compaction then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is this check at the top

  if (c->IsTrivialMove() || c->deletion_compaction()) {
    return 0;
  }

(1 or 2 PR variations back we decided not to count trivial moves and deletion compactions)

std::memory_order_relaxed);
}

size_t DBStatsCallback::GetNumberCompactionSortedRuns(const Compaction* c) {
Copy link
Contributor

@hx235 hx235 Feb 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not in favor of adding new class DBStatsCallback for doing something that existing usage of db_stats_ can do. If we are to make this a DB-level property, can we make it part of db_stats_ and allow decrement API? Also please fix the PR summary about the deadlock - I don't see that being true anymore.

If we are looking for cleaner code (which I support), then the efforts to make this new property part of db_stats_ should involve debugging why "assertions to fail inside the crash tests" (quoted from #13325 summary) if it pops up again for making it part of db_stats_

Regardless, I think that this statistic is different enough from the CF-specific and the other DB-wide stats that the best solution is to just have it defined as a separate std::atomic<uint64_t>

Quoted from #13325 summary. This stat isn't that different enough to need a new class DBStatsCallback I believe. We can easily add a decrement API for db_stats_.

For other concern "One possibility is that the values in cf_stats_value_ are being cleared to 0," (if shown up again for making this stat part of the db_stats), can you explain how this new stat is being cleared and why you didn't want it to be cleared?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

existing usage of db_stats_ can do.
db_stats_ gets cleared, which we do not want because we want a running sum of the total number of sorted runs, similar to num_running_compactions. We don't want this reset to 0 every interval (that would also make it likely to underflow below 0 on a decrement)

This stat isn't that different enough to need a new class DBStatsCallback
yeah, that is fair. I think the locking/deadlock concern is a non-issue so we can probably just go with the CompactionMergingIterator approach if we don't have a strong preference for this approach

Copy link
Contributor

@hx235 hx235 Feb 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

db_stats_ gets cleared, which we do not want because we want a running sum of the total number of sorted runs, similar to num_running_compactions. We don't want this reset to 0 every interval (that would also make it likely to underflow below 0 on a decrement)

I believe db_stats_ gets cleared only upon users/stress test call ResetStats() so rocksdb itself won't auto-clear that. From this perspective, if users explicitly ask for ResetStats(), i don't know why num_running_compactions should be exempted from that.

 // Reset internal stats for DB and all column families.
  // Note this doesn't reset options.statistics as it is not owned by
  // DB.
  virtual Status ResetStats() {
    return Status::NotSupported("Not implemented");
  }

Though by replying to your comment, I looked up the kDBStats public API. It's noted "both cumulative (over the db's lifetime) and interval (since the last retrieval of kDBStats)". So I believe the API indicates your new property should NOT fall under db_stats_. I should revise my comment "If we are looking for cleaner code (which I support), then the efforts to make this new property part of db_stats". It's okay for me to not add this to kDBStats.

For DBStatsCallback(), it's arguably better than the existing approach of just updating the variable like num_running_compactions or bg_error_count_. There is for sure room for improvement in updating properties like these e.g, principle of RAII that auto-decrements the variable with the amount previously incremented once going out of the scope. Regardless, ideally we should not have just one variable using DBStatsCallback() and others not using DBStatsCallback(). On this matter, I am still against using DBStatsCallback() in this PR but encourage you opening a tech debt task to revisit the way we do property update and see if a refractory of using something similar to DBStatsCallback() is needed.

  • Edit: I just realized there are too many ways of implementing db internal statistics/properties: bg_error_count_, db_stats_, num_running_compactions_ are all different!! It will be nice to have some consolidation at some point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regardless, ideally we should not have just one variable using DBStatsCallback() and others not using DBStatsCallback()

I agree, I also did not like this aspect of this implementation

I just realized there are too many ways of implementing db internal statistics/properties

Yes, agreed

Copy link
Contributor

@hx235 hx235 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@archang19
Copy link
Contributor Author

closing in favor of #13325

@archang19 archang19 closed this Feb 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants