-
Notifications
You must be signed in to change notification settings - Fork 62
Feature: extend the partition algorithm to handle weights #1889
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Thanks again for this addition! Since 95% of our users do not use weighted partitioning but the classical one and the weighted does add overhead i would argue we should at least skip the weight computation loop in standard mode. |
That's an interesting idea. Maybe we can get the best of both worlds with this kind of "fast-forward" behavior in the simple case. I will look into that. |
|
I wonder what should be the input of the weight function. Currently it takes a forest, a local tree id and an element id (within that tree). |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1889 +/- ##
==========================================
+ Coverage 76.60% 76.65% +0.05%
==========================================
Files 109 109
Lines 18721 18766 +45
==========================================
+ Hits 14341 14385 +44
- Misses 4380 4381 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
I have refactored to use |
Co-authored-by: spenke91 <[email protected]>
|
Thanks a lot for your contribution @dutkalex ! I am out of office this week, but looking forward to reviewing it next week 👍 |
spenke91
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much @dutkalex for your contribuition! I like the new feature a lot and as discussed, it seems to work flawlessly. As you can see, I only have minor remarks, which are mostly on how to pass the weight function to the forest.
Don't hesitate to contact me in Mattermost if you want to discuss something! :-)
| */ | ||
| void | ||
| t8_forest_partition (t8_forest_t forest); | ||
| t8_forest_partition (t8_forest_t forest, t8_weight_fcn_t *weight_callback); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to add the weight_callback as an extra argument here? To me, it seems like an unnecessary duplication, given that the forest already has the new member weight_function.
My suggestion would be to omit the extra argument here and then use forest->weight_function within t8_forest_partition rather than the function argument? Otherwise my impression is that we are bound to sooner or later run into hard-to-debug problems of the argument and the member being out of sync.
What do you think?
| */ | ||
| void | ||
| t8_forest_partition (t8_forest_t forest); | ||
| t8_forest_partition (t8_forest_t forest, t8_weight_fcn_t *weight_callback); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to add the weight_callback as an extra argument here? To me, it seems like an unnecessary duplication, given that the forest already has the new member weight_function.
My suggestion would be to omit the extra argument here and then use forest->weight_function within t8_forest_partition rather than the function argument? Otherwise my impression is that we are bound to sooner or later run into hard-to-debug problems of the argument and the member being out of sync.
What do you think?
| t8_forest_partition (t8_forest_t forest, t8_weight_fcn_t *weight_callback); | |
| t8_forest_partition (t8_forest_t forest); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm actually on the fence regarding this, and it probably shows in the code...
You definitely have a point here, and this problem should be addressed.
I was actually tempted at some point to simply remove the additional parameter and rely on forest->weight_function as you suggest, but from a design perspective this feels just wrong. Conceptually, the weight function is not a property of the forest data structure and should not be stored in the forest. The weight function is a user customization point of the partitioning algorithm. It therefore makes much more sense to have the callback be passed explicitly to the partition algorithm (like you would do when calling std::transform or std::reduce for exemple), and get rid of forest->weight_function instead. However, I could not find a way to make this work because the current API for forest mutation is fundamentally designed around this idea of attaching these customization points to the forest.
This is obviously a problem that goes beyond the scope of this feature and that will have to be addressed at some point. In the meantime, we can either:
- Go with the simple and consistent approach, but this means doubling down on this tech debt.
- Implement a proper separation of concerns in new code to avoid making things worse, at the cost of some inconsistencies across the codebase, and eventually some breaking changes in the APIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reference, here is a possible solution to make these APIs extensible and future-proof:
// public_api.h
struct t8_partition_options {
t8_weight_fcn_t* weight_fcn;
int set_for_coarsening;
};
extern const t8_partition_options default_values;
t8forest_t t8_partition(t8forest_t forest_from, t8_partition_options options);
// client_code.c
t8_partition_options partition_options = default_values;
partition_options.weight_fcn = /*...*/ ;
t8forest_t new_forest = t8_partition( old_forest, partition_options );There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the detailed explanation. I discussed with @holke about this today. We do see your point about the design flaws of storing information like the weight function as member data of the forest; however, we would still suggest to stick to it in this PR and go for a function t8_forest_set_partition_weights for the following reasons:
- it would be consistent with how this kind of data is handled so far in t8code (whether we like it or not);
- it would render / keep this PR rather non-intrusive, in particular by not changing the function interfaces;
- we might change this forest-based design in the future, but we do not have immediate plans to do so (because we also don't have the resources right now). Once we change it, it would be a quite deep change that won't get significantly more extensive by the weighted load balancing.
Would it be okay with you if we go forward with this approach?
Speaking of "we": If you prefer that I can also take over the PR and change it accordingly...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be okay with you if we go forward with this approach?
That makes perfect sense, and I understand the consistency and backwards compatibility argument. I just wanted to make sure that this specific design aspect was taken into account before moving forward with this. At the end of the day, you guys have the broader and more insightful vision into this matter, and I trust your jugement.
Speaking of "we": If you prefer that I can also take over the PR and change it accordingly...
I guess it depends on how fast you'd like this to get merged: I'm a little bit over-subscribed at the moment so it will probably have to wait a little bit on my side. If you want to get this merged now, feel free to take over from here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok then I try to free some time this week to finalize it 👍
| // Compute forest->element_offsets according to the weight function, if provided | ||
| static void | ||
| t8_forest_partition_compute_new_offset (t8_forest_t forest) | ||
| t8_forest_partition_compute_new_offset (t8_forest_t forest, t8_weight_fcn_t *weight_fcn) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as in my comment below for t8_forest_partition: How about omitting the new argument here and using forest->weight_function within this function instead?
| t8_forest_set_partition (t8_forest_t forest, const t8_forest_t set_from, int set_for_coarsening, | ||
| t8_weight_fcn_t *weight_callback); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussing with my colleagues, I suggest not to introduce the remove the weight function as an argument of t8_forest_set_partition, but to instead introduce a separate function t8_forest_set_partition_weights that assigns weight function to a forest.
This way, we can avoid having to update all examples, tutorials. etc and do not need to deal with nullptr that much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I 100% hear the backwards compatibility argument, but this would result in a user API that is quite counter-intuitive. Moreover, I would argue that this API will have to be reworked at some point anyway, because introducing a new function for each new customization point is not sustainable long-term
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As touched upon above, you are right that this strategy does not scale well with an increasing number of options. For this specific feature, however, we still think it would be preferable to go along with it, because it is so simple.
|
|
||
| /* partition the forest */ | ||
| /* partition the forest with a unit weight function */ | ||
| t8_forest_t forest_partition; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest to revert this test case to its original state, since it is now really about testing the weighted load balancing. Do you happen to already have some other test case you used locally for validation? Otherwise, I can also add a suitable test case in a subsequent PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this is not the best approach to test this feature, and adding a dedicated test suite for the partitioning algorithm is the right way to do things. However, beyond comparing the result of the unweighted algorithm with the weighted one with a constant weight callback, I don't know how to test this in a robust way. For development purposes, I have used a lot of intrusive whitebox testing manually, but this would not be very helpful in a proper automated test suite...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, then I suggest you leave the test case as it currently is in this PR and I will move the weighted partitioning to another test case in a later PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sharing a (probably half-baked) thought I had this morning about how to test this feature in the general case in a non-intrusive and hopefully robust way:
Given a non-trivial weight function, we could manually compute a posteriori each partition's weight, find the global min and max, and assert that the difference between the two is of the order of magnitude of a single element's weight.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting idea, thanks for sharing! I thought about maybe having some simple weight function that is for example 1 for the first half of the elements and 2 for the other, or that is maybe just equal to the element number. I'm optimistic to find something useful here 👍
Co-authored-by: spenke91 <[email protected]>
Co-authored-by: spenke91 <[email protected]>
Co-authored-by: spenke91 <[email protected]>
spenke91
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the quick replies! I discussed with @holke today about this PR and here's what we think.
|
|
||
| /* partition the forest */ | ||
| /* partition the forest with a unit weight function */ | ||
| t8_forest_t forest_partition; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, then I suggest you leave the test case as it currently is in this PR and I will move the weighted partitioning to another test case in a later PR.
| */ | ||
| void | ||
| t8_forest_partition (t8_forest_t forest); | ||
| t8_forest_partition (t8_forest_t forest, t8_weight_fcn_t *weight_callback); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the detailed explanation. I discussed with @holke about this today. We do see your point about the design flaws of storing information like the weight function as member data of the forest; however, we would still suggest to stick to it in this PR and go for a function t8_forest_set_partition_weights for the following reasons:
- it would be consistent with how this kind of data is handled so far in t8code (whether we like it or not);
- it would render / keep this PR rather non-intrusive, in particular by not changing the function interfaces;
- we might change this forest-based design in the future, but we do not have immediate plans to do so (because we also don't have the resources right now). Once we change it, it would be a quite deep change that won't get significantly more extensive by the weighted load balancing.
Would it be okay with you if we go forward with this approach?
Speaking of "we": If you prefer that I can also take over the PR and change it accordingly...
| t8_forest_set_partition (t8_forest_t forest, const t8_forest_t set_from, int set_for_coarsening, | ||
| t8_weight_fcn_t *weight_callback); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As touched upon above, you are right that this strategy does not scale well with an increasing number of options. For this specific feature, however, we still think it would be preferable to go along with it, because it is so simple.
spenke91
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your replies and once again for your contribution 👍
| */ | ||
| void | ||
| t8_forest_partition (t8_forest_t forest); | ||
| t8_forest_partition (t8_forest_t forest, t8_weight_fcn_t *weight_callback); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok then I try to free some time this week to finalize it 👍
|
|
||
| /* partition the forest */ | ||
| /* partition the forest with a unit weight function */ | ||
| t8_forest_t forest_partition; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting idea, thanks for sharing! I thought about maybe having some simple weight function that is for example 1 for the first half of the elements and 2 for the other, or that is maybe just equal to the element number. I'm optimistic to find something useful here 👍
Describe your changes here:
This PR makes it possible to define element weights for the partitioning algorithm. This is achieved via a callback pointer. In the null case, the old unweighted algorithm is used. This introduces one breaking change in the public API, in the
t8_forest_set_partitionfunction signature. All the client code in the repo has been updated, but this will require an update of external user code:t8_forest_set_partition(forest, set_from, set_for_coarsening)=>t8_forest_set_partition(forest, set_from, set_for_coarsening, nullptr)Design tradeoffs:
The new algorithm replaces the old one, even if no weight function is provided. Keeping the old algorithm around could possibly result in slightly better performance since it requires less work, but this would come at the cost of increased maintenance and the need for duplicated testing to cover both code paths. Given that the partition algorithm is not a performance bottleneck in practice, the single code path solution was preferred.The final design includes the unweighted case as a fast "early exit" case.All these boxes must be checked by the AUTHOR before requesting review:
Documentation:,Bugfix:,Feature:,Improvement:orOther:.All these boxes must be checked by the REVIEWERS before merging the pull request:
As a reviewer please read through all the code lines and make sure that the code is fully understood, bug free, well-documented and well-structured.
General
Tests
If the Pull request introduces code that is not covered by the github action (for example coupling with a new library):
Scripts and Wiki
script/find_all_source_files.scpto check the indentation of these files.License
doc/(or already has one).