-
Notifications
You must be signed in to change notification settings - Fork 104
Automatically select apropriate parallel slice #3094
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: fci-auto-with-debug-higher-order
Are you sure you want to change the base?
Conversation
Useful for Field3D with parallel slices, so we can write `f[i.yp()]` instead of `f.yup()[i.yp()]`
This is needed wo we can both write `f[i.yp()]` while allowing old code to call `f.yup()[i.yp()]` as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
Interpolation methods might need that, but in general it is unlike to be of use.
The tests are probably not meant to work with parallel slices.
|
This is really neat, but my major worry is having any conditionals in |
|
Adding a new type would require a lot of new code. Do you have some benchmarks that suggest that this a significant overhead? |
|
We did a lot of benchmarking when we first made |
|
Do you have a suggestion for a performance test case? |
|
I just realised with this and can be simplified to if which we probably have already :-) |
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
| } | ||
|
|
||
| inline SpecificInd xp(int dx = 1) const { return {ind + (dx * ny * nz), ny, nz}; } | ||
| inline SpecificInd xp(int dx = 1) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: function 'xp' has inline specifier but is implicitly inlined [readability-redundant-inline-specifier]
| inline SpecificInd xp(int dx = 1) const { | |
| SpecificInd xp(int dx = 1) const { |
| return {ind + (dy * nz), ny, nz}; | ||
| return {ind + (dy * nz), ny, nz, yoffset + dy}; | ||
| } | ||
| inline SpecificInd yp_no_parallel_shift(int dy = 1) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: function 'yp_no_parallel_shift' has inline specifier but is implicitly inlined [readability-redundant-inline-specifier]
| inline SpecificInd yp_no_parallel_shift(int dy = 1) const { | |
| SpecificInd yp_no_parallel_shift(int dy = 1) const { |
| // Add region for parallel slices | ||
| for (int i = 1; i <= ystart; ++i) { | ||
| for (const int offset : {-i, i}) { | ||
| const auto region = fmt::format("RGN_YPAR_{:+d}", offset); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: no header providing "fmt::format" is directly included [misc-include-cleaner]
src/mesh/mesh.cxx:12:
- #include "impls/bout/boutmesh.hxx"
+ #include "fmt/core.h"
+ #include "impls/bout/boutmesh.hxx"There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
| #include "bout/assert.hxx" | ||
| #include "bout/bout_types.hxx" | ||
| #include "bout/boutexception.hxx" | ||
| #include "bout/field.hxx" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: included header boutexception.hxx is not used directly [misc-include-cleaner]
| #include "bout/field.hxx" | |
| #include "bout/field.hxx" |
include/bout/field3d.hxx
Outdated
| ASSERT2(d.yoffset == yoffset); | ||
| } | ||
| } | ||
| return data[d.ind]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: expected member name or ';' after declaration specifiers [clang-diagnostic-error]
return data[d.ind];
^There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
| DirectionTypes directions_in = {YDirectionType::Standard, | ||
| ZDirectionType::Average}, | ||
| std::optional<size_t> region = {}); | ||
| std::optional<size_t> region = {}, int yoffset = 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: no header providing "size_t" is directly included [misc-include-cleaner]
include/bout/field2d.hxx:26:
- class Field2D;
+ #include <cstddef>
+ class Field2D;| DirectionTypes directions_in = {YDirectionType::Standard, | ||
| ZDirectionType::Average}, | ||
| std::optional<size_t> region = {}); | ||
| std::optional<size_t> region = {}, int yoffset = 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: no header providing "std::optional" is directly included [misc-include-cleaner]
include/bout/field2d.hxx:26:
- class Field2D;
+ #include <optional>
+ class Field2D;| DirectionTypes directions_in = {YDirectionType::Standard, | ||
| ZDirectionType::Standard}, | ||
| std::optional<size_t> regionID = {}); | ||
| std::optional<size_t> regionID = {}, int yoffset = 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: no header providing "size_t" is directly included [misc-include-cleaner]
include/bout/field3d.hxx:22:
- class Field3D;
+ #include <cstddef>
+ class Field3D;| return data[d.ind]; | ||
| } | ||
| const BoutReal& operator[](const Ind3D& d) const { | ||
| return (*const_cast<Field3D*>(this))[d]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: do not use const_cast to remove const qualifier [cppcoreguidelines-pro-type-const-cast]
return (*const_cast<Field3D*>(this))[d];
^| Options* track(const BoutReal& change, std::string operation); | ||
|
|
||
| public: | ||
| int yoffset{0}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: member variable 'yoffset' has public visibility [cppcoreguidelines-non-private-member-variables-in-classes]
int yoffset{0};
^|
|
||
| FieldPerp::FieldPerp(Mesh* localmesh, CELL_LOC location_in, int yindex_in, | ||
| DirectionTypes directions, std::optional<size_t> UNUSED(regionID)) | ||
| DirectionTypes directions, std::optional<size_t> UNUSED(regionID), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: no header providing "UNUSED" is directly included [misc-include-cleaner]
src/field/fieldperp.cxx:25:
- #include <bout/boutcomm.hxx>
+ #include "bout/unused.hxx"
+ #include <bout/boutcomm.hxx>|
|
||
| FieldPerp::FieldPerp(Mesh* localmesh, CELL_LOC location_in, int yindex_in, | ||
| DirectionTypes directions, std::optional<size_t> UNUSED(regionID)) | ||
| DirectionTypes directions, std::optional<size_t> UNUSED(regionID), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: no header providing "size_t" is directly included [misc-include-cleaner]
src/field/fieldperp.cxx:29:
+ #include <cstddef>|
|
||
| FieldPerp::FieldPerp(Mesh* localmesh, CELL_LOC location_in, int yindex_in, | ||
| DirectionTypes directions, std::optional<size_t> UNUSED(regionID)) | ||
| DirectionTypes directions, std::optional<size_t> UNUSED(regionID), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: no header providing "std::optional" is directly included [misc-include-cleaner]
src/field/fieldperp.cxx:29:
+ #include <optional>| { | ||
| auto indc = ind; | ||
| indc.yoffset = 1; | ||
| data[stripe_size * ind.ind + static_cast<int>(Offset::Byup)] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: '*' has higher precedence than '+'; add parentheses to explicitly specify the order of operations [readability-math-missing-parentheses]
| data[stripe_size * ind.ind + static_cast<int>(Offset::Byup)] = | |
| data[(stripe_size * ind.ind) + static_cast<int>(Offset::Byup)] = coords->Bxy.yup()[indc]; |
| indc.yoffset = 1; | ||
| data[stripe_size * ind.ind + static_cast<int>(Offset::Byup)] = | ||
| coords->Bxy.yup()[indc]; | ||
| indc.yoffset = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: '*' has higher precedence than '+'; add parentheses to explicitly specify the order of operations [readability-math-missing-parentheses]
| indc.yoffset = -1; | |
| data[(stripe_size * ind.ind) + static_cast<int>(Offset::Bydown)] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
|
|
||
| FieldPerp::FieldPerp(Mesh* localmesh, CELL_LOC location_in, int yindex_in, | ||
| DirectionTypes directions, std::optional<size_t> UNUSED(regionID)) | ||
| DirectionTypes directions, std::optional<size_t> UNUSED(regionID), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: no header providing "DirectionTypes" is directly included [misc-include-cleaner]
DirectionTypes directions, std::optional<size_t> UNUSED(regionID),
^| { | ||
| auto indc = ind; | ||
| indc.yoffset = 1; | ||
| data[stripe_size * ind.ind + static_cast<int>(Offset::Byup)] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: '*' has higher precedence than '+'; add parentheses to explicitly specify the order of operations [readability-math-missing-parentheses]
| data[stripe_size * ind.ind + static_cast<int>(Offset::Byup)] = | |
| data[(stripe_size * ind.ind) + static_cast<int>(Offset::Byup)] = |
| data[stripe_size * ind.ind + static_cast<int>(Offset::Byup)] = | ||
| coords->Bxy.yup()[indc]; | ||
| indc.yoffset = -1; | ||
| data[stripe_size * ind.ind + static_cast<int>(Offset::Bydown)] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: '*' has higher precedence than '+'; add parentheses to explicitly specify the order of operations [readability-math-missing-parentheses]
| data[stripe_size * ind.ind + static_cast<int>(Offset::Bydown)] = | |
| data[(stripe_size * ind.ind) + static_cast<int>(Offset::Bydown)] = |
|
Profiled with hyperfine: Unit tests in Release with CHECKS=0:
Unit tests in Debug with CHECKS=2:
test-drift-instability integration test in Release with CHECKS=0:
test-drift-instability integration test in Debug with CHECKS=2:
|
|
Amazing, thanks @tomc271! So it seems there is quite some overhead (8%) without checks. Looking at the absolute numbers for I am slightly worried about the 8% - that is probably not great. It would make the code however more readable, which also a benefit for non-FCI users. Maybe we could make make a compile time switch to enable parallel slices, so that users that do not use it, can keep it disabled. However, then we either cannot start using this feature and clean up the code, or we are kind of forced to stay with it, even if we cannot come up with a faster implementation ... |
|
|
||
| BoutReal& operator[](const Ind3D& d) { return data[d.ind]; } | ||
| const BoutReal& operator[](const Ind3D& d) const { return data[d.ind]; } | ||
| BoutReal& operator[](const Ind3D& d) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It has been a long time since I've looked at this code but could this be return ynext(d.yoffset - yoffset)[d]? One could then still include some of the consistency checking but it would move some of the logic into CHECK regions that disappear in production runs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return ynext(d.yoffset - yoffset)[d] would always call ynext - even if this already the right field, and thus would fail.
The issue is that both the main field, which contains parallel fields, as well as the parallel fields, are of type Field3D.
We could (together with PR #3149) only enable this for Field3DParallel. There might be subtle bugs, where it depends on the type (Field3DParallel/Field3D) on whether it uses the correct indexing. But we could add those with higher CHECK level, so that should be fine.
| return ynext(d.yoffset)[d]; | ||
| } | ||
| #if CHECK >= 2 | ||
| if (isFci()) { // We probably should assert here that this is field aligned |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be if ( ! hasParallelSlices() && isFci()) and placed before the return line so the return doesn't need to be guarded?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would still need to be guarded - because for FA there are no parallel slices, and thus we need to use the return at the end of the function.
This allows to replace
f.yup()[i.yp()]withf[i.yp()].That should allow to simplify some code, and will make more code fci-correct.