Skip to content

Conversation

@dschwoerer
Copy link
Contributor

@dschwoerer dschwoerer commented Mar 18, 2025

This allows to replace f.yup()[i.yp()] with f[i.yp()].

That should allow to simplify some code, and will make more code fci-correct.

dschwoerer and others added 5 commits March 18, 2025 15:32
Useful for Field3D with parallel slices, so we can write `f[i.yp()]`
instead of `f.yup()[i.yp()]`
This is needed wo we can both write `f[i.yp()]` while allowing old code
to call `f.yup()[i.yp()]` as well.
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

@ZedThree
Copy link
Member

ZedThree commented Jun 9, 2025

This is really neat, but my major worry is having any conditionals in operator[] will completely kill vectorisation. What if adding a yoffset to an Ind made a new type, and we add a new overload of operator[]? Then at least most indexing wouldn't have these conditionals in.

@dschwoerer
Copy link
Contributor Author

Adding a new type would require a lot of new code. Do you have some benchmarks that suggest that this a significant overhead?

@ZedThree
Copy link
Member

We did a lot of benchmarking when we first made Region and Ind, and vectorisation was quite temperamental and made a lot of difference to performance. operator[] is going to be at the bottom of a lot of tight loops, so we do have to be careful here

@dschwoerer
Copy link
Contributor Author

Do you have a suggestion for a performance test case?
I do agree we should make sure it does not cause (significant) performance issues ...

@dschwoerer
Copy link
Contributor Author

I just realised with this and Field3DParallel

void floor_all(Field3D &f, BoutReal val) {
  alloc_all(f);
  BOUT_FOR(i, f.getRegion("RGN_ALL")) {
    f[i] = floor(f,val);
    f.yup()[i] = floor(f,val);
    f.ydown()[i] = floor(f,val);
  }
}

can be simplified to

void floor(Field3DParallel &f, BoutReal val) {
  BOUT_FOR(i, f.getRegion("RGN_ALL")) {
    f[i] = floor(f,val);
  }
}

if getRegion contains also shifted indices.
That can then just be

template <class F>
void floor(F &f, BoutReal val) {
  BOUT_FOR(i, f.getRegion("RGN_ALL")) {
    f[i] = floor(f,val);
  }
}

which we probably have already :-)

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

}

inline SpecificInd xp(int dx = 1) const { return {ind + (dx * ny * nz), ny, nz}; }
inline SpecificInd xp(int dx = 1) const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: function 'xp' has inline specifier but is implicitly inlined [readability-redundant-inline-specifier]

Suggested change
inline SpecificInd xp(int dx = 1) const {
SpecificInd xp(int dx = 1) const {

return {ind + (dy * nz), ny, nz};
return {ind + (dy * nz), ny, nz, yoffset + dy};
}
inline SpecificInd yp_no_parallel_shift(int dy = 1) const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: function 'yp_no_parallel_shift' has inline specifier but is implicitly inlined [readability-redundant-inline-specifier]

Suggested change
inline SpecificInd yp_no_parallel_shift(int dy = 1) const {
SpecificInd yp_no_parallel_shift(int dy = 1) const {

// Add region for parallel slices
for (int i = 1; i <= ystart; ++i) {
for (const int offset : {-i, i}) {
const auto region = fmt::format("RGN_YPAR_{:+d}", offset);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: no header providing "fmt::format" is directly included [misc-include-cleaner]

src/mesh/mesh.cxx:12:

- #include "impls/bout/boutmesh.hxx"
+ #include "fmt/core.h"
+ #include "impls/bout/boutmesh.hxx"

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

#include "bout/assert.hxx"
#include "bout/bout_types.hxx"
#include "bout/boutexception.hxx"
#include "bout/field.hxx"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: included header boutexception.hxx is not used directly [misc-include-cleaner]

Suggested change
#include "bout/field.hxx"
#include "bout/field.hxx"

ASSERT2(d.yoffset == yoffset);
}
}
return data[d.ind];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: expected member name or ';' after declaration specifiers [clang-diagnostic-error]

    return data[d.ind];
    ^

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

DirectionTypes directions_in = {YDirectionType::Standard,
ZDirectionType::Average},
std::optional<size_t> region = {});
std::optional<size_t> region = {}, int yoffset = 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: no header providing "size_t" is directly included [misc-include-cleaner]

include/bout/field2d.hxx:26:

- class Field2D;
+ #include <cstddef>
+ class Field2D;

DirectionTypes directions_in = {YDirectionType::Standard,
ZDirectionType::Average},
std::optional<size_t> region = {});
std::optional<size_t> region = {}, int yoffset = 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: no header providing "std::optional" is directly included [misc-include-cleaner]

include/bout/field2d.hxx:26:

- class Field2D;
+ #include <optional>
+ class Field2D;

DirectionTypes directions_in = {YDirectionType::Standard,
ZDirectionType::Standard},
std::optional<size_t> regionID = {});
std::optional<size_t> regionID = {}, int yoffset = 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: no header providing "size_t" is directly included [misc-include-cleaner]

include/bout/field3d.hxx:22:

- class Field3D;
+ #include <cstddef>
+ class Field3D;

return data[d.ind];
}
const BoutReal& operator[](const Ind3D& d) const {
return (*const_cast<Field3D*>(this))[d];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: do not use const_cast to remove const qualifier [cppcoreguidelines-pro-type-const-cast]

    return (*const_cast<Field3D*>(this))[d];
             ^

Options* track(const BoutReal& change, std::string operation);

public:
int yoffset{0};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: member variable 'yoffset' has public visibility [cppcoreguidelines-non-private-member-variables-in-classes]

  int yoffset{0};
      ^


FieldPerp::FieldPerp(Mesh* localmesh, CELL_LOC location_in, int yindex_in,
DirectionTypes directions, std::optional<size_t> UNUSED(regionID))
DirectionTypes directions, std::optional<size_t> UNUSED(regionID),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: no header providing "UNUSED" is directly included [misc-include-cleaner]

src/field/fieldperp.cxx:25:

- #include <bout/boutcomm.hxx>
+ #include "bout/unused.hxx"
+ #include <bout/boutcomm.hxx>


FieldPerp::FieldPerp(Mesh* localmesh, CELL_LOC location_in, int yindex_in,
DirectionTypes directions, std::optional<size_t> UNUSED(regionID))
DirectionTypes directions, std::optional<size_t> UNUSED(regionID),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: no header providing "size_t" is directly included [misc-include-cleaner]

src/field/fieldperp.cxx:29:

+ #include <cstddef>


FieldPerp::FieldPerp(Mesh* localmesh, CELL_LOC location_in, int yindex_in,
DirectionTypes directions, std::optional<size_t> UNUSED(regionID))
DirectionTypes directions, std::optional<size_t> UNUSED(regionID),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: no header providing "std::optional" is directly included [misc-include-cleaner]

src/field/fieldperp.cxx:29:

+ #include <optional>

{
auto indc = ind;
indc.yoffset = 1;
data[stripe_size * ind.ind + static_cast<int>(Offset::Byup)] =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: '*' has higher precedence than '+'; add parentheses to explicitly specify the order of operations [readability-math-missing-parentheses]

Suggested change
data[stripe_size * ind.ind + static_cast<int>(Offset::Byup)] =
data[(stripe_size * ind.ind) + static_cast<int>(Offset::Byup)] = coords->Bxy.yup()[indc];

indc.yoffset = 1;
data[stripe_size * ind.ind + static_cast<int>(Offset::Byup)] =
coords->Bxy.yup()[indc];
indc.yoffset = -1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: '*' has higher precedence than '+'; add parentheses to explicitly specify the order of operations [readability-math-missing-parentheses]

Suggested change
indc.yoffset = -1;
data[(stripe_size * ind.ind) + static_cast<int>(Offset::Bydown)] =

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions


FieldPerp::FieldPerp(Mesh* localmesh, CELL_LOC location_in, int yindex_in,
DirectionTypes directions, std::optional<size_t> UNUSED(regionID))
DirectionTypes directions, std::optional<size_t> UNUSED(regionID),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: no header providing "DirectionTypes" is directly included [misc-include-cleaner]

                     DirectionTypes directions, std::optional<size_t> UNUSED(regionID),
                     ^

{
auto indc = ind;
indc.yoffset = 1;
data[stripe_size * ind.ind + static_cast<int>(Offset::Byup)] =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: '*' has higher precedence than '+'; add parentheses to explicitly specify the order of operations [readability-math-missing-parentheses]

Suggested change
data[stripe_size * ind.ind + static_cast<int>(Offset::Byup)] =
data[(stripe_size * ind.ind) + static_cast<int>(Offset::Byup)] =

data[stripe_size * ind.ind + static_cast<int>(Offset::Byup)] =
coords->Bxy.yup()[indc];
indc.yoffset = -1;
data[stripe_size * ind.ind + static_cast<int>(Offset::Bydown)] =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: '*' has higher precedence than '+'; add parentheses to explicitly specify the order of operations [readability-math-missing-parentheses]

Suggested change
data[stripe_size * ind.ind + static_cast<int>(Offset::Bydown)] =
data[(stripe_size * ind.ind) + static_cast<int>(Offset::Bydown)] =

@tomc271
Copy link
Collaborator

tomc271 commented Aug 20, 2025

Profiled with hyperfine:
(Minimum time from 50 runs)

Unit tests in Release with CHECKS=0:

Test Min time (s) % of next
next 1.03 100%
yup-ydown-auto 1.05 102%
master 1.10 107%

Unit tests in Debug with CHECKS=2:

Test Min time (s) % of next
next 5.40 100%
yup-ydown-auto 5.53 102%
master 5.68 105%

test-drift-instability integration test in Release with CHECKS=0:

Test Min time (s) % of next
next 1.26 100%
yup-ydown-auto 1.36 108%
master 1.19 95%

test-drift-instability integration test in Debug with CHECKS=2:

Test Min time (s) % of next
next 11.08 100%
yup-ydown-auto 12.88 116%
master 10.81 98%

@dschwoerer
Copy link
Contributor Author

Amazing, thanks @tomc271! So it seems there is quite some overhead (8%) without checks. Looking at the absolute numbers for CHECK=2, I would argue that the 16% is perfectly fine.

I am slightly worried about the 8% - that is probably not great.
So probably worth to hold of the merging, and see whether we/I can come up with a better implementation. Templates could be used, then some of the checks could be done at compile time, but that requires more time to think about it ...

It would make the code however more readable, which also a benefit for non-FCI users. Maybe we could make make a compile time switch to enable parallel slices, so that users that do not use it, can keep it disabled. However, then we either cannot start using this feature and clean up the code, or we are kind of forced to stay with it, even if we cannot come up with a faster implementation ...


BoutReal& operator[](const Ind3D& d) { return data[d.ind]; }
const BoutReal& operator[](const Ind3D& d) const { return data[d.ind]; }
BoutReal& operator[](const Ind3D& d) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has been a long time since I've looked at this code but could this be return ynext(d.yoffset - yoffset)[d]? One could then still include some of the consistency checking but it would move some of the logic into CHECK regions that disappear in production runs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return ynext(d.yoffset - yoffset)[d] would always call ynext - even if this already the right field, and thus would fail.

The issue is that both the main field, which contains parallel fields, as well as the parallel fields, are of type Field3D.

We could (together with PR #3149) only enable this for Field3DParallel. There might be subtle bugs, where it depends on the type (Field3DParallel/Field3D) on whether it uses the correct indexing. But we could add those with higher CHECK level, so that should be fine.

return ynext(d.yoffset)[d];
}
#if CHECK >= 2
if (isFci()) { // We probably should assert here that this is field aligned
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be if ( ! hasParallelSlices() && isFci()) and placed before the return line so the return doesn't need to be guarded?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would still need to be guarded - because for FA there are no parallel slices, and thus we need to use the return at the end of the function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants