Pvcode + Cvode improvements #2889
base: next
dschwoerer
commented
Mar 19, 2024
- Exposes more options to the user.
- Allows dumping debug output if the solver fails. This can be useful for figuring out why the solver is slow. Currently this is only implemented for pvode, but it should also be possible for cvode (where we can use a documented API).
Use the new track feature (better name required) to dump the different components of the ddt() as well as the residual for the evolved fields.
This keeps track of all the changes made to the field and stores them in an Options object.
| #undef FIELD_FUNC | ||
|
|
||
| template <typename T, typename = bout::utils::EnableIfField<T>, class... Types> | ||
| inline T setName(T&& f, const std::string& name, Types... args) { |
I think most other setters like this are member functions
To make this a member function, make it virtual:
field.hxx:
virtual Field& setName(std::string name) {
  this->name = std::move(name);
  return *this;
}
field2d.hxx:
Field2D& setName(std::string name) override {
  Field::setName(std::move(name));
  return *this;
}
Formatting will have to be done at the calling site, but I think that's fine
I would really love for this not to be done on the caller side, because then the formatting can easily be disabled and the overhead disappears. If the formatting is done by the caller, every call needs its own #if guard to avoid the overhead of name tracking.
from https://en.cppreference.com/w/cpp/language/member_template
A member function template cannot be virtual, and a member function template in a derived class cannot override a virtual member function from the base class.
So I guess we would be stuck with not having the member function on the base class, but only on the derived class.
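A minimal sketch of that option, with the member function template living on the derived class only. `FieldBase`, `SketchField3D`, `SKETCH_USE_TRACK` and `namedGradPar` are hypothetical stand-ins for illustration, not the BOUT++ classes, and plain stream concatenation replaces `fmt::format` so the example only needs the standard library (C++17):

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical compile-time switch standing in for BOUT_USE_TRACK.
#ifndef SKETCH_USE_TRACK
#define SKETCH_USE_TRACK 1
#endif

// Stand-in for the Field base class: it only holds the name.
struct FieldBase {
  std::string name;
};

// Stand-in for a derived field type such as Field3D.
struct SketchField3D : FieldBase {
  // A member function template cannot be virtual, so it lives on the derived
  // class and returns the derived type, which keeps chained calls typed.
  template <class... Parts>
  SketchField3D& setName(const Parts&... parts) {
#if SKETCH_USE_TRACK
    std::ostringstream joined;
    (joined << ... << parts); // C++17 fold expression: stream every part
    name = joined.str();
#else
    // Tracking disabled: the parts are never formatted.
    ((void)parts, ...);
#endif
    return *this;
  }
};

// Call site in the style of the Grad_par change: the name is built inside
// setName, so no #if guard is needed here.
inline SketchField3D namedGradPar(const SketchField3D& var) {
  SketchField3D result;
  return result.setName("Grad_par(", var.name, ")");
}

// Small usage example; only meaningful when tracking is compiled in.
inline void exampleUsage() {
  SketchField3D n;
  n.setName("n");
#if SKETCH_USE_TRACK
  assert(namedGradPar(n).name == "Grad_par(n)");
#endif
}
```

With the switch set to 0 the call sites stay unchanged and the string building is compiled out, which is the property asked for above.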
|
My worry here is the tracking storing a lot of data if it's storing the full |
clang-tidy made some suggestions
It could be, but I don't think it is needed. I would personally keep it enabled for production runs, as it has a runtime switch that is disabled by default, and I would also keep it like this, at least by default. If a simulation crashes, it can be very useful to see immediately why it failed (this is currently the only use of the name tracking). Having to restart and queue again just to get back to the point where it crashes can be very time consuming, and the cost should be really negligible if disabled. I did notice a slowdown in my simulations with this enabled, but that was when it was enabled all the time and the names did not get reset, so the names grew exponentially.
|
On 3/19/24 16:34, Peter Hill wrote:
***@***.**** commented on this pull request.
------------------------------------------------------------------------
In include/bout/field.hxx
<#2889 (comment)>:
> @@ -683,4 +683,12 @@ inline T floor(const T& var, BoutReal f, const std::string& rgn = "RGN_ALL") {
#undef FIELD_FUNC
+template <typename T, typename = bout::utils::EnableIfField<T>, class... Types>
+inline T setName(T&& f, const std::string& name, Types... args) {
I think most other setters like this are member functions
But then they return either Field, or need to be overwritten.
|
src/solver/impls/pvode/pvode.cxx
Outdated
| // Check return flag | ||
| if (flag != SUCCESS) { | ||
| output_error.write("ERROR CVODE step failed, flag = {:d}\n", flag); | ||
| CVodeMemRec* cv_mem = (CVodeMem)cvode_mem; |
warning: do not use C-style cast to convert between unrelated types [cppcoreguidelines-pro-type-cstyle-cast]
CVodeMemRec* cv_mem = (CVodeMem)cvode_mem;
^|
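One way to address this kind of warning is a named cast. The sketch below assumes the handle is an opaque `void*`, as is common in C-style solver APIs; `SolverMemRec`, `OpaqueHandle` and `asSolverMem` are hypothetical stand-ins, not the PVODE types:

```cpp
#include <cassert>

// Hypothetical stand-in for the solver's internal state record.
struct SolverMemRec {
  int last_order;
  double last_step;
};

// Opaque handle type, as C-style solver APIs typically expose it.
using OpaqueHandle = void*;

// Recover the typed pointer with a named cast instead of a C-style cast.
// static_cast is sufficient for void* -> T*, provided the handle really
// points at a SolverMemRec.
inline SolverMemRec* asSolverMem(OpaqueHandle handle) {
  assert(handle != nullptr);
  return static_cast<SolverMemRec*>(handle);
}
```

If the handle had a different, unrelated pointer type, `reinterpret_cast` would be needed instead, and that has its own clang-tidy check.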
The cuda build fails with:
The relevant code is:
(*tracking)[outname].setAttributes({
    {"operation", operation},
#if BOUT_USE_TRACK
    {"rhs.name", change.name},
#endif
});
Is gcc9.4 not able to use setAttributes?
We do try to completely remove debugging features for production runs, see
This is true, but I would really like to understand the costs. It looks like it's storing potentially hundreds or thousands of copies of full fields, so although it looks like a really interesting and useful feature, it would be good to have a good idea of the costs.
Is that with the current form?
It should almost certainly return |
If at all, I would be in favor of adding a new configure flag. I use this in runs with CHECK=0.
It only stores fields if the solver has failed. Unless that happens, no additional storage is used. If the solver crashes, I think the cost is really negligible: this is only done once, when the solver has decided it cannot continue. For hermes-2, the dump file contains 53 But if BOUT++ runs out of memory here, it would not be much worse than if it did not - the simulation finishes anyway.
No, currently it is not noticeable. I activate it for all my runs.
That is exactly what I tried:
diff --git a/include/bout/field.hxx b/include/bout/field.hxx
index 04035f5b7..488aecaab 100644
--- a/include/bout/field.hxx
+++ b/include/bout/field.hxx
@@ -84,6 +84,14 @@ public:
return *this;
}
+ template <class... Types>
+ inline Field& setName(const std::string& name, Types... args) {
+#if BOUT_USE_TRACK
+ this->name = fmt::format(name, args...);
+#endif
+ return *this;
+ }
+
std::string name;
#if CHECK > 0
@@ -683,19 +691,4 @@ inline T floor(const T& var, BoutReal f, const std::string& rgn = "RGN_ALL") {
#undef FIELD_FUNC
-template <typename T, typename = bout::utils::EnableIfField<T>, class... Types>
-inline void setName(T& f, const std::string& name, Types... args) {
-#if BOUT_USE_TRACK
- f.name = fmt::format(name, args...);
-#endif
-}
-
-template <typename T, typename = bout::utils::EnableIfField<T>, class... Types>
-inline T setName(T&& f, const std::string& name, Types... args) {
-#if BOUT_USE_TRACK
- f.name = fmt::format(name, args...);
-#endif
- return f;
-}
-
#endif /* FIELD_H */
diff --git a/src/mesh/coordinates.cxx b/src/mesh/coordinates.cxx
index 32774d622..643e148b1 100644
--- a/src/mesh/coordinates.cxx
+++ b/src/mesh/coordinates.cxx
@@ -1542,7 +1542,7 @@ Field3D Coordinates::Grad_par(const Field3D& var, CELL_LOC outloc,
TRACE("Coordinates::Grad_par( Field3D )");
ASSERT1(location == outloc || outloc == CELL_DEFAULT);
- return setName(::DDY(var, outloc, method) * invSg(), "Grad_par({:s})", var.name);
+ return (::DDY(var, outloc, method) * invSg()).setName("Grad_par({:s})", var.name);
}
/////////////////////////////////////////////////////////
@@ -1601,7 +1601,7 @@ Field3D Coordinates::Div_par(const Field3D& f, CELL_LOC outloc,
f_B.yup(i) = f.yup(i) / Bxy_floc.yup(i);
f_B.ydown(i) = f.ydown(i) / Bxy_floc.ydown(i);
}
- return setName(Bxy * Grad_par(f_B, outloc, method), "Div_par({:s})", f.name);
+ return (Bxy * Grad_par(f_B, outloc, method)).setName("Div_par({:s})", f.name);
}
/////////////////////////////////////////////////////////
diff --git a/src/solver/impls/pvode/pvode.cxx b/src/solver/impls/pvode/pvode.cxx
index db28f64d8..c36985ae0 100644
--- a/src/solver/impls/pvode/pvode.cxx
+++ b/src/solver/impls/pvode/pvode.cxx
@@ -376,7 +376,7 @@ BoutReal PvodeSolver::run(BoutReal tout) {
for (auto& f : f3d) {
f.F_var->enableTracking(fmt::format("ddt_{:s}", f.name), debug);
- setName(*f.var, f.name);
+ f.var->setName(f.name);
}
run_rhs(simtime);
gcc14 complains with:
gcc 9.4 is unable to correctly parse the construction of the function argument if change.name is used directly. Making a copy first seems to work around that issue.
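A sketch of what copying the value first can look like. `Change`, `Attributes` and `makeAttributes` are hypothetical stand-ins (a plain `std::map` replaces the argument type of `setAttributes`), so this only shows the shape of the workaround, not the actual fix in the branch:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins; the real types in the branch may differ.
struct Change {
  std::string name;
};
using Attributes = std::map<std::string, std::string>;

inline Attributes makeAttributes(const std::string& operation, const Change& change) {
  // Copy the member into a local first, then use the local inside the
  // braced initialiser instead of writing change.name there directly.
  const std::string rhs_name = change.name;
  return Attributes{
      {"operation", operation},
      {"rhs.name", rhs_name},
  };
}
```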
|
@dschwoerer After discussing this with @bendudson, we're quite concerned about the potential overhead of the temporary field dumping mechanism, both in terms of memory and performance, for what seems like a small benefit when debugging. We'd really like to see some performance analysis and scaling to confirm that it doesn't have an effect when disabled. I'm also worried that we'd need to see Could something similar not be achieved with a custom monitor instead? Exposing more CVODE options to the user seems very straightforward and useful, so maybe that could be split out?
But the memory overhead is only if the solver fails. Do you still think at that point that is an issue?
I can certainly run blob2d for this branch and next. But I am certain there are no significant differences, which is why I haven't done it. Would you like to see something else? Would just the BOUT++ internal timings be sufficient? Would you like MPI runs?
Sure, having them makes things more easily readable. But that is optional, and I can certainly add some if you think this is worth merging and wanted before merging. It would make
I would not know how. If you can explain how to, I am happy to change the design.
I can split them out 👍 |
ZedThree
left a comment
I'm still a bit wary of this because it feels like this whole other system that needs to be everywhere for it to be useful, and the current implementation is a nice demonstration of it for a single, limited purpose.
If we're going to have it and start using it, then we definitely need some docs in the debugging section, more docstrings in the code, a bit more polish on the API, and some tests.
I'd also still really like to see some proper benchmarks comparing next and this branch.
|
|
||
| int tracking_state{0}; | ||
| Options* tracking{nullptr}; | ||
| std::string selfname; |
Do we need this if we already have name?
I have added:
// name is changed if we assign to the variable, while selfname is a
// non-changing copy that is used for the variable names in the dump files
|
|
||
| Field3D& Field3D::operator=(Field3D&& rhs) { | ||
| TRACE("Field3D: Assignment from Field3D"); | ||
| track(rhs, "operator="); |
I really think this should be a macro that is only enabled at higher CHECK
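For illustration, such a macro could look roughly like this. `SKETCH_CHECK_LEVEL`, `SKETCH_TRACK` and `SketchField` are hypothetical names, and the threshold of 2 is an arbitrary assumption rather than an agreed level:

```cpp
#include <cassert>
#include <string>

// Hypothetical check level standing in for BOUT++'s CHECK setting.
#ifndef SKETCH_CHECK_LEVEL
#define SKETCH_CHECK_LEVEL 0
#endif

// The threshold (> 2) is an arbitrary choice for this sketch.
#if SKETCH_CHECK_LEVEL > 2
#define SKETCH_TRACK(field, rhs, op) (field).track((rhs), (op))
#else
// Expands to a no-op: the arguments are not evaluated at all.
#define SKETCH_TRACK(field, rhs, op) ((void)0)
#endif

// Stand-in field type that only counts how often track() ran.
struct SketchField {
  int track_calls = 0;
  double value = 0.0;

  void track(const SketchField& /*rhs*/, const std::string& /*operation*/) {
    ++track_calls;
  }

  SketchField& assign(const SketchField& rhs) {
    SKETCH_TRACK(*this, rhs, "operator=");
    value = rhs.value;
    return *this;
  }
};
```

At the default level the assignment compiles to the plain copy, so the tracking call costs nothing unless the build opts in.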
include/bout/options.hxx
Outdated
| std::string toString(const Options& value); | ||
|
|
||
| /// Save the parallel fields | ||
| void saveParallel(Options& opt, const std::string name, const Field3D& tosave); |
Could this be a method rather than a free function? Then it can be called like options[name].assignParallel(value)
I do not think that works, as we assign to name as well as f"{name}_y+1" etc ...
Would options.assignParallel(name, value) be better?
|
Exposing the additional PVODE options is just obvious and should probably just be its own PR as well
|
@ZedThree just wanted to chime in and say that having this capability available would be extremely useful for me in optimising Hermes-3. I had already played with it quite a bit, but unfortunately I had to put it on the back burner for now because of the DLS paper and other work. I fully intend to do proper testing of this to optimise Hermes-3 performance sometime in Jan-Feb, or earlier if Michael Hardman can have a go. I understand that there could be performance concerns - just wanted to say that this is much more than a small debugging benefit for me. |
…vcode-cvode-improvements
* switch to std::weak_ptr
* reorder branches in pvode debug
* make track a trivial inline function
* ensure _track template is only used for fields
dschwoerer
left a comment
Thanks a lot for your review @ZedThree
I forgot that this was pending on me :-(
I have addressed everything and this should be ready for re-review. Again, sorry for the delay.
| dump_at_time((*options)["dump_at_time"] | ||
| .doc("Dump debug info about the simulation") | ||
| .withDefault(-1)) {} |
Yes, I found this useful. But certainly less useful than the pvode changes.
But as euler is mostly a toy solver anyway, I thought I would add it.
It allows dumping everything after a given number of steps.
I used this for testing why two different implementations behave differently, and for that I needed to dump after 0 or 1 time steps. But it is likely not as generally useful as the changes to pvode. I can remove the changes if you would like to keep euler clean.
…vcode-cvode-improvements
…oject/BOUT-dev into pvcode-cvode-improvements
| const size_t numberParallelSlices = | ||
| tosave.hasParallelSlices() ? 0 : tosave.getMesh()->ystart; | ||
| for (size_t i0 = 1; i0 <= numberParallelSlices; ++i0) { | ||
| for (int i : {i0, -i0}) { |
warning: narrowing conversion from 'unsigned long' to signed type 'int' is implementation-defined [bugprone-narrowing-conversions]
for (int i : {i0, -i0}) {
^
🤷 Unfortunately there is no nice way to avoid this ...
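One possible way around the narrowing, assuming the slice count is available as (or can safely be converted to) a signed `int`, is to keep the loop counter signed so that `-i0` is an ordinary negative value. `sliceOffsets` is a hypothetical helper for illustration only:

```cpp
#include <cassert>
#include <initializer_list>
#include <vector>

// Return the parallel-slice offsets +1, -1, +2, -2, ... up to +/- count.
// With a signed counter, the initializer list {i0, -i0} holds plain ints,
// so no unsigned-to-signed narrowing takes place.
inline std::vector<int> sliceOffsets(int count) {
  assert(count >= 0);
  std::vector<int> offsets;
  for (int i0 = 1; i0 <= count; ++i0) {
    for (int i : {i0, -i0}) {
      offsets.push_back(i);
    }
  }
  return offsets;
}
```

With an unsigned counter, `-i0` wraps around to a large positive value before it is converted back to `int`, which is what the check flags.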
|
That took me much longer than it should have, sorry. I hope this is now good to review / merge. Some comments that were (maybe) still open:
debug_on_failure has roughly no overhead if it is not used right now, so it could always be enabled if the user is interested in that feature. What does add some overhead is the tracking feature. It not only makes the debug files more useful, but also makes other debugging much easier, since the name gives you an idea of which field you are looking at.