direct gather routine for single vector field #5776

Merged

JustinRayAngus merged 24 commits into BLAST-WarpX:development from JustinRayAngus:do_gather_B

Mar 25, 2025

+299 −204

Contributor

JustinRayAngus commented Mar 17, 2025

A direct gather routine for B is used in doGatherPicnic() (only used by implicit solver) and will also be used in the mass matrix deposit routine in PR #5768. Making a separate routine for a direct Gather of the B-field will reduce code duplication.

This PR adds a direct gather routine for a single vector field, which can be used for E or B and accounts for the galerkin_interpolation flag in parallel and perpendicular directions.

@RemiLehe This PR also seems to overlap with PR #5673.

JustinRayAngus added the component: implicit solvers label

JustinRayAngus assigned dpgrote and RemiLehe

JustinRayAngus force-pushed the do_gather_B branch from 419ded3 to d615692 Compare

March 18, 2025 15:44

Member

dpgrote commented Mar 18, 2025

This routine seems to duplicate what is in doGatherShapeN (except for the Galerkin factor) which is doing direct deposition. It looks like a single kernel could be written that gathers the three field components that would be called separately for E and B. It would need parallel and perpendicular Galerkin factors for the E and B.

JustinRayAngus force-pushed the do_gather_B branch from dac35c8 to f8fa09b Compare

March 18, 2025 21:14

Contributor Author

JustinRayAngus commented Mar 18, 2025

> This routine seems to duplicate what is in doGatherShapeN (except for the Galerkin factor) which is doing direct deposition. It looks like a single kernel could be written that gathers the three field components that would be called separately for E and B. It would need parallel and perpendicular Galerkin factors for the E and B.

I noticed that too. Originally, this routine was optimized for the implicit solver where a Yee grid is assumed. Now that I have made the routine more generalized (and less optimized), it is now basically the same as doGatherShapeN for a single vector field without the galerkin_interpolation flag. There is no need to use that flag for B. What do you suggest we do?

JustinRayAngus force-pushed the do_gather_B branch from f8fa09b to 6346f6d Compare

March 19, 2025 15:31

Contributor Author

JustinRayAngus commented Mar 19, 2025

@dpgrote I've generalized the routine for an arbitrary vector field F with separate depos_order template parameters for parallel and perpendicular gathers.
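
A minimal sketch of what separate parallel and perpendicular orders can look like, not the code from this PR. It gathers one component of a field on a 2D nodal grid and ignores grid staggering, RZ geometry and GPU annotations. The names `compute_shape` and `gather_component`, the flat `std::vector` storage and the choice of x as the parallel direction are assumptions made for illustration (C++17).

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: B-spline shape factors of the given order at position
// x (in units of the cell size, nodal grid). Returns the leftmost grid index.
template <int order>
int compute_shape (double x, std::array<double, order + 1>& s)
{
    static_assert(order >= 0 && order <= 2, "sketch supports orders 0 to 2");
    if constexpr (order == 0) {
        s[0] = 1.0;
        return static_cast<int>(std::floor(x + 0.5));
    } else if constexpr (order == 1) {
        const int j = static_cast<int>(std::floor(x));
        const double d = x - j;
        s[0] = 1.0 - d;
        s[1] = d;
        return j;
    } else {
        const int j = static_cast<int>(std::floor(x + 0.5));
        const double d = x - j;
        s[0] = 0.5 * (0.5 - d) * (0.5 - d);
        s[1] = 0.75 - d * d;
        s[2] = 0.5 * (0.5 + d) * (0.5 + d);
        return j - 1;
    }
}

// Gather one component of F (nx points per row, x fastest) at (x, y).
// order_para is used along x, assumed here to be the direction parallel to
// the component; order_perp is used along y.
template <int order_para, int order_perp>
double gather_component (const std::vector<double>& F, int nx, double x, double y)
{
    std::array<double, order_para + 1> sx{};
    std::array<double, order_perp + 1> sy{};
    const int i0 = compute_shape<order_para>(x, sx);
    const int j0 = compute_shape<order_perp>(y, sy);
    const int ny = static_cast<int>(F.size()) / nx;
    assert(i0 >= 0 && i0 + order_para < nx);
    assert(j0 >= 0 && j0 + order_perp < ny);

    double Fp = 0.0;
    for (int j = 0; j <= order_perp; ++j) {
        for (int i = 0; i <= order_para; ++i) {
            Fp += sx[i] * sy[j] * F[(j0 + j) * nx + (i0 + i)];
        }
    }
    return Fp;
}
```

With this shape, a caller that wants the parallel order lowered by one would instantiate, for example, `gather_component<0, 1>`, while a caller that wants the same order in both directions would use `gather_component<1, 1>`.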

Contributor Author

JustinRayAngus commented Mar 19, 2025

FYI. The file diff is very misleading now. I added the function doGatherVectorFieldShapeN() above doGatherShapeN() (which is required) and then modified the latter to call the former. However, the file diff recognizes this as many separate changes to the doGatherShapeN() routine rather than one green block for doGatherVectorFieldShapeN() and then another few red/green blocks showing the modifications to doGatherShapeN().

JustinRayAngus added the cleaning and component: core labels

JustinRayAngus assigned dpgrote, ax3l and RemiLehe and unassigned dpgrote and RemiLehe

JustinRayAngus changed the title ~~direct gather routine for magnetic field~~ direct gather routine for single vector field

dpgrote reviewed

Source/Particles/Gather/FieldGather.H Outdated

               AMREX_GPU_HOST_DEVICE AMREX_FORCE_INLINE
-              void doGatherShapeN ([[maybe_unused]] const amrex::ParticleReal xp,
+              void doGatherVectorFieldShapeN (

Member

dpgrote Mar 20, 2025

To be more precise, can this name be changed to doDirectGatherVectorFieldShapeN? The same goes for doGatherShapeN: change it to doDirectGatherShapeN.

Contributor Author

JustinRayAngus Mar 20, 2025

I'm ok with renaming the new function, but I'm hesitant to change doGatherShapeN. Seems there are a lot of functions that start with that, so you could argue that all of them should change as well. That seems like something for a separate PR.

Also, why have ShapeN in the function names? I feel like it adds no value here.

dpgrote reviewed

Source/Particles/Gather/FieldGather.H Outdated

-                                  ex_arr(lo.x+j_ex+ix, lo.y+k_ex+iy, lo.z+l_ex+iz);
+                  // Gather vector field F
+                  amrex::Real weight;
+                  for (int i=0; i<=depos_order_para; i++) {

Member

dpgrote Mar 20, 2025

For optimization, the loop ordering should be swapped, with i the innermost loop. The same for all of the nested loops.

Contributor Author

JustinRayAngus Mar 20, 2025

I didn't know this. The Villasenor deposition routine also uses the bad ordering. Should that be changed as well? Separate PR of course.
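
For context, a small sketch of the loop ordering being suggested, assuming the field is stored with x as the fastest-varying index. `Field3D` and `gather_i_innermost` are hypothetical stand-ins, not WarpX or AMReX types. With `i` innermost, consecutive iterations read adjacent memory, and the `sy*sz` product can be hoisted out of the inner loop.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 3D field array with x as the fastest index.
struct Field3D {
    int nx, ny, nz;
    std::vector<double> data;

    double operator() (int i, int j, int k) const {
        assert(i >= 0 && i < nx && j >= 0 && j < ny && k >= 0 && k < nz);
        const std::size_t idx = static_cast<std::size_t>(i)
            + static_cast<std::size_t>(nx)
            * (static_cast<std::size_t>(j) + static_cast<std::size_t>(ny) * k);
        return data[idx];
    }
};

// Gather with k outermost and i innermost, so the inner loop walks
// contiguous memory. (i0, j0, k0) is the lowest index of the stencil.
template <int order>
double gather_i_innermost (const Field3D& F, int i0, int j0, int k0,
                           const std::array<double, order + 1>& sx,
                           const std::array<double, order + 1>& sy,
                           const std::array<double, order + 1>& sz)
{
    double Fp = 0.0;
    for (int k = 0; k <= order; ++k) {
        for (int j = 0; j <= order; ++j) {
            const double w_jk = sy[j] * sz[k];  // independent of i, computed once
            for (int i = 0; i <= order; ++i) {
                Fp += sx[i] * w_jk * F(i0 + i, j0 + j, k0 + k);
            }
        }
    }
    return Fp;
}
```

The result is the same as with `i` outermost; only the memory access pattern changes.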

dpgrote reviewed

Source/Particles/Gather/FieldGather.H Outdated

                       }
                   }
+              #if defined(WARPX_DIM_RZ)
+                  // Convert Fxp and Fyp (which are actually Fr and Fth) to Fx and Fy

Member

dpgrote Mar 20, 2025

This routine adds to previously calculated fields, so for RZ this transformation must be applied only to the fields calculated in this routine. The fields need to be accumulated into temporaries, and the transform applied to those before they are added to the total fields. You can see this in the original `doGatherShapeN`. Also, note that there the XZ and RZ cases are fully separated.

Contributor Author

JustinRayAngus Mar 20, 2025

I think this routine does what you describe. It is also identical to how it is done in doGatherShapeNEsirkepovStencilImplicit() and in doGatherPicnicShapeN().

Either way, I made a small change that I think makes it clearer.

Contributor Author

JustinRayAngus Mar 21, 2025

Discussed offline. Fixed now.

dpgrote reviewed

Source/Particles/Gather/FieldGather.H Outdated

JustinRayAngus force-pushed the do_gather_B branch 4 times, most recently from 285aec2 to 13e20d6 Compare

March 24, 2025 21:29

Contributor Author

JustinRayAngus commented Mar 24, 2025

Documenting recent commits. To reduce code duplication, we originally changed doGatherShapeN() to call the new doDirectGatherSingleField() once for E and once for B. @dpgrote and I ran timing tests on CPU and GPU (Lassen and Perlmutter) using the 3D uniform_plasma input deck. There was no noticeable difference on CPU, but the GatherAndPush timings on GPU increased by 36-50%. The increase seems to be because the shape factors in each direction are now computed twice, once per call.

In order to not affect timing associated with GatherAndPush, doGatherShapeN() is restored to its original form and the new routine is only called by the implicit routines as needed. However, this means that there is a fair amount of duplicate code.
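
A toy 1D illustration of the trade-off described above, not the WarpX routines. It assumes E and B are collocated on a nodal grid and uses linear shape factors. `linear_shape`, `gather_single`, `gather_fused` and the global counter are hypothetical names that exist only to count shape-factor evaluations.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

static int n_shape_evals = 0;  // illustration only: counts shape-factor evaluations

// Linear shape factors at position x (in cell units). Returns the left index.
inline int linear_shape (double x, std::array<double, 2>& s)
{
    ++n_shape_evals;
    const int j = static_cast<int>(std::floor(x));
    s[0] = 1.0 - (x - j);
    s[1] = x - j;
    return j;
}

// Single-field gather: computes its own shape factors on every call.
inline double gather_single (const std::vector<double>& F, double x)
{
    std::array<double, 2> s{};
    const int j = linear_shape(x, s);
    assert(j >= 0 && j + 1 < static_cast<int>(F.size()));
    return s[0] * F[j] + s[1] * F[j + 1];
}

struct EB { double E, B; };

// Fused gather: shape factors are computed once and reused for E and B.
inline EB gather_fused (const std::vector<double>& E,
                        const std::vector<double>& B, double x)
{
    std::array<double, 2> s{};
    const int j = linear_shape(x, s);
    assert(j >= 0 && j + 1 < static_cast<int>(E.size()));
    assert(j + 1 < static_cast<int>(B.size()));
    return { s[0] * E[j] + s[1] * E[j + 1], s[0] * B[j] + s[1] * B[j + 1] };
}
```

Both paths return the same values. Calling `gather_single` once for E and once for B evaluates the shape factors twice per particle, whereas `gather_fused` evaluates them once, at the cost of keeping two similar routines.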

JustinRayAngus requested a review from dpgrote

March 25, 2025 00:14

dpgrote reviewed

Source/Particles/Gather/FieldGather.H Outdated

+                  }
+                  // Convert Frp and Fthp to Fxp and Fyp
+                  Fxp = costheta*Frp  - sintheta*Fthp;

Member

dpgrote Mar 25, 2025

These should be += to be consistent with the fields being accumulated.
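
A minimal sketch of the pattern under discussion, assuming the cylindrical components gathered in this call have already been summed into temporaries `Frp` and `Fthp`. The `Vec3` struct, the function name and the handling of `rp == 0` are illustrative assumptions, not the merged code.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

// Rotate this call's cylindrical contribution (Frp, Fthp, Fzp) into
// Cartesian components at particle position (xp, yp) and ADD it to the
// running total Fp. Using += leaves whatever was already in Fp untouched.
inline void add_cylindrical_contribution (Vec3& Fp,
                                          double Frp, double Fthp, double Fzp,
                                          double xp, double yp)
{
    assert(std::isfinite(xp) && std::isfinite(yp));
    const double rp = std::sqrt(xp * xp + yp * yp);
    // Assumed convention on the axis: theta = 0 when rp == 0
    const double costheta = (rp > 0.0) ? xp / rp : 1.0;
    const double sintheta = (rp > 0.0) ? yp / rp : 0.0;

    Fp.x += costheta * Frp - sintheta * Fthp;
    Fp.y += sintheta * Frp + costheta * Fthp;
    Fp.z += Fzp;
}
```

Because only the temporaries are rotated, any contribution already stored in `Fp` (for example from an earlier gather) is neither rotated again nor overwritten.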

dpgrote reviewed

Source/Particles/Gather/FieldGather.H

@@ -495,22 +753,10 @@ void doGatherShapeNEsirkepovStencilImplicit (
               #if defined(WARPX_DIM_RZ)
                   amrex::Real const xp_new = xp_np1;
                   amrex::Real const yp_new = yp_np1;
-                  amrex::Real const xp_mid = xp_nph;

Member

dpgrote Mar 25, 2025

The changes in this routine are unrelated to the main topic of the PR so should be reverted to keep the PR minimal.

Contributor Author

JustinRayAngus Mar 25, 2025

Since it is just a small change and makes things consistent with the cleaner implementation of the new routine, I would prefer to keep it.

dpgrote reviewed

Source/Particles/Gather/FieldGather.H

		@@ -15,6 +15,264 @@

		#include <AMReX.H>

		/**

Member

dpgrote Mar 25, 2025

Since this is a somewhat special purpose routine, maybe move this further down in the file, perhaps to just before doGatherPicnicShapeN.

Contributor Author

JustinRayAngus Mar 25, 2025

If we decide to use the new routine in doGatherShapeN, then it has to be before that routine. If I move it and someone wants to test out the timings again by calling it from doGatherShapeN, then they would have to move it back to where it is now.

JustinRayAngus added 24 commits

March 25, 2025 10:12


          created direct gather function for magnetic field.

122d100


          adding n_rz_azimuthal_modes

e45aee3


          using more general notation.

7258c21


          minor cleanup.

384896f


          passing B_type to doGatherBShapeN()

57e0a62


          generalized doGatherB to account for type (CELL or NODE)

e04cc0d


          removed maybe_unused

e600699


          bug fix. else if ==> if.

8a857db


          generalized doGatherBShapeN==>doGatherVectorFieldShapeN for arbitrary vector field F with different shape factors in perpendicular and parallel directions

753c923


          using doGatherVectorFieldShapeN() in doGatherShapeN().

b5b6ef9


          remove maybe_unused lines.

92caf98


          removed unused overloaded doGatherShapeN().

d134568


          optimize loop order.

c342cd7


          cleaner costheta and sintheta

bc14bb8


          clearer transformation from r-th to x-y

3e430a4


          cleaner costheta and sintheta for other functions.

641c203


          routine name change.

f092843


          using original RZ implementation.

204eca6


          bug fix.

aea8b2e


          fixed sign bug in xy

b6f33df


          fix comments. clean up.

60b90fc


          consistent type.

7ae466b


          restoring original doGatherShapeN. New routine only used by implicit solvers.

f7014e1


          += bug fix for converting from Fr,Fth to Fx,Fy.

JustinRayAngus force-pushed the do_gather_B branch from bdcae30 to 3296277 Compare

March 25, 2025 17:12

dpgrote approved these changes

Member

dpgrote left a comment

Looks good to me!

JustinRayAngus merged commit 404b444 into BLAST-WarpX:development

36 checks passed

Labels

cleaning component: core component: implicit solvers