Skip to content

Conversation

@erikvansebille
Copy link
Member

@erikvansebille erikvansebille commented Jul 29, 2025

One way to improve performance in Parcels is to 'vectorize' the kernels: i.e. to not make kernels loop over particles, but to have them act on the entire particles. This PR is an implementation of that approach.

The kernel loop has been completely rewritten, and for Parcels users there will be a few important changes

  1. Since the input of a kernel is a list of Particles (technically now a boolean mask of KernelParticles; but that may change in the refactor of Refactor particleset attributes and particle data for vectorized kernels #2143), users can't use if-statements anymore in their Kernels. Instead, they will need to apply np.where() or masked indexing. But note that even complex Kernels like RK45 with adaptive times-tapping are still possible
  2. The signature of a Kernel may need to change, because a) the input is not one particle but a set of particles and b) time does not have much meaning anymore when all particles could potentially have their own time. See also Reconsider Kernel signature for Parcels v4 #2147
  3. Users and developers may need to be a bit more careful of memory constraints. Since many variables in the Kernel loop are now arrays of length Nparticles, the memory requirements of temporary variables is now much more likely to grow quickly.
  4. The return value of a Kernel does not mean anything anymore, as particles can have different return states. Errors are therefore handled by their state
  5. Particle-particle interaction would potentially be much easier to be implemented, since all particles are available in the kernel (to be explored)

The performance of this PR has been explored in Parcels-code/parcels-benchmarks#2 (comment)

  • Chose the correct base branch (main for v3 changes, v4-dev for v4 changes)

@fluidnumerics-joe
Copy link
Contributor

Quite excited to see the performance diff here!

@erikvansebille
Copy link
Member Author

Quite excited to see the performance diff here!

Check out Parcels-code/parcels-benchmarks#4 (comment)!

@erikvansebille
Copy link
Member Author

Hi @fluidnumerics-joe, could you try to fix the breaking unit test in pytest -k "test_uxstommelgyre_pset_execute"?

It seems related to the fact that x and y in the line below are now arrays (instead of floats)
https://github.com/OceanParcels/Parcels/blob/bef8ee3cfc18c40cb65bb840985ab41e731548b3/parcels/uxgrid.py#L99

ValueError: Points are neither Cartesian (shape N x 3) nor Spherical (shape N x 2). ```python =================================== FAILURES =================================== _______________________ test_uxstommelgyre_pset_execute ________________________
def test_uxstommelgyre_pset_execute():
    ds = datasets_unstructured["stommel_gyre_delaunay"]
    grid = UxGrid(grid=ds.uxgrid, z=ds.coords["nz"])
    U = Field(
        name="U",
        data=ds.U,
        grid=grid,
        mesh_type="spherical",
        interp_method=UXPiecewiseConstantFace,
    )
    V = Field(
        name="V",
        data=ds.V,
        grid=grid,
        mesh_type="spherical",
        interp_method=UXPiecewiseConstantFace,
    )
    P = Field(
        name="P",
        data=ds.p,
        grid=grid,
        mesh_type="spherical",
        interp_method=UXPiecewiseConstantFace,
    )
    UV = VectorField(name="UV", U=U, V=V)
    fieldset = FieldSet([UV, UV.U, UV.V, P])
    pset = ParticleSet(
        fieldset,
        lon=[30.0],
        lat=[5.0],
        depth=[50.0],
        time=[np.timedelta64(0, "s")],
        pclass=Particle,
    )
  pset.execute(
        runtime=np.timedelta64(10, "m"),
        dt=np.timedelta64(60, "s"),
        pyfunc=AdvectionEE,
    )

tests/v4/test_particleset_execute.py:153:


parcels/particleset.py:815: in execute
res = self._kernel.execute(self, endtime=next_time, dt=dt)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
parcels/kernel.py:230: in execute
self.evaluate_pset(pset, endtime)
parcels/kernel.py:303: in evaluate_pset
res_tmp = f(pset, self._fieldset, pset.time_nextloop[0])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
parcels/application_kernels/advection.py:103: in AdvectionEE
(u1, v1) = fieldset.UV[particle]
^^^^^^^^^^^^^^^^^^^^^
parcels/field.py:377: in getitem
return self.eval(key.time, key.depth, key.lat, key.lon, key)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
parcels/field.py:350: in eval
position = self.grid.search(z, y, x, ei=_ei)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
parcels/uxgrid.py:99: in search
face_ids = self.uxgrid.get_faces_containing_point([x, y], return_counts=False)[0]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../anaconda3/envs/parcels-v4/lib/python3.12/site-packages/uxarray/grid/grid.py:2536: in get_faces_containing_point
source_grid=self, points=points_atleast_2d_xyz(points)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


points = array([[30.],
[ 5.]], dtype=float32)

def points_atleast_2d_xyz(points):
    """
    Ensure the input is at least 2D and return Cartesian (x, y, z) coordinates.

    Parameters
    ----------
    points : array_like, shape (N, 2) or (N, 3)
        - If shape is (N, 2), interpreted as [longitude, latitude] in degrees.
        - If shape is (N, 3), interpreted as Cartesian [x, y, z] coordinates.

    Returns
    -------
    points_xyz : ndarray, shape (N, 3)
        Cartesian coordinates [x, y, z] for each input point.

    Raises
    ------
    ValueError
        If `points` (after `np.atleast_2d`) does not have 2 or 3 columns.

    """

    points = np.atleast_2d(points)

    if points.shape[1] == 2:
        points_lonlat_rad = np.deg2rad(points)
        x, y, z = _lonlat_rad_to_xyz(points_lonlat_rad[:, 0], points_lonlat_rad[:, 1])
        points_xyz = np.vstack([x, y, z]).T
    elif points.shape[1] == 3:
        points_xyz = points
    else:
      raise ValueError(
            "Points are neither Cartesian (shape N x 3) nor Spherical (shape N x 2)."
        )

E ValueError: Points are neither Cartesian (shape N x 3) nor Spherical (shape N x 2).

../../anaconda3/envs/parcels-v4/lib/python3.12/site-packages/uxarray/grid/coordinates.py:753: ValueError

</details>

@fluidnumerics-joe
Copy link
Contributor

Taking a look. I think these just need to be column stacked... "just" is a just a famous last word of course..

To handle array of lat,lon the x and y values need to be column stacked.
After making this change, the internal API funciton in uxgrid
(grid.neighbors._triangle_area) fails due to an assumed shape for the
coordinate (being a scalar and not an array). The quick workaround here
is to bring the `_barycentric_coordinates` method into parcels.uxgrid
and leverage the numpy-ified parcels.uxgrid._triangle_area method we
already have defined.

In making this change, since order of operations is different for the
barycentric coordinate calculation, this required updating the rounded
and hashed values in the stommel gyre test. The actual values for the
lat/lon still appear to be ok.
@fluidnumerics-joe
Copy link
Contributor

@erikvansebille - The stommel gyre test now passes - see the commit message in cb702e1 for details

@erikvansebille
Copy link
Member Author

@erikvansebille - The stommel gyre test now passes - see the commit message in cb702e1 for details

Do you want to also cherry-pick these changes into another PR too? Note that this PR may never be merged, if we decide that vectorization is not the way forward for performance; so this code may never get merged. Or is the change in the code not per se an improvement?

@fluidnumerics-joe
Copy link
Contributor

I'd say it's an improvement in the sense we have more control within parcels to improve barycentric performance calculation without having to go through uxarray internal api. I'll make a note to get another PR in this afternoon with just that commit

and also reverting some of the unit test assert changes (incl bac89b6); that are not needed anymore
And cleaning up the existing interpolators
By broadcasting to avoid repeated array access in bcoord computation

Also, adding a note that searching can be much quicker when grid spacing is uniform
@VeckoTheGecko
Copy link
Contributor

(+ updating the description with a summary of changes would be great)

Copy link
Contributor

@VeckoTheGecko VeckoTheGecko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor changes and comments for a future PR, nothing majorly blocking

@erikvansebille erikvansebille merged commit 62a596c into v4-dev Aug 19, 2025
9 checks passed
@github-project-automation github-project-automation bot moved this from Backlog to Done in Parcels development Aug 19, 2025
@erikvansebille erikvansebille deleted the vectorized-kernel branch August 19, 2025 18:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants