@wenzeslaus (Member) commented Jun 11, 2025

This is adding r.pack files (aka native GRASS raster files) as input and output to tools when called through the Tools object. Tool calls such as r_grow can take r.pack files as input or output. The format is distinguished by the file extension.
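As a rough illustration of distinguishing the format by file extension, the check could look like the sketch below. The `.grass_raster` suffix is taken from the example in this description; the helper name `looks_like_pack_file` and the set of accepted suffixes are assumptions for illustration, not the actual code in this PR.

```python
from pathlib import Path

# Assumption: ".grass_raster" is the extension used in this PR's example;
# the real list of recognized extensions may differ.
PACK_SUFFIXES = {".grass_raster"}


def looks_like_pack_file(value):
    """Return True if a parameter value looks like a path to a raster pack file."""
    # Only strings and Path objects can be file paths; numbers etc. are not.
    if not isinstance(value, (str, Path)):
        return False
    return Path(value).suffix in PACK_SUFFIXES


# A file path is detected, a plain in-project raster name is not.
is_file = looks_like_pack_file("data/elevation.grass_raster")
is_name = looks_like_pack_file("elevation")
```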

Notably, tool calls such as r_mapcalc don't pass input or output data as separate parameters (expressions or base names), so they can be used like that only when a wrapper exists (r_mapcalc_simple) or, in the future, when more information is included in the interface or passed between the tool and the Tools class Python code. Similarly, tools with multiple inputs or outputs in a single parameter are currently not supported.

The code uses --json with the tool to find out which parameters are inputs and which are outputs, because all of them are files which may or may not exist (this is different from NumPy arrays, where the user-provided parameters clearly say what is input (object) and what is output (class)). Consequently, the whole import-export machinery is only started when there are files in the parameters as identified by the parameter converter class.
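The input/output classification could be sketched roughly as follows. The JSON text below is a hand-written stand-in, not captured tool output, and its field names (`inputs`, `outputs`, `param`, `value`) are assumptions about the structure; the function name `split_file_parameters` is hypothetical.

```python
import json

# Hand-written stand-in for a tool description obtained with --json.
# Assumption: the description lists inputs and outputs separately.
SAMPLE_DESCRIPTION = """
{
  "module": "r.slope.aspect",
  "inputs": [{"param": "elevation", "value": "data/elevation.grass_raster"}],
  "outputs": [{"param": "slope", "value": "data/slope.grass_raster"}]
}
"""


def split_file_parameters(description, suffix=".grass_raster"):
    """Return (input_files, output_files) for values ending with the suffix."""

    def files(items):
        return [
            item["value"] for item in items if str(item["value"]).endswith(suffix)
        ]

    return files(description.get("inputs", [])), files(description.get("outputs", []))


to_import, to_export = split_file_parameters(json.loads(SAMPLE_DESCRIPTION))
```

With this split, files on the input side would be imported before the tool runs and files on the output side exported afterwards.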

Currently, the in-project raster names are driven by the file names. This will break for parallel usage and will not work for vector as is. While it is good for guessing the right (and nice) name, e.g., for r.mapcalc expression, ultimately, unique names retrieved with an API function are likely the way to go.
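A minimal sketch of the naming trade-off, using only the standard library: deriving the name from the file name is readable but collides for files with the same base name, while a unique suffix avoids collisions at the cost of a less nice name. Both helper names are hypothetical, and the UUID-based scheme is just one possible approach, not the API function mentioned above.

```python
import uuid
from pathlib import Path


def name_from_file(path):
    """Derive an in-project raster name from the file name (without extension)."""
    return Path(path).stem


def unique_name_from_file(path):
    """Hypothetical alternative: add a random suffix so that names never collide."""
    return f"{Path(path).stem}_{uuid.uuid4().hex}"


# Two different files end up with the same in-project name.
first = name_from_file("site_a/elevation.grass_raster")
second = name_from_file("site_b/elevation.grass_raster")
collision = first == second

# With unique names, the same two files stay separate.
unique_first = unique_name_from_file("site_a/elevation.grass_raster")
unique_second = unique_name_from_file("site_b/elevation.grass_raster")
```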

When caching is enabled (either through use of a context manager or explicitly), import of inputs is skipped when they were already imported or when they are known outputs. Without the cache, data is deleted after every tool (function) call. Caching keeps the in-project data in the project (as opposed to using a hidden cache or deleting the data). The parameter to explicitly drive this is called use_cache (originally keep_data).
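The skip-import rule can be pictured with a small bookkeeping class. This is a simplified sketch, not the implementation in this PR: the class and method names are hypothetical, and only the `use_cache` name comes from the description above.

```python
class ImportTracker:
    """Hypothetical bookkeeping of files which already have in-project data."""

    def __init__(self, use_cache):
        self.use_cache = use_cache
        self.known_files = set()

    def needs_import(self, path):
        """An input is imported unless the cache already knows about it."""
        return not (self.use_cache and path in self.known_files)

    def record(self, path):
        """Remember an imported input or an exported output."""
        if self.use_cache:
            self.known_files.add(path)


tracker = ImportTracker(use_cache=True)
# First use of the elevation file requires an import.
first_use = tracker.needs_import("data/elevation.grass_raster")
tracker.record("data/elevation.grass_raster")
# The second use is served from the in-project data.
second_use = tracker.needs_import("data/elevation.grass_raster")
# A known output does not need to be reimported when used as input.
tracker.record("data/slope.grass_raster")
output_reuse = tracker.needs_import("data/slope.grass_raster")
```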

The objects track what is imported and also track import and cleaning tasks at function call versus object level. The data is cleaned even in case of exceptions. The interface was clarified by creating a private/protected version of run_cmd which has the internal-only parameters. This function uses a single try-finally block to trigger the cleaning in case of exceptions.
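The cleanup guarantee follows the usual try-finally pattern. The sketch below shows that pattern in isolation; the function name and its parameters are hypothetical and stand in for the private run function and the cleaning tasks described above.

```python
def run_with_cleanup(tool, temporary_names, remove):
    """Run a tool and always remove the temporary data afterwards.

    tool is a callable without parameters, temporary_names is a list of
    in-project names created for this call, and remove is a callable which
    deletes one name. All three are placeholders for illustration.
    """
    try:
        return tool()
    finally:
        # Runs both on success and when the tool raises an exception.
        for name in temporary_names:
            remove(name)


removed = []


def failing_tool():
    raise RuntimeError("tool failed")


try:
    run_with_cleanup(failing_tool, ["elevation", "slope"], removed.append)
except RuntimeError:
    # The exception still reaches the caller, but the cleanup already happened.
    pass
```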

While generally the code supports paths as both strings and Path objects, the actual decisions about import are made from the list of strings form of the command.

From the caller's perspective, overwrite is supported in the same way as for in-project GRASS rasters.

The tests use module scope to reduce fixture setup by a couple of seconds. Changes include a minor cleanup of comments in tests related to testing results without format=json and with, e.g., the --json option.

The class documentation discusses overhead and parallelization because the calls are more costly and the object now carries significant state with the cache and the rasters created in the background. This includes discussion of the NumPy arrays, too, and slightly improves the wording in the part discussing arrays.

This is building on top of #2923 (Tools API), and it is parallel with #5878 (NumPy array IO), although it runs at a different stage than NumPy array conversions and uses cache for the imported data (may be connected more with the arrays in the future). This can be used efficiently in Python with Tools (caching, assuming project) and in a limited way also with the experimental run subcommand in CLI (no caching, still needs an explicit project). There is more potential use of this with the standalone tools concept (#5843). The big picture is also discussed in #5830.

Example

import grass.script as gs
from grass.tools import Tools

gs.create_project("data/project", crs="data/elevation.grass_raster")
with gs.setup.init("data/project") as session, Tools(session=session) as tools:
    tools.g_region(raster="data/elevation.grass_raster")
    tools.r_slope_aspect(
        elevation="data/elevation.grass_raster", slope="data/slope.grass_raster"
    )
    statistics = tools.r_univar(map="data/slope.grass_raster", format="json")
print(statistics["mean"])

The file data/elevation.grass_raster is imported only once and reused for the r_slope_aspect call. data/slope.grass_raster is exported, but not reimported for the r_univar call. At the end of the context, the in-project rasters are removed, but the data/slope.grass_raster file exists.

wenzeslaus and others added 28 commits June 3, 2023 23:57
This adds a Tools class which allows GRASS tools (modules) to be accessed using methods. Once an instance is created, calling a tool is calling a function (method) similarly to grass.jupyter.Map. Unlike grass.script, this does not require a generic function name and unlike grass.pygrass module shortcuts, this does not require special objects to mimic the module families.

Outputs are handled through a returned object which is the result of automatic capture of outputs and can do conversions from known formats using properties.

A usage example is in the _test() function in the file.

The code is included under the new grass.experimental package which allows merging the code even when further breaking changes are anticipated.
…needs to be improved there). Allow usage through attributes, run with run_command syntax, and subprocess-like execution.
…est a tool when there is a close match for the name.
…tool in XY project, so useful only for things like g.extension or m.proj, but, with a significant workaround for argparse --help, it can do --help for a tool.
@wenzeslaus added the enhancement (New feature or request) label Jun 11, 2025
@github-actions bot added the Python (Related code is in Python) label Jun 11, 2025
@wenzeslaus (Member, Author) commented:

This workflow is now also possible, reusing the just-created pack file:

import grass.script as gs
from grass.tools import Tools

with gs.setup.init("nc_spm_08_grass7/user1") as session, Tools(session=session) as tools:
    tools.g_region(raster="data/elevation.grass_raster")
    tools.r_slope_aspect(elevation="data/elevation.grass_raster", slope="data/slope.grass_raster")
    statistics = tools.r_univar(map="data/slope.grass_raster")

The file data/elevation.grass_raster is imported only once and reused for the r_slope_aspect call. data/slope.grass_raster is exported, but not reimported for the r_univar call.

@wenzeslaus (Member, Author) commented:

With project creation from pack files (#6415), the following is now possible (assuming file data/elevation.grass_raster exists):

import grass.script as gs
from grass.tools import Tools

gs.create_project("data/project", crs="data/elevation.grass_raster")
with gs.setup.init("data/project") as session, Tools(session=session) as tools:
    tools.g_region(raster="data/elevation.grass_raster")
    tools.r_slope_aspect(
        elevation="data/elevation.grass_raster", slope="data/slope.grass_raster"
    )
    statistics = tools.r_univar(map="data/slope.grass_raster", format="json")
print(statistics["mean"])

The above can be copied to a script or an IPython console and will give 3.864522406673346. You can quickly get an up-to-date GRASS native file using:

grass --tmp-mapset ~/grassdata/nc_spm_08_grass7/ --exec r.pack input=elevation output=data/elevation.grass_raster

An equivalent procedure in an interactive shell with the NC dataset is:

grass --tmp-mapset ~/grassdata/nc_spm_08_grass7/
g.region raster=elevation
r.slope.aspect elevation=elevation slope=slope
r.univar map=slope format=json | jq .mean

This gives 3.8645224066733461.

@wenzeslaus requested a review from petrasovaa October 5, 2025 02:54
wenzeslaus added a commit that referenced this pull request Oct 6, 2025
This enables access to the subcommands from the main _grass_ command.

It keeps the commands not documented at the top level, so they remain hidden and experimental.

Backwards compatibility with the classic CLI parameters should be smooth. The decision is based on the first command line argument. This will yield unexpected results (only) when the path to a mapset matches one of the subcommands, e.g., when naming a mapset simply mapset and running the _grass_ command in the project directory which contains this mapset (same for project). It may become a bigger issue with more subcommands, but a simple workaround is prefixing the path with `./`.

This includes subcommands which are in other PRs, but the subcommand parser will deal with that by reporting an error. This could be an approach we take in general: reserving subcommand names even when we don't have them implemented yet (and triggering some of the directory (project, mapset, filename) issues sooner, even before a specific subcommand is fully introduced).

This is useful together with the project create subcommand (#6441) and raster pack IO (#5877).
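The decision rule and the `./` workaround can be illustrated with a short sketch. The set of names below contains only the subcommands mentioned in this text (run, project, mapset), and the function name is hypothetical; the actual parser may work differently.

```python
# Assumption: only the subcommand names mentioned in the text; the real set may differ.
SUBCOMMANDS = {"run", "project", "mapset"}


def is_subcommand_call(arguments):
    """Decide based on the first command line argument only."""
    return bool(arguments) and arguments[0] in SUBCOMMANDS


# A subcommand call from the example in this message.
subcommand = is_subcommand_call(["run", "--crs", "EPSG:3358", "g.proj", "-p"])
# A mapset directory named "mapset" is mistaken for the subcommand...
ambiguous = is_subcommand_call(["mapset"])
# ...unless the path is prefixed with "./".
workaround = is_subcommand_call(["./mapset"])
```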

Before (assuming PYTHONPATH or FHS): python -m grass.app run --crs EPSG:3358 g.proj -p

After (assuming PATH): grass run --crs EPSG:3358 g.proj -p
Co-authored-by: Anna Petrasova

@petrasovaa (Contributor) left a comment:

This will open the door to eventually adding other formats, which will be useful for many users.

@wenzeslaus enabled auto-merge (squash) October 6, 2025 14:54
@wenzeslaus added this to the 8.5.0 milestone Oct 6, 2025
@wenzeslaus merged commit 8ca3c7a into OSGeo:main Oct 7, 2025
30 of 32 checks passed
@wenzeslaus deleted the add-pack-files-io-to-tools branch October 7, 2025 01:44
Labels

enhancement (New feature or request), libraries, Python (Related code is in Python), tests (Related to Test Suite)
