Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add template #1

Merged
merged 38 commits into from
Mar 12, 2025
Merged

Add template #1

merged 38 commits into from
Mar 12, 2025

Conversation

irm-codebase
Copy link
Member

@irm-codebase irm-codebase commented Mar 4, 2025

Initial template for data modules. Mostly based on ec-modules.

Features:

  • Both the template repo and the created template use pixi as the package manager
  • This template repo has:
    • copier questions to pre fill licensing, citation and key bits of the code
    • pytest testing to check that the produced template works (docs build, pre-made example runs, pre-made testing runs).
  • The created template has:
    • A simple 'dummy' example of a modular workflow
    • A set of integration tests for:
      • key files (e.g., INTERFACE.yaml) should fit schemas
      • the all rule throwing a specific error if the module is called as a regular workflow
      • a lightweight test showing how the module can be used
    • ReadTheDocs ready documentation with a 'specification' section that all data modules should keep as-is.

@irm-codebase irm-codebase marked this pull request as draft March 4, 2025 21:34
@irm-codebase irm-codebase marked this pull request as ready for review March 4, 2025 21:47
@irm-codebase
Copy link
Member Author

irm-codebase commented Mar 4, 2025

@brynpickering @sjpfenninger :
We are waiting for conda-forge to approve clio, so this will have a couple of changes.

Still, I thought it would be good to review it as it is. The only changes would be removing some dependencies.

Lots of files, but the important ones are:

  • pixi.toml
  • template/
  • copier.yaml
  • tests/
  • README.md (I decided to not use mkdocs for this repo, to keep it simple)

@irm-codebase
Copy link
Member Author

@brynpickering @sjpfenninger
I just realized that snakemake requires users to install conda (pretty obvious)...
Using two package managers for this might not be ideal, even if pixi is definitely better.

I will revert this to use conda

@brynpickering
Copy link
Member

brynpickering commented Mar 5, 2025

I just realized that snakemake requires users to install conda (pretty obvious)...

You can also just include conda in the pixi dependencies alongside snakemake as suggested in the snakemake docs.

Although I can see the benefit of sticking with one manager and one way to look at setting dependencies (environment.yml files). Some of the benefits of pixi (namely tasks) can be handled natively by snakemake rules anyway.

@irm-codebase
Copy link
Member Author

You can also just include conda in the pixi dependencies alongside snakemake as suggested in the snakemake docs.

Although I can see the benefit of sticking with one manager and one way to look at setting dependencies (environment.yml files). Some of the benefits of pixi (namely tasks) can be handled natively by snakemake rules anyway.

Installing recursive environments seems... odd. But losing lock files for the 'project' portion of the data module might be worse... I'll try adding mamba to the pixi dependencies.

If that still causes problems I'll revert to mamba

@irm-codebase
Copy link
Member Author

@brynpickering thanks for the suggestions! It's now working like a charm.

I've added a few improvements to copier.yaml, and a CI workflow for pull requests.
I think this is mature enough, for now.

The only change would be updating the dependencies when clio-tools is published in conda forge.

@irm-codebase
Copy link
Member Author

@brynpickering @sjpfenninger
clio-tools was just published in conda-forge, which means this is ready for the final review!

@irm-codebase
Copy link
Member Author

irm-codebase commented Mar 7, 2025

@sjpfenninger @brynpickering
About having the documentation in individual repos or in one place...

In the future, maybe we could have both if all the functions in template/{{ module_short_name }}/docs/hooks/clio_standard.py.jinja are moved to clio-tools.
The template could just call it and pass some values, and on the clio side maybe we could use a list and download key files of the latest release to build the docs there.

The only problem with this is that if the clio specification changes (some file is removed, a schema changes), we would need to take modules off our docs until they update.

Copy link
Member

@brynpickering brynpickering left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I'll try building it on my own machine. In the meantime, there's some minor comments

Copy link
Member

@brynpickering brynpickering left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made some minor changes following a local build. LGTM!

Copy link
Member

@sjpfenninger sjpfenninger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good to me. I made two specific suggestions for making it more explicit that the short name will be used as the directory name for the newly-created module.

@irm-codebase
Copy link
Member Author

@brynpickering @sjpfenninger
Suggestions included.

The last bit is the documentation comment that @sjpfenninger made.
After using the template, I'd have to agree that requiring it is too much overhead.
It's fine to leave it, but as an a nice 'extra'.

I've just removed the generated diagram.
That bit has already proved to be too finicky. calliope-project/clio#8

@sjpfenninger
Copy link
Member

For documentation: GitHub renders Markdown well, so why not just expect that the documentation of each module is always rendered viewed on the GitHub repository, coupled with a manually populated and maintained list of modules e.g. somewhere on https://clio.readthedocs.io/ (which is where as a user, I'd expect to see the available modules, anyway). More fancy things are always possible later.

@irm-codebase
Copy link
Member Author

@sjpfenninger agreed.

I've simplified things by making all the mkdocs stuff optional, and clarifying important files.
We can later just use these files to auto-generate documentation at clio if we choose to.

This reduces overhead while still ensuring we can provide nice module descriptions in the future.

@irm-codebase irm-codebase merged commit 1c6db54 into main Mar 12, 2025
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants