Skip to content

enable composing input data #105

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 9, 2024
Merged

enable composing input data #105

merged 1 commit into from
Sep 9, 2024

Conversation

juliasloan25
Copy link
Member

@juliasloan25 juliasloan25 commented Sep 5, 2024

Purpose

closes #41

This PR adds functionality to DataHandler, TimeVaryingInputs, and SpaceVaryingInputs to enable composing multiple input variables into one data variable. To do this, the user must specify a composing function, multiple variable names, and the file paths where they can be read from.

Most of the changes have been made at the DataHandler level. Each input variable has its own unique FileReader object, and each composed data variable has one Time/SpaceVaryingInput and one DataHandler. The composing function itself is applied in the regridded_snapshot function, just before regridding. The user will interact with this feature at the Time/SpaceVaryingInput level.

This feature is only available when using InterpolationsRegridder, not TempestRegridder.

Design decisions made include:

  • If a pre-processing function is provided, it is applied to each input variable before they are composed.
  • Variables are composed before regridding, to preserve higher resolution information
  • We assume that all input variables have the same temporal and spatial dimensions. This is explicitly checked in the DataHandler constructor, and will raise an informative error message if it is not true.
  • Multiple input variables can come from one file, or each from their own unique file. We don't currently support arbitrary numbers of input variables and files, since this would require more work to implement and is not an expected use case in the near term.

To-do

  • DataHandler: add field to store Dict mapping input varnames to FileReaders
  • DataHandler: add field to store list of input varnames
  • DataHandler: add field to store composing function
  • DH constructor: assert number of file paths supplied == number of varnames
  • DH constructor: assert compose function is provided when multiple input vars provided
  • DH regrid_snapshot: apply compose function to input vars, regrid result and return it
  • SVI/TVI constructors: add fields for multiple file paths, multiple varnames, composing function (optional)
  • update all docs
  • add usage example to docs

Tests

  • DH constructor: num file_paths != num varnames && num file_paths > 1 (should fail)
  • DH constructor: multiple input vars, no compose func (should fail)
  • DH constructor: 1 input var, compose func (should fail)
  • DH constructor: TempestRegrid with multiple vars (should fail)
  • DH constructor: 1 input var, no compose func (should pass)
  • DH constructor: multiple input vars, compose func (should pass)
  • SVI/TVI constructors with + without varnames

Content


  • I have read and checked the items on the review checklist.

@juliasloan25 juliasloan25 changed the title DH, SVI some changes enable composing input data Sep 5, 2024
@Sbozzolo
Copy link
Member

Sbozzolo commented Sep 5, 2024 via email

@juliasloan25
Copy link
Member Author

We should support passing only one file path with multiple variables: thats
the most common use case

Okay, I'll change it to support that. Do we also want to support passing multiple file paths for multiple variables? I think this would become complicated quickly because there may not be a one-to-one mapping, i.e. in the case num file paths > 1 && num variables > 1 && num file paths != num variables, we would need more infrastructure to map variables to file paths

@Sbozzolo
Copy link
Member

Sbozzolo commented Sep 5, 2024

We can just do:

  • 1 file, multiple variables,
  • N files, N variables

We can add more if need arises.

@juliasloan25 juliasloan25 force-pushed the js/compose-inputs branch 3 times, most recently from 92c66fe to ab5bd13 Compare September 6, 2024 18:31
@juliasloan25 juliasloan25 force-pushed the js/compose-inputs branch 3 times, most recently from cf098c2 to d39a234 Compare September 6, 2024 23:28
Copy link
Member

@Sbozzolo Sbozzolo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

I left some small comments and a couple of suggestions for cleaner code.

This PR adds functionality to `DataHandler`, `TimeVaryingInputs`, and `SpaceVaryingInputs`
to enable composing multiple input variables into one data variable. To do this, the user
must specify a composing function, multiple variable names, and the file paths where they
can be read from.

Most of the changes have been made at the `DataHandler` level. Each input variable has its
own unique `FileReader` object, and each composed data variable has one `Time/SpaceVaryingInput`
and one `DataHandler`. The composing function itself is applied in the `regridded_snapshot`
function, just before regridding. The user will interact with this feature at the
`Time/SpaceVaryingInput` level.

This feature is only available when using `InterpolationsRegridder`, not `TempestRegridder`.

Design decisions made include:
- If a pre-processing function is provided, it is applied to each input variable before
  they are composed.
- Variables are composed before regridding, to preserve higher resolution information
- We assume that all input variables have the same temporal and spatial dimensions.
  This is explicitly checked in the `DataHandler` constructor, and will raise an
  informative error message if it is not true.
- Multiple input variables can come from one file, or each from their own unique file.
  We don't currently support arbitrary numbers of input variables and files, since this
  would require more work to implement and is not an expected use case in the near term.
@juliasloan25 juliasloan25 merged commit c9b7c0c into main Sep 9, 2024
19 checks passed
@juliasloan25 juliasloan25 deleted the js/compose-inputs branch September 9, 2024 18:28
@juliasloan25 juliasloan25 mentioned this pull request Oct 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support composing Time and Space VaryingInputs
2 participants