Skip to content

Consider adding grouped_okay parameter to is_epi_archive #649

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
brookslogan opened this issue Mar 31, 2025 · 3 comments
Open

Consider adding grouped_okay parameter to is_epi_archive #649

brookslogan opened this issue Mar 31, 2025 · 3 comments

Comments

@brookslogan
Copy link
Contributor

From #644 (comment)

Previously, we had is_epi_archive with a something like a grouped_okay arg deciding whether to count grouped archives or not. Does that sound useful?

@dshemetov
Copy link
Contributor

Alternatively, can we make grouped archives have both epi_archive and grouped_epi_archive like tibbles

r$> tibble(a = 1, b = 1) %>% group_by(a) %>% class()
[1] "grouped_df" "tbl_df"     "tbl"        "data.frame"

@brookslogan
Copy link
Contributor Author

brookslogan commented Apr 17, 2025

We/I originally decided against that because it makes grouped archives inherit ungrouped archive implementations, which can be error prone. Unless we & any other archive method implementers make grouped impls / groupedness checks for everything that at least abort, then a user may get an ungrouped op [or even worse, data structure corruption] expecting a grouped one. I ran into this at some point with grouped_df + add_{row,column} (though in its current state at least, there are checks / signs that might prevent this).

idea: something sort of similar that might be safer is having

  • c("grouped_epi_archive", "epi_archive")
  • c("ungrouped_epi_archive", "epi_archive")

Haven't really thought this through though. (Potentially related: it would be nice to have an interface / abstract class sort of thing for epi_archives that would allow multiple data structure implementations; e.g., the current diffing thing vs. the disk snapshot thing vs. duckdb vs. a lazy merge etc.)

[It also seems more that ungrouped archives could inherit in some sense from grouped ones, since they are like having a single group. But that might look confusing class-wise.]

@dshemetov
Copy link
Contributor

Could also put is_epi_archive and is_grouped_epi_archive on the same ref page, so the user can OR the two together if they want to allow both.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants