Skip to content
This repository has been archived by the owner on Dec 10, 2024. It is now read-only.

Idea for managed pagination into channels for resources #1496

Closed
wants to merge 3 commits into from

Conversation

catouc
Copy link

@catouc catouc commented Jun 9, 2022

For some context I maintain a package that does this already inside of the company I work for. The semantics have proven pretty useful and seem to integrate well in this library so I thought upstreaming this could be useful to the wider community.

I have written up a sample implementation for the projects API but this can essentially be repeated for every paginated resource.
Ultimately we can also think about implementing !815 behind the scenes in this down the line 🤔

But before I go implement loads of endpoints I thought it's worth checking if this is desired to have this or if this should be a separate more specialised package?

Happy to have a discussion!

@catouc catouc force-pushed the streaming-paginated-resources branch from 6dd7d4c to ac1be72 Compare June 9, 2022 19:11
@catouc catouc force-pushed the streaming-paginated-resources branch from ac1be72 to 9c8af6e Compare June 11, 2022 09:50
@svanharmelen
Copy link
Member

@catouc thanks for your PR, yet I need to think a little bit about this one. Until now this package tries to follow the GitLab API as close as possible, but this PR introduces a custom helper function that doesn't exist as a GitLab API. So I'm not sure yet if I want to open the doors for such functions...

@catouc
Copy link
Author

catouc commented Jun 13, 2022

I had an idea that we could put this into a sub-package called streaming or something? Then it's logically separated more from the core library 🤔

timofurrer added a commit to timofurrer/go-gitlab that referenced this pull request Jun 13, 2022
The xanzy#1496 Pull Request triggered
some thinking for how to do this in a more "generic" way.

Thus, I drafted this `gitlab.Collect()` function, which can be used to
"easily" collect all items of a specific "list" API (I've implement a
*collect* function which eagerly collects all items, but something
similar can be implemented for channels to stream).

The only downside of the current implementation so far is that Go
doesn't really support generics / interfaces with direct field access as
of now ... effectively, you'll always need methods.
This means that, because the `ListOptions` field is somewhat
inconsistently (sometimes as subfield, sometimes it's promoted)
implemented over this package every `gitlab.List*Options` struct needs
it's own `GetPage() int` and `SetPage(int)` implementation ...

One improvement for this could be that we split out the `ListOptions`
from the API endpoint specific options and make it an additional
argument to the list methods - so that `Collect()` could profit from
this and `GetPage() int` and `SetPage(int)` would only need to be
implemented once.

Another one would be to just use `gitlab.ListOptions` for endpoints
which don't have additional parameters instead of doing this `type
ListInstanceVariablesOptions ListOptions`.

However, I think best would be to separate the generic `ListOptions` to
a separate argument in the long term ...
timofurrer added a commit to timofurrer/go-gitlab that referenced this pull request Jun 13, 2022
The xanzy#1496 Pull Request triggered
some thinking for how to do this in a more "generic" way.

Thus, I drafted this `gitlab.Collect()` function, which can be used to
"easily" collect all items of a specific "list" API (I've implement a
*collect* function which eagerly collects all items, but something
similar can be implemented for channels to stream).

The only downside of the current implementation so far is that Go
doesn't really support generics / interfaces with direct field access as
of now ... effectively, you'll always need methods.
This means that, because the `ListOptions` field is somewhat
inconsistently (sometimes as subfield, sometimes it's promoted)
implemented over this package every `gitlab.List*Options` struct needs
it's own `GetPage() int` and `SetPage(int)` implementation ...

One improvement for this could be that we split out the `ListOptions`
from the API endpoint specific options and make it an additional
argument to the list methods - so that `Collect()` could profit from
this and `GetPage() int` and `SetPage(int)` would only need to be
implemented once.

Another one would be to just use `gitlab.ListOptions` for endpoints
which don't have additional parameters instead of doing this `type
ListInstanceVariablesOptions ListOptions`.

However, I think best would be to separate the generic `ListOptions` to
a separate argument in the long term ...
@timofurrer
Copy link
Contributor

@svanharmelen @catouc I've done some small experiment to make this somewhat sustainable, see #1498

timofurrer added a commit to timofurrer/go-gitlab that referenced this pull request Jun 13, 2022
The xanzy#1496 Pull Request triggered
some thinking for how to do this in a more "generic" way.

Thus, I drafted this `gitlab.Collect()` function, which can be used to
"easily" collect all items of a specific "list" API (I've implement a
*collect* function which eagerly collects all items, but something
similar can be implemented for channels to stream).

The only downside of the current implementation so far is that Go
doesn't really support generics / interfaces with direct field access as
of now ... effectively, you'll always need methods.
This means that, because the `ListOptions` field is somewhat
inconsistently (sometimes as subfield, sometimes it's promoted)
implemented over this package every `gitlab.List*Options` struct needs
it's own `GetPage() int` and `SetPage(int)` implementation ...

One improvement for this could be that we split out the `ListOptions`
from the API endpoint specific options and make it an additional
argument to the list methods - so that `Collect()` could profit from
this and `GetPage() int` and `SetPage(int)` would only need to be
implemented once.

Another one would be to just use `gitlab.ListOptions` for endpoints
which don't have additional parameters instead of doing this `type
ListInstanceVariablesOptions ListOptions`.

However, I think best would be to separate the generic `ListOptions` to
a separate argument in the long term ...
@catouc
Copy link
Author

catouc commented Jun 13, 2022

I maintain something similar in my internal library which has a function called getResourcePage that basically just accepts the resource URI, that made it somewhat sustainable without generics, of course still a lot of duplication since every List call gets its own wrapper, but I'm not sure if that cost is much different from the current cost of calls 🤔
Overall this looks pretty neat as an idea I think. Though the interface gets a bit confusing from the user end right now I feel and if we wrap this implementation with specific calls again, how much have we gained?

But yeah the essentially benefit of this is that you can use channels because it allows you to work page by page on what you need to do, which starts to matter once you have 10k+ resources and need to do stuff across all of them.

@catouc
Copy link
Author

catouc commented Jun 13, 2022

One passing thought, it's probably worth looking at making the transition over to keyset pagination as smooth as possible, which does not deal with page numbers and offsets the same way.

I have a bare bones implementation (link header stuff) that's been working pretty okay so far but most of the client call need to be able to call full URLs because that's what the keyset pagination Next headers return, thought I'd drop that here additionally while I still remember 😄

@svanharmelen
Copy link
Member

I think I'm going for PR #1875 for adding a form of pagination support. If you think that solution can or should be changed/updated (including naming or ergonomics) than please comment on that PR so we can get to some kind of consensus before merging that one 😏

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants