@vvzvlad vvzvlad commented Oct 14, 2025

description: ScheduleNextCheck now adds a random jitter of 0–10 minutes to the calculated interval and then clamps the result again to the configured upper limit. This reduces the likelihood of many simultaneous requests hitting a single host after failed updates and helps avoid hitting rate limits.

testing: updated check_target_interval to allow for up to +10m of jitter above the target interval; local linters pass on the changed files.

breaking changes: none

This reopens the old PR #3821 (I don't know how to change the message of an old commit, and the checks were complaining about it).

@fguillot (Member) commented

The current approach of adding jitter to the next_check_at field is unlikely to achieve the desired throttling because of how Miniflux handles feed processing.

The next_check_at field is only used to fetch a group of feeds from the database. Once this batch is retrieved, Miniflux immediately dispatches the entire list to a pool of goroutines via a Go channel, regardless of the next_check_at time. Consequently, many feeds will still be processed and refreshed in parallel at the same time.

There is an existing option POLLING_LIMIT_PER_HOST to prevent concurrent requests from overwhelming a single host. That should achieve the desired throttling.

In the example below, I have two feeds with different next_check_at values:

miniflux2=# select feed_url, checked_at, next_check_at from feeds;
               feed_url                |          checked_at           |         next_check_at
---------------------------------------+-------------------------------+-------------------------------
 https://www.lemonde.fr/en/rss/une.xml | 2025-10-19 11:55:23.030696-07 | 2025-10-19 10:01:58.497486-07
 https://linuxfr.org/news.atom         | 2025-10-19 12:00:31.689737-07 | 2025-10-19 11:01:10.505342-07
(2 rows)

If I refresh all feeds, they will be refreshed at the same time in parallel regardless of the next scheduled date.

time=2025-10-19T12:01:09.390-07:00 level=INFO msg="Created a batch of feeds" batch_size=100 rows_count=2 skipped_feeds_count=0 jobs_count=2
time=2025-10-19T12:01:09.390-07:00 level=DEBUG msg="Feed URLs in this batch" feed_urls="[https://www.lemonde.fr/en/rss/une.xml https://linuxfr.org/news.atom]"
time=2025-10-19T12:01:09.390-07:00 level=INFO msg="Starting a pool of workers" nb_workers=16
time=2025-10-19T12:01:09.390-07:00 level=INFO msg="Refreshing feed" feed_id=1 user_id=1 worker_id=15
time=2025-10-19T12:01:09.390-07:00 level=DEBUG msg="Begin feed refresh process" user_id=1 feed_id=1 force_refresh=false
time=2025-10-19T12:01:09.390-07:00 level=INFO msg="Refreshing feed" feed_id=2 user_id=1 worker_id=7
time=2025-10-19T12:01:09.391-07:00 level=DEBUG msg="Begin feed refresh process" user_id=1 feed_id=2 force_refresh=false

The checked_at field has nearly the same timestamp for both feeds (they differ by only a few milliseconds):

miniflux2=# select feed_url, checked_at, next_check_at from feeds;
               feed_url                |          checked_at           |         next_check_at
---------------------------------------+-------------------------------+-------------------------------
 https://linuxfr.org/news.atom         | 2025-10-19 12:01:09.423861-07 | 2025-10-19 13:06:37.759041-07
 https://www.lemonde.fr/en/rss/une.xml | 2025-10-19 12:01:09.41023-07  | 2025-10-19 13:08:01.973148-07
(2 rows)

Now, if I add different feeds that use the same host and enable the option POLLING_LIMIT_PER_HOST=1, only one request per batch will be made to the same host:

time=2025-10-19T12:23:13.353-07:00 level=DEBUG msg="Feed host limit reached for this batch" feed_url="https://www.youtube.com/feeds/videos.xml?channel_id=UCG9G2dyRv04FDSH1FSYuLBg" feed_hostname=www.youtube.com limit_per_host=1 current=1
time=2025-10-19T12:23:13.353-07:00 level=INFO msg="Created a batch of feeds" batch_size=100 rows_count=4 skipped_feeds_count=1 jobs_count=3
time=2025-10-19T12:23:13.353-07:00 level=DEBUG msg="Feed URLs in this batch" feed_urls="[https://www.youtube.com/feeds/videos.xml?channel_id=UCIZFCxVs-h1nP6OzJwGeIRQ https://linuxfr.org/news.atom https://www.lemonde.fr/en/rss/une.xml]"

For reference, these are the four feeds in the database:

miniflux2=# select feed_url from feeds;
                                   feed_url
------------------------------------------------------------------------------
 https://www.youtube.com/feeds/videos.xml?channel_id=UCG9G2dyRv04FDSH1FSYuLBg
 https://www.lemonde.fr/en/rss/une.xml
 https://www.youtube.com/feeds/videos.xml?channel_id=UCIZFCxVs-h1nP6OzJwGeIRQ
 https://linuxfr.org/news.atom
(4 rows)
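The per-host filtering visible in the log above can be approximated with the sketch below. It is an illustration only, not the Miniflux implementation: `limitPerHost` is a hypothetical helper, and treating a limit of 0 or less as "no limit" and skipping unparsable URLs are assumptions made for the example.

```go
package main

import (
	"fmt"
	"net/url"
)

// limitPerHost is a hypothetical helper, not the actual Miniflux code.
// It keeps at most `limit` feeds per hostname within one batch and reports
// how many feeds were skipped. Assumptions: limit <= 0 disables the check,
// and URLs that fail to parse are counted as skipped.
func limitPerHost(feedURLs []string, limit int) (kept []string, skipped int) {
	counts := make(map[string]int)
	for _, raw := range feedURLs {
		u, err := url.Parse(raw)
		if err != nil {
			skipped++
			continue
		}
		host := u.Hostname()
		if limit > 0 && counts[host] >= limit {
			skipped++ // host limit reached for this batch
			continue
		}
		counts[host]++
		kept = append(kept, raw)
	}
	return kept, skipped
}

func main() {
	feeds := []string{
		"https://www.youtube.com/feeds/videos.xml?channel_id=UCIZFCxVs-h1nP6OzJwGeIRQ",
		"https://linuxfr.org/news.atom",
		"https://www.lemonde.fr/en/rss/une.xml",
		"https://www.youtube.com/feeds/videos.xml?channel_id=UCG9G2dyRv04FDSH1FSYuLBg",
	}
	kept, skipped := limitPerHost(feeds, 1)
	fmt.Printf("jobs_count=%d skipped_feeds_count=%d\n", len(kept), skipped)
	// prints "jobs_count=3 skipped_feeds_count=1"
}
```

With a limit of 1, the second www.youtube.com feed is dropped from the batch, matching the jobs_count=3 and skipped_feeds_count=1 values in the log.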
