Race condition can cause missed records when syncing incremental updates #135

@positron

update_all can skip up to 1 second of updates.

https://github.com/peeringdb/peeringdb-py/blob/master/src/peeringdb/_update.py#L237-L251

last_change calculates _since by fetching the most recent timestamp for updated or created. It then fetches any changes from peeringdb using _since + 1.

The peeringdb API uses this query:

# in src/peeringdb_server/rest.py
since = int(float(self.request.query_params.get("since", 0)))

# then in django_handleref manager.py
qset = qset.filter(
    models.Q(created__gt=timestamp) | models.Q(updated__gt=timestamp)
)

So suppose I make the update_all HTTP request 2 milliseconds after midnight, at 00:00:00.002, and there is an object that was updated 1 millisecond after midnight, at 00:00:00.001. It will be stored with an updated timestamp of 00:00:00 in my local mirror, since the API response only has one-second precision.

The next time I call update_all, last_change finds that object in my database with an updated column of 00:00:00. So _since + 1 will be one whole second after midnight, 00:00:01, and update_all will send a request with a since query parameter of 1 second after midnight.

The peeringdb API only returns data with a created or updated column GREATER than 1 second after midnight. So I'll miss any object created or updated during the rest of that first second, after my first request. For example, I would miss another object created 3 milliseconds after midnight, at 00:00:00.003.
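
The sequence above can be reproduced with a small self-contained simulation. This is only a sketch of the behaviour described in this issue, not the actual peeringdb-py or peeringdb server code: the names Record, server_query, api_view, next_since and run_scenario are hypothetical, MIDNIGHT is an arbitrary base time in seconds, and it assumes the server stores sub-second timestamps while API responses are truncated to whole seconds.

```python
from dataclasses import dataclass

MIDNIGHT = 1000.0  # arbitrary base time in seconds (assumption for the demo)


@dataclass
class Record:
    id: int
    updated: float  # seconds; sub-second precision on the server side


def server_query(records, since):
    """Mimic the server filter: int(float(since)), then updated > since."""
    since = int(float(since))
    return [r for r in records if r.updated > since]


def api_view(record):
    """Mimic the API response: timestamp truncated to whole seconds."""
    return Record(record.id, float(int(record.updated)))


def sync(server_records, local, since):
    """Fetch everything newer than `since` and store it in the local mirror."""
    for r in server_query(server_records, since):
        local[r.id] = api_view(r)


def next_since(local, offset=1):
    """Mimic the client: most recent local timestamp plus an offset."""
    return int(max(r.updated for r in local.values())) + offset


def run_scenario(offset):
    """Return the ids present in the local mirror after two syncs."""
    server = [Record(1, MIDNIGHT + 0.001)]   # updated 1 ms after midnight
    local = {}
    sync(server, local, since=0)             # first sync, 2 ms after midnight
    server.append(Record(2, MIDNIGHT + 0.003))  # created 3 ms after midnight
    sync(server, local, since=next_since(local, offset))
    return sorted(local)


if __name__ == "__main__":
    print("since = last + 1:", run_scenario(offset=1))  # record 2 is missed
    print("since = last:    ", run_scenario(offset=0))  # record 2 is fetched
```

With offset=1 the second sync asks for rows newer than MIDNIGHT + 1, so record 2 never reaches the mirror. With offset=0 the simulated server returns it, at the cost of re-fetching record 1. The offset=0 case is included only for contrast, under the assumptions above.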
