-
Notifications
You must be signed in to change notification settings - Fork 476
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Timeline::set_state
is still racy
#10297
Labels
c/storage/pageserver
Component: storage: pageserver
t/bug
Issue Type: Bug
triaged
bugs that were already triaged
Comments
This was referenced Jan 7, 2025
Closed
github-merge-queue bot
pushed a commit
that referenced
this issue
Jan 9, 2025
## Problem Auto-offloading as requested by the compaction task is racy with unarchival, in that the compaction task might attempt to offload an unarchived timeline. By that point it will already have set the timeline to the `Stopping` state however, which makes it unusable for any purpose. For example: 1. compaction task decides to offload timeline 2. timeline gets unarchived 3. `offload_timeline` gets called by compaction task * sets timeline's state to `Stopping` * realizes that the timeline can't be unarchived, errors out 6. endpoint can't be started as the timeline is `Stopping` and thus 'can't be found'. A future iteration of the compaction task can't "heal" this state either as the timeline will still not be archived, same goes for other automatic stuff. The only way to heal this is a tenant detach+attach, or alternatively a pageserver restart. Furthermore, the compaction task is especially amenable for such races as it first stores `can_offload` into a variable, figures out whether compaction is needed (which takes some time), and only then does it attempt an offload operation: the time difference between "check" and "use" is non-trivially small. To make it even worse, we start the compaction task right after attach of a tenant, and it is a common pattern by pageserver users to attach a tenant to then immediately unarchive a timeline, so that an endpoint can be started. ## Solutions not adopted The simplest solution is to move the `can_offload` check to right before attempting of the offload. But this is not a good solution, as no lock is held between that check and timeline shutdown. So races would still be possible, just become less likely. I explored using the timeline state for this, as in adding an additional enum variant. But `Timeline::set_state` is racy (#10297). ## Adopted solution We use the lock on the timeline's upload queue as an arbiter: either unarchival gets to it first and sours the state for auto-offloading, or auto-offloading shuts it down, which stops any parallel unarchival in its tracks. The key part is not releasing the upload queue's lock between the check whether the timeline is archived or not, and shutting it down (the actual implementation only sets `shutting_down` but it has the same effect on `initialized_mut()` as a full shutdown). The rest of the patch is stuff that follows from this. We also move the part where we set the state to `Stopping` to after that arbiter has decided the fate of the timeline. For deletions, we do keep it inside `DeleteTimelineFlow::prepare` however, so that it is called with all of the the timelines locks held that the function allocates (timelines lock most importantly). This is only a precautionary measure however, as I didn't want to analyze deletion related code for possible races. ## Future changes It might make sense to move `can_offload` to right before the offload attempt. Maybe some other properties might have changed as well. Although this will not be perfect either as no lock is held. I want to keep it out of this change to emphasize that this move wasn't the main reason we are race free now. Fixes #10220
Triage notes:
|
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
c/storage/pageserver
Component: storage: pageserver
t/bug
Issue Type: Bug
triaged
bugs that were already triaged
The
set_state
function ofTimeline
forbids some state transitions like from Stopping -> Active, or Broken -> any other state. However,if the function gets called concurrently, it can happen that both threads call
current_state
before the other thread had the opportunity to write. Then we might have forbidden state transitions.Say a timeline is
Loading
, and thread A wants to set it toStopping
. Thread B wants to set it toActive
. Then, if both threads readcurrent_state()
at the same time, they might both callsend_replace
. It can happen that thread A gets through first, so the timeline is firstStopping
. Then, thread B changes the state toActive
, causing timeline state to go fromStopping
toActive
, something that we don't want to allow.Earlier issue: #2727
The text was updated successfully, but these errors were encountered: