Snapshots support multi-project #130000
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)
Most of the changes are for
 * @param projectId The project that the repository belongs to
 * @param name Name of the repository
 */
public record ProjectRepo(ProjectId projectId, String name) implements Writeable {
This is an existing class extracted from `RepositoryOperation`.
final var projectMetadata = clusterMetadata.getProject(getProjectId());
executor.execute(ActionRunnable.run(allMetaListeners.acquire(), () -> {
    if (finalizeSnapshotContext.serializeProjectMetadata()) {
        PROJECT_METADATA_FORMAT.write(projectMetadata, blobContainer(), snapshotId.getUUID(), compress);
    } else {
        GLOBAL_METADATA_FORMAT.write(clusterMetadata, blobContainer(), snapshotId.getUUID(), compress);
    }
}));
This is where we conditionally write `ProjectMetadata` for multi-project snapshots. The `Metadata` in this case is a thin wrapper around `ProjectMetadata` to reuse existing finalization-related classes.
private void startExecutableClones(SnapshotsInProgress snapshotsInProgress) {
    for (List<SnapshotsInProgress.Entry> entries : snapshotsInProgress.entriesByRepo()) {
        startExecutableClones(entries);
    }
}

/**
 * Maybe kick off new shard clone operations for all repositories of the specified project
 */
private void startExecutableClones(SnapshotsInProgress snapshotsInProgress, ProjectId projectId) {
    for (List<SnapshotsInProgress.Entry> entries : snapshotsInProgress.entriesByRepo(projectId)) {
        startExecutableClones(entries);
    }
}

/**
 * Maybe kick off new shard clone operations for the single specified project repository
 */
private void startExecutableClones(SnapshotsInProgress snapshotsInProgress, ProjectRepo projectRepo) {
    startExecutableClones(snapshotsInProgress.forRepo(Objects.requireNonNull(projectRepo)));
}
Snapshotting is a state machine that triggers the next operation when the current operation finishes. In most cases, the triggering is confined to the same repository. This is the simplest case and gets migrated as is. The other case is triggering across all repositories. In an MP setup, this could mean either across all repositories of all projects or across all repositories of a single project. This is the reason for the 3 variants of the same-named method here. The principles that I have applied are:
- If the scope was a single repository, keep it as is.
- If the scope was all repositories and reacting to cluster state changes, i.e. `applyClusterState`, it applies to all repositories across all projects.
- If the scope was all repositories and happening after completing a particular snapshot operation, e.g. deleting a snapshot entry, it applies to all repositories of the single project that the operation is associated with.
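To make the three scopes concrete, here is a minimal standalone sketch. It is not the actual `SnapshotsService` code: `ScopeSketch`, the string-based `ProjectRepo` stand-in, and the `entriesFor*` helper names are all hypothetical, and a plain map stands in for `SnapshotsInProgress`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for the scoping rules described above; not Elasticsearch code.
final class ScopeSketch {
    // Simplified stand-in for ProjectRepo: plain strings instead of ProjectId.
    record ProjectRepo(String projectId, String name) {}

    // Scope 1: react to a cluster state change, so visit every repository of every project.
    static List<String> entriesForAllProjects(Map<ProjectRepo, List<String>> entriesByRepo) {
        List<String> result = new ArrayList<>();
        entriesByRepo.values().forEach(result::addAll);
        return result;
    }

    // Scope 2: follow-up after an operation completes, so visit all repositories of one project.
    static List<String> entriesForProject(Map<ProjectRepo, List<String>> entriesByRepo, String projectId) {
        List<String> result = new ArrayList<>();
        entriesByRepo.forEach((repo, entries) -> {
            if (repo.projectId().equals(projectId)) {
                result.addAll(entries);
            }
        });
        return result;
    }

    // Scope 3: the simplest case, confined to a single repository of a single project.
    static List<String> entriesForRepo(Map<ProjectRepo, List<String>> entriesByRepo, ProjectRepo repo) {
        return entriesByRepo.getOrDefault(Objects.requireNonNull(repo), List.of());
    }

    public static void main(String[] args) {
        Map<ProjectRepo, List<String>> entriesByRepo = new LinkedHashMap<>();
        entriesByRepo.put(new ProjectRepo("p1", "repo-a"), List.of("clone-1", "clone-2"));
        entriesByRepo.put(new ProjectRepo("p1", "repo-b"), List.of("clone-3"));
        entriesByRepo.put(new ProjectRepo("p2", "repo-a"), List.of("clone-4"));

        System.out.println(entriesForAllProjects(entriesByRepo));
        System.out.println(entriesForProject(entriesByRepo, "p1"));
        System.out.println(entriesForRepo(entriesByRepo, new ProjectRepo("p2", "repo-a")));
    }
}
```

Note that two repositories with the same name in different projects are distinct keys, which is the point of keying by project and repository name together.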
private static Tuple<ClusterState, List<SnapshotDeletionsInProgress.Entry>> readyDeletions(
    ClusterState currentState,
    @Nullable ProjectId projectId
) {
This is another example of different scopes for triggering the next operation. In this case, there is no single-repository scope; it is either cluster-wide or scoped to a single project, depending on whether `projectId == null`.
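A nullable scope parameter of this kind can be sketched as below. This is an illustration only, under the assumption that a `null` project ID selects the cluster-wide scope and a non-null one restricts to that project; `ReadyDeletionsScopeSketch`, `Deletion`, and `inScope` are hypothetical names, not part of the PR.

```java
import java.util.List;

// Hypothetical sketch of a nullable scope filter; not the actual readyDeletions implementation.
final class ReadyDeletionsScopeSketch {
    // Simplified stand-in for a deletion entry, keyed by project and repository name.
    record Deletion(String projectId, String repoName) {}

    // Assumption: null means "all projects"; a non-null value restricts to that project.
    static List<Deletion> inScope(List<Deletion> deletions, String projectIdOrNull) {
        return deletions.stream()
            .filter(d -> projectIdOrNull == null || projectIdOrNull.equals(d.projectId()))
            .toList();
    }

    public static void main(String[] args) {
        List<Deletion> deletions = List.of(
            new Deletion("p1", "repo-a"),
            new Deletion("p2", "repo-a"),
            new Deletion("p1", "repo-b")
        );
        System.out.println(inScope(deletions, null)); // cluster-wide scope
        System.out.println(inScope(deletions, "p1")); // single-project scope
    }
}
```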
Depending on how we handle project soft-deletion and/or clean-up, snapshots may see a project getting concurrently deleted and therefore fail. This PR does not attempt to handle that more gracefully, since doing so would add noise and get buried in the large amount of namespacing changes. I can raise a separate ticket to track this work.
Yeah, we'd need a new ticket for this under the soft-deletion epic. We briefly mentioned in the design doc that once the project is marked for deletion, we should 1) prevent any new snapshots from being scheduled/requested (this partially goes back to making those internal actions check for the project deletion block), and 2) cancel any ongoing snapshotting for that project (I guess it's not that simple, but it should at least fail gracefully and not blow up). For 1, we have ES-12121. But I don't think 2 has a ticket yet.
LGTM. To my eyes these all look like straightforward changes to trickle the project ID down everywhere necessary. I don't have strong opinions about the details. That said, considering the snapshotting code is convoluted and in a delicate state, I'm gonna defer the final approval to David (or anyone else with more snapshotting experience).
if (token == null) {
    token = parser.nextToken();
}
what's this all about?
@@ -2014,7 +2028,7 @@ private void getOneSnapshotInfo(BlockingQueue<SnapshotId> queue, GetSnapshotInfo
 Exception failure = null;
 SnapshotInfo snapshotInfo = null;
 try {
-    snapshotInfo = SNAPSHOT_FORMAT.read(metadata.name(), blobContainer(), snapshotId.getUUID(), namedXContentRegistry);
+    snapshotInfo = SNAPSHOT_FORMAT.read(getProjectRepo(), blobContainer(), snapshotId.getUUID(), namedXContentRegistry);
Is `getProjectRepo()` now always using `DEFAULT` as the project name here for non-MP?
This PR makes snapshot service code and APIs multi-project compatible.
Resolves: ES-10225
Resolves: ES-10226