Release for Mozilla: Shared Trust Domain & Workers #33
Changes from all commits
c1c9b7e
ce3d1ca
a722599
b1ff95c
de2a442
5c21de6
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,78 @@ | ||
| # RFC 33 - Release for Mozilla: Shared Trust Domain & Workers | ||
| * Comments: [#33](https://github.com/mozilla-releng/releng-rfcs/pull/33) | ||
| * Proposed by: @bhearsum | ||
|
|
||
| # Summary | ||
|
|
||
| Build and maintain a shared Trust Domain, Workers, and Scriptworkers on the Firefox CI cluster that any Mozilla project can use. Browser products will remain in their existing - separate - trust domain. | ||
|
|
||
| ## Motivation | ||
|
|
||
| One of the barriers to entry for using Taskcluster is waiting for RelEng to create and deploy a new Trust Domain and Worker for a new project. Even when this takes less than a day (and it often takes longer), the project is blocked in the meantime, which makes onboarding slower than with CircleCI or GitHub Actions. | ||
|
|
||
| # Details | ||
|
|
||
| We will create a new trust domain and workers that are generally available for Mozilla employees and trusted volunteers to use. Specifically: | ||
|
|
||
| * A new Trust Domain (`mozilla`) that is not tied to a specific project or product | ||
| * New Workers for builds on Linux, macOS 10.15, and Windows Server 2012 | ||
| * A RelEng maintained Docker image will be provided for Linux | ||
| * These will be created under a new `mozilla-1` provisioner | ||
| * New Workers for tests on Linux, macOS 10.15, and Windows 10 | ||
| * A RelEng maintained Docker image will be provided for Linux | ||
| * These will be created under a new `mozilla-t` provisioner | ||
| * New Scriptworker instances for signing and mac-signing | ||
|
Contributor
I'm a bit on the fence about this. On the one hand, I realize that the intent here is to make it easier for us to stand things up, especially for things like mac signers. On the other, maybe when a project reaches the point of needing scriptworkers, then that is a signal that it is time for them to graduate to their own trust domain.
Contributor
Also, if we're only supporting L1 scriptworkers on this pool, then is it really saving us any work? We'll still need to wrangle mac signers for the L3 pool anyway.
Contributor
Author
I think this loses a fair amount of value if we can't have signing as part of this. We'd have to do one of the following pretty quickly:
A potentially crazier idea could be to allow most tasks to run in the shared pool, but somehow put signing in its own pool (which means tasks in the same graph would have two trust domains, which would probably mean CoT changes). This would at least mean that signing would be purely adding to existing things, rather than getting blocked on a migration to a different pool. |
||
| * These will be created under the existing `scriptworker-k8s` and `scriptworker-prov-v1` provisioners | ||
| * Workers will be prefixed with `mozilla-t-` | ||
| * mac-signing will run 10.14, like our other mac-signing workers (there's no known reason to upgrade) | ||
|
|
||
| Notably, we are only concerned with level 1 workers at this time, which means we can ignore things like scriptworkers that are only used when shipping. Level 3 workers will be dealt with at a later stage, and most likely will not use a shared trust domain or workers across projects. | ||
|
Contributor
As a sort of follow-on to my previous comment, what if instead of disallowing L3 on the shared pools, we instead disallow scriptworkers? Then once the project is a bit more mature, we would graduate them to their own trust domain and add the scriptworker tasks. This way they could get started with Taskcluster right away, and have a proper PR == L1 and push == L3 setup. Security-wise, the setup wouldn't be ideal, but it would still be better than GH Actions, or doing signing on personal dev machines. It kind of reminds me of the autonomous driving quote: it doesn't have to be perfect, it just has to be better than manual.
Contributor
Author
I think this is a great idea, independent of what we do with any other part of this! I can't think of any downsides to separating PRs and pushes. |
||
|
|
||
| Access to create and manage tasks on these new workers will be granted to anyone with `scm_level_1_github` or `scm_level_1`. | ||
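The access rule above amounts to a membership check against two groups. A minimal sketch, assuming group names are available as plain strings; the function name `may_use_shared_pool` is hypothetical, and the RFC does not say how these groups map onto Taskcluster roles or scopes:

```python
# Groups named in the RFC as sufficient for access to the shared workers.
ALLOWED_GROUPS = frozenset({"scm_level_1_github", "scm_level_1"})


def may_use_shared_pool(user_groups) -> bool:
    """Return True if the user holds at least one of the allowed groups.

    Hypothetical helper for illustration only; `user_groups` is any
    iterable of group-name strings.
    """
    return not ALLOWED_GROUPS.isdisjoint(user_groups)
```

Holding either group is enough; holding neither denies access.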
|
|
||
| Going forward, we will ensure workers for other supported build or target platforms are added to this pool. (For example, when we add support for scheduling iOS tests in Taskcluster, that will be made available in the `mozilla-t` provisioner as well.) | ||
|
|
||
| ## List of Pools | ||
|
|
||
| Pool ID | Purpose | ||
| ==================================================================================== | ||
| mozilla-1/linux | Linux jobs | ||
| mozilla-1/linux-highcpu | Linux jobs requiring more CPU resources | ||
| mozilla-1/win2012 | Windows Server 2012 jobs | ||
| mozilla-1/win2012-highcpu | Windows Server 2012 jobs requiring more CPU resources | ||
| mozilla-1/win10 | Windows 10 jobs | ||
| mozilla-1/macos-bigsur | macOS 10.15 jobs | ||
| mozilla-t/signing | Non-mac signing jobs | ||
| mozilla-t/mac-signing | Mac signing jobs | ||
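Each pool ID in the table has the form `<provisioner>/<worker type>`. The sketch below shows one way a project might turn a pool ID into the `provisionerId` and `workerType` fields of a task definition. The pool list is copied from the table; the helper names `split_pool_id` and `task_routing_fields` are hypothetical, and nothing here is taken from taskgraph or ci-configuration:

```python
from __future__ import annotations

# Pool IDs and purposes as listed in the RFC table.
SHARED_POOLS = {
    "mozilla-1/linux": "Linux jobs",
    "mozilla-1/linux-highcpu": "Linux jobs requiring more CPU resources",
    "mozilla-1/win2012": "Windows Server 2012 jobs",
    "mozilla-1/win2012-highcpu": "Windows Server 2012 jobs requiring more CPU resources",
    "mozilla-1/win10": "Windows 10 jobs",
    "mozilla-1/macos-bigsur": "macOS 10.15 jobs",
    "mozilla-t/signing": "Non-mac signing jobs",
    "mozilla-t/mac-signing": "Mac signing jobs",
}


def split_pool_id(pool_id: str) -> tuple[str, str]:
    """Split '<provisioner>/<worker type>' into its two parts."""
    provisioner, sep, worker_type = pool_id.partition("/")
    if not sep or not provisioner or not worker_type:
        raise ValueError(f"not a valid pool ID: {pool_id!r}")
    return provisioner, worker_type


def task_routing_fields(pool_id: str) -> dict[str, str]:
    """Return the task-definition fields that route a task to a shared pool."""
    if pool_id not in SHARED_POOLS:
        raise KeyError(f"unknown shared pool: {pool_id!r}")
    provisioner, worker_type = split_pool_id(pool_id)
    return {"provisionerId": provisioner, "workerType": worker_type}
```

For example, `task_routing_fields("mozilla-1/linux")` yields the provisioner `mozilla-1` and the worker type `linux`.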
|
|
||
| ## Hardware Machine Allocation | ||
|
|
||
| We will need hardware for 3 different pools, which will be allocated as noted below: | ||
| * 2 machines for macOS signing, allocated from the existing production Firefox pool | ||
| * 3 machines for macOS builds, allocated from TBD | ||
| * 3 machines for macOS tests, allocated from TBD | ||
|
|
||
| When additional workers are needed in the future, they will be allocated from TBD. | ||
|
|
||
| ## `v3` Taskcluster index format | ||
|
|
||
| The current `v2` index format identifies a project by repository name only (not user or organization). This is generally not an issue for any of the current trust domains (because they generally only support one project), but in this new pool, where we have N projects, it introduces the potential for collisions or pollution between them. To ensure this isn't an issue, we will introduce a new `v3` index format that also includes the repository location in its path, including both domain and path. Examples include: | ||
|
Contributor
The mobile repos define a This is then used in the index route. I think we could simply define this as
Contributor
Author
Sounds fine to me - good idea! |
||
| * `index.mozilla.v3.github.com.mozilla-mobile.mozilla-vpn-client.branch.main.latest.taskgraph.decision` | ||
| * `index.gecko.v3.hg.mozilla.org.releases.mozilla-beta.latest.firefox.decision` | ||
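To make the collision concern concrete, here is a minimal sketch of how a `v3` prefix could be derived from a repository URL, next to a repository-name-only prefix for comparison. Both function names are hypothetical, the `v2` shape is an assumption based on the description above, and the `org-a` and `org-b` repositories in the usage note are made up. The sketch covers only the prefix up to the repository path, not the trailing parts of the example routes (such as `branch.main.latest.taskgraph.decision`):

```python
from urllib.parse import urlsplit


def v3_index_prefix(trust_domain: str, repo_url: str) -> str:
    """Build a v3-style prefix: index.<trust domain>.v3.<host>.<path segments>.

    Hypothetical helper; the actual implementation is not defined by the RFC.
    """
    parts = urlsplit(repo_url)
    host = parts.hostname or ""
    path = parts.path.strip("/")
    if path.endswith(".git"):
        path = path[: -len(".git")]
    # Each path component becomes one dotted segment of the index path.
    segments = [s for s in path.split("/") if s]
    return ".".join(["index", trust_domain, "v3", host, *segments])


def v2_style_prefix(trust_domain: str, repo_url: str) -> str:
    """Repository name only (assumed shape of the v2 identifier)."""
    name = urlsplit(repo_url).path.strip("/").split("/")[-1]
    return ".".join(["index", trust_domain, "v2", name])
```

With two made-up repositories, `https://github.com/org-a/tool` and `https://github.com/org-b/tool`, the name-only prefix is identical for both, while the `v3` prefix keeps them apart because the host and organization are part of the path.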
|
|
||
| This will require changes to a few things to support the new format: | ||
| * build-decision | ||
|
Contributor
What build-decision changes were you anticipating? I'd guess any projects using this will be on Github.
Contributor
Author
I...I have no idea at this point. Maybe I was thinking that some new projects might not be on Git. We can probably remove/ignore this. |
||
| * scriptworker | ||
| * taskgraph | ||
|
|
||
| Existing users of Taskcluster will not be required to upgrade to the v3 format. | ||
|
|
||
| # Open Questions | ||
|
|
||
| # Implementation | ||
|
|
||
| <once the RFC is decided, these links will provide readers a way to track the | ||
| implementation through to completion> | ||
|
|
||
| * <link to tracker bug, issue, etc.> | ||
| * <...> | ||