RFC 35: Easy hg/github → hg.mozilla.org syncing #35
Conversation
> # Details
>
> Because this RFC is scoped to only support Phabricator-enabled repositories, we can ignore some of the usual security concerns (we won't have high-value credentials, just a Phabricator token). This also means we don't need to use something as complex as a scriptworker; docker-worker should be perfectly fine for this use case. Most of the implementation work will be gluing existing things together. At a high level, we need a Task that:
---
If we lose this token, then something or someone can create phab patches that look like valid sync task output, but contain something arbitrary or malicious. That doesn't necessarily mean we need to guard the token as tightly as release signing secrets, but we may want something to verify that the phab patch matches the vendored code it's supposed to come from.
We already have a workflow like this in updatebot; we may want to look at reusing some of the same workflows. AIUI, updatebot creates a vendored-code patch in phab, someone reviews and merges that phab patch, and tests run against that revision to verify that, if we rerun the automation to generate the patch from the original source repo and revision, we get the exact same patch.
Also, AIUI updatebot is restricted to only changing code inside a vendored subtree, and is not able to change the code that verifies the vendored code.
However, I think there are still things we'd have to do in order to use updatebot, namely have a scripted process where updatebot can generate the patches and verify them, at least. I don't know enough about it to say how fully automated it is, and how much of it is tjr and company pushing it along. Worth looking into.
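Roughly, that re-run-and-compare verification could look something like the sketch below. This is only an illustration: `generate_vendor_patch.py` is a hypothetical stand-in for whatever updatebot (or this sync task) actually runs to produce the patch.

```python
#!/usr/bin/env python3
"""Sketch of the re-run-and-compare verification described above.
The automation entry point and argument layout are hypothetical."""
import subprocess
import sys

# Usage (illustrative): verify.py <source_repo_url> <source_rev> <submitted_patch_file>
source_repo, source_rev, submitted_patch = sys.argv[1:4]

# Re-run the vendoring automation against the recorded upstream repo/revision.
regenerated = subprocess.run(
    [sys.executable, "generate_vendor_patch.py", source_repo, source_rev],
    capture_output=True, text=True, check=True,
).stdout

with open(submitted_patch) as f:
    submitted = f.read()

# Verification passes only if the regenerated patch is byte-for-byte identical
# to the patch under review.
if regenerated != submitted:
    sys.exit("Patch does not match a clean re-run against %s@%s" % (source_repo, source_rev))
print("Patch matches the regenerated output; verification passed.")
```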
---
This is a really good point; I somehow managed to forget about updatebot when writing this 🙃
> * Unpacks it
> * Clones the target repository
> * Copies all or some of the contents to a target repository
> * Runs `moz-phab` to create a Phabricator revision
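To make the proposed flow concrete, here is a minimal sketch of what such a task's script could look like. The artifact location, target repository, and destination subtree are hypothetical placeholders, and a real task would need error handling plus a non-interactive moz-phab configuration.

```python
#!/usr/bin/env python3
"""Minimal sketch of the proposed sync task (hypothetical names and paths)."""
import os
import shutil
import subprocess
import tarfile
import urllib.request

artifact_url = os.environ["SYNC_ARTIFACT_URL"]           # artifact from the upstream (GitHub) task
target_repo = "https://hg.mozilla.org/mozilla-unified"   # hypothetical target repository
target_subdir = "gfx/pdfjs"                              # hypothetical vendored subtree

# Fetch and unpack the artifact produced by the upstream task.
urllib.request.urlretrieve(artifact_url, "sync.tar.gz")
with tarfile.open("sync.tar.gz") as tar:
    tar.extractall("artifact")

# Clone the target repository.
subprocess.run(["hg", "clone", target_repo, "target"], check=True)

# Copy all or some of the artifact contents into the target subtree.
dest = os.path.join("target", target_subdir)
shutil.rmtree(dest, ignore_errors=True)
shutil.copytree("artifact", dest)

# Commit and hand off to moz-phab to create the Phabricator revision.
subprocess.run(["hg", "addremove"], cwd="target", check=True)
subprocess.run(["hg", "commit", "-m", "Import upstream changes"], cwd="target", check=True)
subprocess.run(["moz-phab", "submit"], cwd="target", check=True)  # may need flags for non-interactive use
```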
---
- tests may or may not run in the phab reviewbot run to verify the patch matches the original source tree and revision
- tests definitely do run on push to autoland to verify the patch matches the original source tree and revision
> * Copies all or some of the contents to a target repository
> * Runs `moz-phab` to create a Phabricator revision
>
> Choosing to take an artifact from an upstream Task rather than simply copying files from a source repository clone is notable, as it provides the flexibility for projects to run build or other preprocessing steps before syncing is completed (which is a known use case for both PDF.js and NSS). If use cases for simpler copies present themselves in the future, we may add support for that operation as well, but it is not necessary at this time.
---
As long as this artifact is reproducible, and there's a good way for us to reproduce it in autoland/reviewbot, I'm ok with that. If it isn't reproducible, we may be looking at a CoT-like solution to verify the original artifact was built in trusted workers from a trusted repo and revision.
---
Is this really a requirement, or a nice-to-have? The alternative we're comparing to is people doing similar things on their laptops.
---
I think if we're doing it properly, not allowing a single bad actor to pwn m-c is preferred.
---
I think if we do updatebot or similar from the gecko trust domain, cloning the source repo and running a script, then we can use CoT to verify the artifact is from a trusted tree/worker, and we can worry a bit less about reproducibility.
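Concretely, the reproducibility check discussed here could be as simple as rebuilding the artifact from the recorded repo/revision and comparing digests. The build entry point and artifact path below are made up for illustration.

```python
#!/usr/bin/env python3
"""Sketch of a reproducibility check: rebuild the artifact from the recorded
source repo/revision and compare digests against the artifact that was synced."""
import hashlib
import subprocess
import sys

# Usage (illustrative): check_repro.py <source_repo_url> <source_rev> <original_artifact>
source_repo, source_rev, original_artifact = sys.argv[1:4]

def sha256(path):
    """Return the hex sha256 digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# Clone the upstream repo at the recorded revision and rerun its build step.
subprocess.run(["git", "clone", source_repo, "source"], check=True)
subprocess.run(["git", "checkout", source_rev], cwd="source", check=True)
subprocess.run(["./build.sh"], cwd="source", check=True)  # hypothetical build entry point

rebuilt = "source/dist/sync.tar.gz"  # hypothetical artifact path
if sha256(rebuilt) != sha256(original_artifact):
    sys.exit("Artifact is not reproducible; CoT-style provenance checks would be needed instead.")
print("Rebuilt artifact matches the original digest.")
```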
> # Open Questions
>
> * Does this enable single-actor attacks? I.e., a privileged person in the PDF.js repo gets automation to create a Phab revision, then reviews it themselves and lands it?
> * I think people can already land without review, so maybe it's effectively no different?
---
I think so, as written now (before my comments above):
- someone extracts the phab token, somehow
- someone creates a phab patch that looks like valid sync task output, but contains malicious contents
- waits for someone to review it and lando it, or they review it and lando it themselves
- no tests run, and we've pwned central
---
> I think people can already land without review, so maybe it's effectively no different?
If they do so, it probably looks like they landed their own patch with an r=me, and that patch is likely to draw some inspection from other developers.
If they follow this approach, it looks like they reviewed a valid vendoring patch, and these patches can be large enough that finding the badness is a needle-in-a-haystack exercise without the verification task.
Also, people can land without review if they're level 4, which is a restricted list. A leaked phab token could essentially open this up to all level 3 users.
---
It's a good point that the optics of someone landing their own patch are very different here. I definitely agree that it's less likely to be noticed as something alarming.
---
escapewindow left a comment:
Marking as "request changes" because I think we can't proceed without changes.
> * Unpacks it
> * Clones the target repository
> * Copies all or some of the contents to a target repository
> * Runs `moz-phab` to create a Phabricator revision
---
A key safety feature here is that none of the just-modified code can be executed during the landing process. It's not clear to me whether moz-phab enforces that rule.
Related: I'd like to see some hard-coded restrictions on which parts of the repository are allowed to be updated via this process. This type of process should not be able to update any of the build code (including taskgraphs), nor modify any file that will be passed to eval() type functions.
If these precautions are possible, please document them as requirements. If not, how else can we protect against that?
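For what it's worth, a subtree restriction like that could be enforced by a small check in the task before moz-phab runs. The allowed prefixes and denied patterns below are purely illustrative, not a real policy.

```python
#!/usr/bin/env python3
"""Sketch of an "only this subtree" restriction on the sync task's patch.
The allowed and denied path lists are illustrative placeholders."""
import subprocess
import sys

ALLOWED_PREFIXES = ("security/nss/", "gfx/pdfjs/")   # hypothetical vendored subtrees
DENIED_PATTERNS = ("moz.build", "taskcluster/")      # build/CI code that must not change

# List the files touched in the working directory (hg status against the parent).
changed = subprocess.run(
    ["hg", "status", "--modified", "--added", "--removed", "--no-status"],
    capture_output=True, text=True, check=True,
).stdout.splitlines()

bad = [
    path for path in changed
    if not path.startswith(ALLOWED_PREFIXES)
    or any(pattern in path for pattern in DENIED_PATTERNS)
]
if bad:
    sys.exit("Refusing to submit; files outside the permitted subtree: %s" % ", ".join(bad))
print("All %d changed files are inside the permitted subtree." % len(changed))
```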
---
> This type of process should not be able to update any of the build code (including taskgraphs), nor modify any file that will be passed to eval() type functions.
I'm not sure this is practical, if I'm understanding correctly. As an example, the NSS import into mozilla-central includes files like https://hg.mozilla.org/mozilla-central/file/tip/security/nss/build.sh, which can obviously eval and run arbitrary code. Is this something you would want to be denied landing through this system?
---
> Is this something you would want to be denied landing through this system?
Yes, unless we can think of sufficient "other layers" to mitigate the risk. Combined with the SaC & approve-own-patch issue, I'm feeling less comfortable. :/
You may want to ask for an early RRA/consultation -- I'm unsure how the "risks of new approach" compare to the "risks of current system".
> * Does this enable single-actor attacks? I.e., a privileged person in the PDF.js repo gets automation to create a Phab revision, then reviews it themselves and lands it?
> * I think people can already land without review, so maybe it's effectively no different?
---
How will use of this feature be restricted? It's one thing to have it used in vetted situations; it's quite another to allow general usage. The "only in this subtree" restriction suggested above is one level of protection, but are any more planned?
I'm assuming this would only run on level 3 builders "for real", but I don't know how that impacts testing, etc.