Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: Improve supply chain security for WASM binary #48

Closed

Conversation

bcheidemann
Copy link

Addresses #17

Overview

This PR refactors the build and deployment pipeline with the aim of mitigating supply chain security concerns relating the the version control / review-ability of the WASM binary.

Motivation

In the current build and deployment pipeline, the SWC source code and compiled WASM are both committed. The release workflow implicitly trusts the WASM binary, without verifying that it is the output of the SWC source code. The compiled WASM code is difficult to review or verify, particularly without re-building from source, which would be somewhat redundant.

As such, a malicious but trusted maintainer (vis a vis the recent XZ exploit) could commit arbitrary WASM code during SWC version upgrades. The malicious WASM can inject arbitrary code into the compiled output, which cannot be detected automatically.

In part because amaro will be / is included as a component of the NodeJS project, the impact of a malicious WASM binary making it into amaro could be wide reaching and severe. This would likely make it a high value target for future social engineering exploits similar to XZ.

Existing Build / Release Workflow

flowchart TD
    A[Copy SWC Source into Repo] -->|commit changes| B
    B[Build WASM Binary] -->|commit changes| C
    C[Raise PR] -->|merge PR| D
    D[Bundle Code] --> E
    E[Release to NPM]
Loading

Proposed Solution

Since neither the WASM binary, nor the SWC source code are "owned" by this repo, they do not necessarily need to be version controlled in this repo. This is especially true of the WASM binary, which should be treated as a build output, not as version controlled code. The upstream SWC source code does need to be version controlled, but not necessarily in this repo.

The proposed solution comes in two independent parts.

1. Use submodules for SWC code

flowchart TD
    A[Copy SWC Source into Repo] -->|commit changes| B
    A2[Checkout Upstream SWC Tag] -->|commit changes| B
    B[Build WASM Binary] -->|commit changes| C
    C[Raise PR] -->|merge PR| D
    D[Bundle Code] --> E
    E[Release to NPM]

style A fill:#f99

style A2 fill:#9f9
Loading

Instead of copying the SWC code into the repo and version controlling it, we can use git submodules. This somewhat simplifies our update script, and ensures that the SWC code used to compile the binary can't deviate from that in the upstream SWC repo. In my view, this doesn't have a huge impact on supply chain security, and it does have some tradeoffs (discussed below).

2. Build from source when publishing

flowchart TD
    A[Copy SWC Source into Repo] -->|commit changes| B
    B[Build WASM Binary] -->|commit changes| C
    C[Raise PR] -->|merge PR| D
    C[Raise PR] -->|merge PR| A3
    A3[Build WASM from Source] --> D
    D[Bundle Code] --> E
    E[Release to NPM]

style B fill:#f99

style A3 fill:#9f9
Loading

Instead of committing the built WASM binary to the repo, it can be built from source when publishing. This means that the built WASM code doesn't need to be verified or reviewed (beyond running automated tests against it). This essentially makes it impossible to publish a modified WASM binary, without changes to the publish workflow, which would hopefully be picked up on review.

1 + 2

Overall, the changes would be something like this:

flowchart TD
    A[Copy SWC Source into Repo] -->|commit changes| B
    B[Build WASM Binary] -->|commit changes| C
    A2[Checkout Upstream SWC Tag] -->|commit changes| C
    C[Raise PR] -->|merge PR| D
    C[Raise PR] -->|merge PR| A3
    A3[Build WASM from Source] --> D
    D[Bundle Code] --> E
    E[Release to NPM]

style A fill:#f99
style B fill:#f99

style A2 fill:#9f9
style A3 fill:#9f9
Loading

And the final workflow would look like this:

flowchart TD
    A2[Checkout Upstream SWC Tag] -->|commit changes| C
    C[Raise PR] -->|merge PR| A3
    A3[Build WASM from Source] --> D
    D[Bundle Code] --> E
    E[Release to NPM]
Loading

Tradeoffs

Use of submodules

In the current iteration of this PR, we switched from committing SWC source code directly to the repo to using submodules. This has some implications which in aggregate are not overall clear cut good or bad.

One possible issue is on the impact to reviewing upstream SWC code changes. Currently, changes can be easily reviewed in version upgrades, as they are visible in the PR. Using submodules, the changes are still visible (at least in GitHub), but because git doesn't support "narrow clones", the submodule approach means we will see all SWC changes in upgrade PRs (not just those affecting the binding_typescript_wasm crate). This means reviewers need to know which parts of the diff can be safely ignored.

On the other hand, if we commit the parts of SWC we want to the repo, this leaves the door open for subtle modifications to the SWC code, which may be missed on review. Arguably, this is also true of the upstream SWC repo, but when using submodules we benefit from a double review (once in the upstream SWC repo and once in Amaro).

Another impact of using submodules is that we cannot easily "hotfix" SWC. I don't believe hotfixing SWC code in Amaro would (almost) ever be desirable, as they would not benefit from the review of the SWC maintainers, and may be implicitly trusted by reviewers in Amaro. None the less, this is a limitation of submodules which would need to be accepted.


I'd love some feedback on this, but please note the changes are currently only partly tested due to the limitations of my ✈️ WiFi.

@marco-ippolito
Copy link
Member

Can you rebase please, it seems like there are a lot of conflicts

@marco-ippolito
Copy link
Member

marco-ippolito commented Aug 12, 2024

now that versions update are automated by bots, how can someone commit untrusted code? also we use cargo install --locked and checked with cargo-deny

@bcheidemann bcheidemann marked this pull request as ready for review August 12, 2024 13:48
@bcheidemann
Copy link
Author

now that versions update are automated by bots, how can someone commit untrusted code? also we use cargo install --locked and checked with cargo-deny

Not 100% sure I follow, sorry! Are you talking about the upstream nodejs:main branch?

If so, I see that there is a GitHub actions workflow to generate a PR to update SWC and the WASM binary. I think this is preferable to manually updating SWC, but I feel that we can do better.

With the proposed changes (this PR), a malicious change would need to be committed as Rust source code to SWC, pass review from the SWC maintainers, and then pass a second review in Amaro, before finally being published.

Currently, since use of the current GHA workflow is not (automatically) enforced, someone could simply raise an update PR manually and include a malicious binary. Hopefully this would be spotted and rejected, but it could be missed.

@marco-ippolito marco-ippolito added the stale PRs or issues that have been stale label Dec 16, 2024
@marco-ippolito
Copy link
Member

We have created a standard docker image where we build the wasm and monitor changes with cargo deny.
I think this is enough for our used case since swc also tests for supply chain attacks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale PRs or issues that have been stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants