Feature Request: User-provided hash #3349

Open
MatrixManAtYrService opened this issue May 10, 2024 · 11 comments
Labels
feature requests I want a new feature in nvm!

MatrixManAtYrService commented May 10, 2024

I see that nvm checks nodejs versions against a copy of SHASUMS256.txt which it downloads from the same mirror that it downloads nodejs.

This verification is not without value as-is, but I've got my tin-foil-hat on and it doesn't quite scratch the itch. I'd like to hard-code a hash so that my automation will break if there's a MITM between myself and the mirror (otherwise the MITM can just tamper with SHASUMS256.txt to make the verification pass and hide whatever skulduggery they've amended node with).

I'm imagining something like:

nvm install 16.19.1 --sha256 ca63da538e02de15b7e974f7a17ce4732cc0d63023942301d30044c472ed9ddd

Please consider it. Thank you.

Member

ljharb commented May 10, 2024

Where are you getting the hash from in the first place if you can't trust nodejs.org?

@ljharb ljharb added the feature requests I want a new feature in nvm! label May 10, 2024
Author

MatrixManAtYrService commented Jul 22, 2024

nodejs.org, probably. TOFU (trust on first use) is imperfect, but it's better than not having a hash at all.

What hard-coding the hash buys me is that in order to remain undetected, an attacker would then need to MITM me (and everyone else running my code, and my CI server) consistently, every single time, without changing the payload.

If the hash does not appear in my code, an attacker can (for instance) focus on only CI. They can inject their malicious version of nodejs once, wait for it to phone home, and then once they have established that the attack is viable they can iterate on it... each time injecting the newest version of the malicious payload. They can change it and remain undetected, so long as they don't break it.

I don't know how many threat actors exist which can pull off this kind of attack at all, probably some, but even fewer are those that can pull it off consistently enough to fool every target every time, which would be necessary to remain hidden in the case where the hard-coded hash was actually the malicious one.


I ended up just getting node from `https://nodejs.org/dist/v16.13.2/node-v16.13.2-darwin-arm64.tar.gz` and hard-coding its hash. If you suspect that nobody besides me cares, it might be sensible to close this. I'd love it if my paranoia turned out to be unfounded.
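For readers who want the same effect without nvm support, here is a minimal sketch of checking a downloaded file against a hard-coded SHA-256. `sha256_of` and `verify_sha256` are hypothetical helper names, not part of nvm, and the sketch assumes either `sha256sum` or `shasum` is installed.

```shell
# Sketch: verify a file against a hard-coded SHA-256 before using it.
# sha256_of and verify_sha256 are hypothetical helpers, not part of nvm.

# Print the SHA-256 hex digest of a file, using whichever tool is available.
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# Return 0 if the file matches the expected digest, 1 otherwise.
verify_sha256() {
  file=$1
  expected=$2
  actual=$(sha256_of "$file") || return 1
  if [ "$actual" != "$expected" ]; then
    echo "sha256 mismatch for $file: expected $expected, got $actual" >&2
    return 1
  fi
}
```

In a script this would sit between the download and the unpack step, for example `verify_sha256 node-v16.13.2-darwin-arm64.tar.gz "$PINNED_SHA256" && tar -xzf node-v16.13.2-darwin-arm64.tar.gz`, where `PINNED_SHA256` holds the value committed to the repository.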

Member

ljharb commented Jul 22, 2024

I mean, nvm could certainly start caching the hashes, and report when they change - but sometimes they DO change, for legitimate reasons.

The purpose of checking the hash is to ensure the download isn't corrupted - it doesn't protect you against MITM, that's what SSL is for.

Author

MatrixManAtYrService commented Jul 23, 2024

The purpose of hard coding a hash is to ensure that if/when an input changes, you're made aware of it because the hash check fails. Forget the security stuff for a moment, that kind of awareness is useful for puzzling out causes and effects. If something suddenly breaks, and only one of its inputs has changed since the last time it succeeded, then you know what to scrutinize.

I'm aware that this degree of scrutiny is not customary for the nodejs world, which likes to resolve things at runtime and not build time, I just sometimes wish it was 😅.

Member

ljharb commented Jul 24, 2024

Do you think a hardcoded sha on the command line is necessary? Or would it work if nvm cached the sha for a version on first sight, then refused to install if the sha differed, while telling you how to manually override and accept the new sha?

Author

MatrixManAtYrService commented Jul 26, 2024

The scenario which prompted this issue is that I encountered a failure in CI where we had piped data from http to bash, but bash was failing because it was attempting to run an http error response as if it were a shell script. The thought was:

I'm glad that this merely broke, but if we're trying to run an error message this means we'd also try to run... anything else that we got back from this request. Do any of our customers have enemies that could tamper with such a thing? Would we even notice if they did? 😨

My favored solution for this is nix, which collects hashes for all of the project dependencies into a flake.lock file so that I only have to worry about dependency changes when that file changes, but my team is not bought-in on nix, so the compromise is something like this:

- http get https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
+ http get https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | save nvm.sh
+ # exit nonzero if hash doesn't match
+ cat nvm.sh | hash sha256 | grep 8e45fa547f428e9196a5613efad3bfa4d4608b74ca870f930090598f5af5f643 
+ cat nvm.sh | bash

I'm afraid that the hardcoded sha can't be dispensed with entirely, because I still have to worry about whether nvm itself has changed.

But the root of this issue is that the above approach is no good for nvm anyhow because even though I may have guaranteed that I'm getting an unchanged nvm.sh, nvm itself doesn't ensure that I'm getting an unchanged nodejs. So from that perspective, yeah, a cached sha would help.

I could imagine doing something like this on my dev machine:

$ nvm lock --arch linux-x64
  detected nodejs v16.13.2
  downloading https://nodejs.org/dist/v16.13.2/node-v16.13.2-linux-x64.tar.gz
  got sha256:a0f23911d5d9c371e95ad19e4e538d19bffc0965700f187840eb39a91b0c3fb0
  pinned nodejs v16.13.2 on linux-x64 via /Users/matt/src/myproject/.nvm.lock

And then this might happen in CI

$ nvm use 16 --check-lock
  detected system: linux-x64
  consulting /home/circleci/project/.nvm.lock ... found.  Using nodejs v16.13.2
  downloading https://nodejs.org/dist/v16.13.2/node-v16.13.2-linux-x64.tar.gz
  verifying... verified by .nvm.lock

I feel a little silly troubling you with this, especially since I've already moved on from using nvm anyhow. Your time is precious, thanks for taking some of it to listen to me.

Member

ljharb commented Jul 26, 2024

For a hash of nvm itself, it's fetched from github, you can fetch it by git SHA, and github is using SSL cert pinning, so github itself would have to be compromised for this to be a concern worth spending even 10 minutes of time solving.

So, it sounds like the reasonable path left would be for nvm to auto-cache SHAs from nodejs.org's index.tab on first sight, and if any sha has changed, to fail loudly, and tell the user to run a command (like nvm update-hash $version $newSHA or something) which would update the cached hash?
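The cache-on-first-sight behavior described here could look roughly like the sketch below. It is not nvm code: the cache file location (`HASH_CACHE`), the one-line-per-version layout, and the function names `cached_hash`, `update_hash` and `check_cached_hash` are all assumptions made for illustration.

```shell
# Sketch: trust-on-first-use cache of SHA-256 values, keyed by version.
# HASH_CACHE and all function names here are hypothetical, not nvm's.
HASH_CACHE=${HASH_CACHE:-"$HOME/.nvm-hash-cache"}

# Print the cached hash for a key, or nothing if the key is unknown.
cached_hash() {
  if [ -f "$HASH_CACHE" ]; then
    # Appending "" forces a string comparison in awk.
    awk -v k="$1" '($1 "") == (k "") { print $2; exit }' "$HASH_CACHE"
  fi
}

# Record (or replace) the hash for a key.
update_hash() {
  key=$1
  new=$2
  tmp=$(mktemp) || return 1
  if [ -f "$HASH_CACHE" ]; then
    awk -v k="$key" '($1 "") != (k "")' "$HASH_CACHE" > "$tmp"
  fi
  printf '%s %s\n' "$key" "$new" >> "$tmp"
  mv "$tmp" "$HASH_CACHE"
}

# Return 0 if the hash matches the cache, or was unseen and is now cached.
# Return 1 if it differs from the cached value; the cache is left unchanged.
check_cached_hash() {
  key=$1
  seen=$2
  known=$(cached_hash "$key")
  if [ -z "$known" ]; then
    update_hash "$key" "$seen"
    return 0
  fi
  if [ "$known" != "$seen" ]; then
    echo "hash for $key changed: cached $known, got $seen" >&2
    echo "if this is expected, run: update_hash $key $seen" >&2
    return 1
  fi
}
```

The explicit `update_hash` step mirrors the suggested `nvm update-hash $version $newSHA`: a changed hash never overwrites the cached one silently.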

Author

MatrixManAtYrService commented Jul 29, 2024

What you're describing would scratch my itch 😄

In order to benefit from this in CI, users would have to teach CI about the cache file. Proceeding without thinking about it would lead to a case where the cache was empty every time. But I think that might be a necessary evil in this case.

Another potential misstep: a user commits their cache file, generated on one system architecture, but their CI uses a different architecture. They don't see the loud failure, so they assume that the file has not changed, but actually it was a cache-miss due to the architecture mismatch, and the file has changed. To avoid this I'd suggest adding a cli arg or an env var which would disable quietly updating the cache on a miss.

Member

ljharb commented Jul 29, 2024

i'm confused - if a hash isn't in the file, then it doesn't matter whether it changed or not. are you saying that you want an option to fail loudly if you haven't pre-cached the hash?

Author

MatrixManAtYrService commented Aug 19, 2024

Yes that's what I'm suggesting. It would be analogous to how npm ci fails loudly instead of depending on uncertain packages. Although instead of an integrity hash for the packages, we'd be requiring a hash for nodejs itself.

The goal would be to avoid cases where a user is unaware that a difference in the nodejs artifact has subsequently caused a difference in behavior. Without a --fail-on-unknown-package or somesuch, it would be easy to glance at the code and believe that you're operating from fixed inputs, when actually each run is a cache miss (maybe because the CI runner is using a different architecture than you were when you committed the cached file from your dev machine).

In such a case it's easy to believe that things are deterministic, when in fact you're just getting lucky because the package source happens to be unchanging at this time. If this happens, you're in for a surprise later when the package either changes authentically and breaks things by accident (more likely), or when you get MITM'd and now it contains malware (less likely). In either case, the problem is easier to identify if it fails earlier in the chain.
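A rough sketch of the fail-on-miss behavior follows. The lock-file layout (one `version arch sha256` triple per line), the `NVM_STRICT_LOCK` variable and the `lookup_pin` function are assumptions for illustration, not existing nvm features. The point is that the lookup key includes the architecture, so a file pinned on one platform produces a visible miss on another instead of a quiet cache update.

```shell
# Sketch: look up a pinned hash by version and architecture, and treat a
# miss as an error in strict mode. The "version arch sha256" layout and
# NVM_STRICT_LOCK are hypothetical, not existing nvm features.

# Prints the pinned hash on a hit and returns 0.
# On a miss: returns 2 in strict mode, otherwise prints nothing and returns 0.
lookup_pin() {
  lockfile=$1
  version=$2
  arch=$3
  pin=
  if [ -f "$lockfile" ]; then
    # Appending "" forces string comparison in awk.
    pin=$(awk -v v="$version" -v a="$arch" \
      '($1 "") == (v "") && ($2 "") == (a "") { print $3; exit }' "$lockfile")
  fi
  if [ -n "$pin" ]; then
    printf '%s\n' "$pin"
    return 0
  fi
  if [ "${NVM_STRICT_LOCK:-0}" = "1" ]; then
    echo "no pinned hash for $version on $arch in $lockfile" >&2
    return 2
  fi
  return 0
}
```

With `NVM_STRICT_LOCK=1` set in CI, a lock file generated for `darwin-arm64` on a dev machine would make a `linux-x64` runner stop at the lookup rather than proceed with an unpinned download.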

@PAStheLoD

To add one more "datapoint" (or narrative), the ideal workflow would be something that Renovate (or Dependabot) can manage (and create MRs/PRs for the update). So it should go into .nvmrc.

corepack already puts a hash into the packageManager field, nvm could do something similar.
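As a sketch of what that could mean for `.nvmrc`, the snippet below splits an entry of the form `<version>+sha256.<hex>`, loosely modeled on the `+<algorithm>.<hex>` suffix corepack accepts in `packageManager`. This entry format and the `parse_pin` helper are purely hypothetical, and nothing in this thread suggests nvm parses such a line. A single hash also only covers one platform's tarball, so a real design would need to account for architecture.

```shell
# Sketch: split a hypothetical pinned .nvmrc entry of the form
# "<version>+sha256.<hex>" into its parts. The format is an assumption
# modeled on corepack's packageManager field, not an nvm feature.

# Sets PIN_VERSION and PIN_SHA256. PIN_SHA256 is empty if no hash is present.
parse_pin() {
  entry=$1
  PIN_VERSION=${entry%%+*}
  PIN_SHA256=
  case $entry in
    *+sha256.*) PIN_SHA256=${entry#*+sha256.} ;;
  esac
}
```

An entry without a `+sha256.` suffix leaves `PIN_SHA256` empty, so plain version strings would keep working unchanged.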
