Skip to content

Include checksum of dependency in dub.selections.json #1723

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
no-longer-on-githu-b opened this issue Jun 6, 2019 · 14 comments
Open

Include checksum of dependency in dub.selections.json #1723

no-longer-on-githu-b opened this issue Jun 6, 2019 · 14 comments

Comments

@no-longer-on-githu-b
Copy link

no-longer-on-githu-b commented Jun 6, 2019

The selections file only contains version numbers. However, a malicious library author could repoint a tag to a different version, wreaking havoc amongst users.

For example:

  1. I add a new dependency on foo 1.0 to dub.json.
  2. I run a DUB command that causes the dependency to be downloaded, and dub.selections.json to be created.
  3. The library author includes malware in a new commit, and changes the 1.0 tag to this new commit.
  4. I clear the DUB cache and run the DUB command again, which will again download the dependency.
  5. I now have different code than I had before.

By verifying the checksum as part of the second download, this issue is mitigated: the library user needs to audit the dependency only the first time they download it.

@Geod24 Geod24 pinned this issue Aug 9, 2022
@nordlow
Copy link
Contributor

nordlow commented Aug 15, 2022

I would like to see this work for dependencies to all kinds of files including pre-built libraries (lib ...).

Then all dependencies could be queried using their content hash rather name and have them be renamed if needed.

@nordlow
Copy link
Contributor

nordlow commented Aug 15, 2022

When should the selections file be tagged with the hash and should it be automatic or hidden behind a flag? I'm thinking short term behind a flag and long term automatic.

@Geod24
Copy link
Member

Geod24 commented Aug 15, 2022

No switch. It needs to be transparent for users across versions.

@nordlow
Copy link
Contributor

nordlow commented Aug 15, 2022

No switch. It needs to be transparent for users across versions.

Ok, great. So the dub.selections.json file will be changed automatically for every dependency upon initial build?

@Geod24
Copy link
Member

Geod24 commented Aug 15, 2022

No only in the project being built. We can't change dependencies selections. We might want to consider an explicit upgrade otherwise we'll have to blindly accept the checksum I think (?).

@nordlow
Copy link
Contributor

nordlow commented Aug 15, 2022

How should the dub.selections.json be modified? What about

{
	"fileVersion": 1,
	"versions": {
		"concepts": "0.0.8"
	}
}

to

{
	"fileVersion": 1,
	"versions": {
		"concepts": {"version": "0.0.8", "sha256": "$INITIAL_FILE_CONTENT_SHA256_DIGEST"}
	}
}

?

For comparison note that we currently have, for instance

"windows-headers": {"repository":"git+https://github.com/kinke/windows-headers.git","version":"4922dbdf2057ed239c21c4b3161accb7a0695e01"},

@Geod24
Copy link
Member

Geod24 commented Aug 15, 2022

fileVersion needs to be bumped to 2.
And instead of the hash function itself I was thinking of using checksum as field name. Bear in mind we can already have objects instead of the version string, e.g. for path dependencies.

@Geod24
Copy link
Member

Geod24 commented Aug 15, 2022

Actually perhaps none. Since we already accept objects we can make the checksum optionals.

@nordlow
Copy link
Contributor

nordlow commented Aug 15, 2022

Actually perhaps none. Since we already accept objects we can make the checksum optionals.

  1. What do you mean by "accept objects"? Accept dependencies to object files? I can find any reference to objects in https://dub.pm/package-format-sdl.
  2. Ok. Can you elaborate with the contents of an example dub.selections.json?

@Geod24
Copy link
Member

Geod24 commented Aug 16, 2022

This is valid:

{
	"fileVersion": 1,
	"versions": {
		"bitblob": {"path":"submodules/bitblob/"},
		"vibe-d": {"version":"v0.9.5"},
	}
}

So the field type of an entry in versions can be an object, it doesn't have to be a string.
Since, until the last release, we are pretty lax about the format, adding a field means older versions will not care, newer versions will take it into account.

@nordlow
Copy link
Contributor

nordlow commented Aug 16, 2022

So for the hashing shall we use something like

{
	"fileVersion": 1,
	"versions": {
		"bitblob": {"path":"submodules/bitblob/", "checksum":"..."},
		"vibe-d": {"version":"v0.9.5", "checksum":"..."},
	}
}

or

{
	"fileVersion": 1,
	"versions": {
		"bitblob": {"path":"submodules/bitblob/", "sha256":"..."},
		"vibe-d": {"version":"v0.9.5", "sha256":"..."},
	}
}

? Do you prefer "checksum" or "sha256", @kinke @s-ludwig ?

@Geod24
Copy link
Member

Geod24 commented Aug 16, 2022

diff --git a/source/dub/recipe/selection.d b/source/dub/recipe/selection.d
index 097f735b..bfa50a11 100644
--- a/source/dub/recipe/selection.d
+++ b/source/dub/recipe/selection.d
@@ -26,6 +26,7 @@ private struct SelectedDependency
     @Optional @Name("version") string version_;
     @Optional string path;
     @Optional string repository;
+    @Optional string checksum;
 
     public void validate () const scope @safe pure
     {

That would be enough to ensure future-proofness.

@VPanteleev-S7
Copy link

? Do you prefer "checksum" or "sha256", @kinke @s-ludwig ?

How about using [a subset of] the Sub-Resource Integrity standard: https://www.w3.org/TR/SRI/#introduction

This should allow it to be future-proof and compatible with a variety of tools, which could be useful e.g. for packaging.

Also, what exactly will be checksummed? One simple answer is the ZIP file that one gets from https://code.dlang.org/packages/foo/1.2.3.zip. That would only be applicable to name+version dependencies. For Git dependencies, probably the commit SHA1 is enough? And for path dependencies, I don't think a checksum makes sense at all.

@VPanteleev-S7
Copy link

Also, what exactly will be checksummed? One simple answer is the ZIP file that one gets from https://code.dlang.org/packages/foo/1.2.3.zip.

Some options:

  • I believe this file is what GitHub serves for archive downloads of repository trees. The hash is mostly stable: https://github.blog/open-source/git/update-on-the-future-stability-of-source-code-archives-and-hashes/ . GitHub might make changes to the compression settings or algorithms at some point, in which case all hashes will need to be updated.

    This option is the simplest and might be the best compromise.

  • To guarantee that the hash is stable, we could uncompress the archive and hash the contents, with a custom algorithm for recursive hashing of file trees. This is what Nix does: https://nix.dev/manual/nix/2.24/command-ref/nix-hash . I would not recommend this option because in contexts where it's useful to separate dependency acquisition from building, other tools (package managers etc.) would need to implement this hashing algorithm themselves in order to re-use the hashes in dub.selections.json.

  • I am not sure but I think everything uploaded to code.dlang.org originates from Git, in which case, instead of a custom algorithm, we could use Git's to get the SHA1 of the tree. The benefit is that we would be using an existing algorithm, and the hash can be obtained directly from the Git repository (e.g. on GitHub we can clone the repository to get the tree OID, or just use the GitHub API to get a commit's tree OID). The easiest way for Dub to get the hash would be to shell out to git and put the archive contents in a temporary repository, but implementing the hashing by hand is quite simple as well, and can be done in a few dozen lines of code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants