Should gitsign attest _just_ store the signed attestation? #623
Comments
I'm not more worried about users fetching predicate data without checking the signature - it's just as easy to
IIRC the reason I set this up this way originally was:
That said, I 100% agree that we should include tooling that makes it easy to do the right thing when accessing the data and verifying it.
Hmm, I'm not sure that would work for my use case. One of the things I need to be able to do is take the signed attestations, in a portable format (e.g. signed DSSEs), and send them elsewhere, where they'll get verified by other tools that may not know anything about gitsign.
Here's some more detailed thoughts (sorry I didn't have time for this earlier)
FWIW the DSSE envelope allows multiple signatures without duplicating the content, so if you standardized on that format, this problem could be solved that way.
At the very least, in my use cases I don't intend to update old data in existing attestations. Instead I'd just be adding new things. So diffing just isn't important to me (but maybe it is to others?).
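To illustrate the point about multiple signatures, here is a minimal sketch of a DSSE envelope in which the payload appears once and each signer adds an entry to `signatures`. The `pae` function follows the DSSE pre-authentication encoding. Everything else is an assumption for illustration: `toy_sign` uses HMAC as a stand-in for a real signature scheme, and the key IDs and keys are made up. This is not how gitsign signs today.

```python
import base64
import hashlib
import hmac
import json


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the exact bytes every signer signs."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(type_bytes), type_bytes, len(payload), payload)


def toy_sign(key: bytes, message: bytes) -> str:
    # Assumption: HMAC-SHA256 stands in for a real signature algorithm.
    return base64.b64encode(hmac.new(key, message, hashlib.sha256).digest()).decode()


def make_envelope(payload_type: str, payload: bytes, keys: dict) -> dict:
    """Build one envelope; the payload is stored once, signatures are a list."""
    to_sign = pae(payload_type, payload)
    return {
        "payload": base64.b64encode(payload).decode(),
        "payloadType": payload_type,
        "signatures": [
            {"keyid": keyid, "sig": toy_sign(key, to_sign)}
            for keyid, key in keys.items()
        ],
    }


statement = json.dumps(
    {
        "_type": "https://in-toto.io/Statement/v1",
        "predicateType": "https://slsa.dev/verification_summary/v1",
    }
).encode()

# Hypothetical signers; two signatures over the same single payload.
envelope = make_envelope(
    "application/vnd.in-toto+json",
    statement,
    {"alice": b"key-a", "bob": b"key-b"},
)
```

Because both signatures cover the same encoded bytes, a second party can countersign without the content being stored twice.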
I can see how this could be an issue for folks that are just trying to sign raw data. For folks that intend to use in-toto statements this shouldn't be an issue though (and definitely wouldn't be an issue if we allowed them to just provide the statement fully formed so gitsign doesn't need to serialize it).
For folks that want to sign raw data, it might be better to not stuff it in an in-toto attestation and instead just stick it directly in a DSSE? (As an aside, I will say there's something of a problem in preserving the original blob but signing over the mutated blob: how can folks tell if the original blob was tampered with? The signature would never verify.)
But I should probably focus more on the specific things I'm trying to do, and less on cases I'm not concerned with. So here are the use cases I'm trying to solve:
With those use cases in mind, I suppose it doesn't matter much to me if there's an extra blob in there. But not having the signed in-toto attestation itself, with the content embedded within, would be an issue. Under the current scheme I suppose I can just ignore that extra blob, but I think there is still room for improvement in how the attestations themselves are stored (2), which is likely a prerequisite for addressing 3 (since it will need to know how things are organized).
One of the challenges I see with the existing approach is that the storage scheme could use a bit more definition. Right now it stores the results based on the input filename itself, overwriting any old attestations with the same filename. I think it's quite probable that there will be duplication in filenames and that folks will want to keep both sets of content. At the same time, I don't have any use cases for needing to update previous attestations (which makes diffing support unnecessary for me).
There may be a way for us to both get what we want by applying a bit more structure to the storage. Something like:
Example
This setup would allow for command lines like:
# Just get intoto attestations of a specific predicate type
$ gitsign get-attest abc123 --intoto-pred "https://slsa.dev/verification_summary/v1" --output-intoto-bundle vsa.intoto.jsonl
# Get all intoto attestations
$ gitsign get-attest abc123 --all-intoto --output-intoto-bundle abc123.intoto.jsonl
# Get all the attestations, including the 'raw' ones
$ gitsign get-attest abc123 --all --output-intoto-bundle abc123.intoto.jsonl
# Get all the attestations, but don't bundle them, just output them into a directory individually
$ gitsign get-attest abc123 --all --output-folder abc123-attest/
Sorry, that was a lot... Thoughts?
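As a rough sketch of what the proposed filtering could do internally, the snippet below selects in-toto attestations by predicate type from a list of DSSE envelopes and writes them as JSON Lines, one envelope per line. The function names (`predicate_type`, `to_bundle`), the `wrap` helper, and the sample data are hypothetical, and the `get-attest` flags above are a proposal rather than an existing gitsign interface. Note that this only selects envelopes; it does not verify signatures, which is left to whoever consumes the bundle.

```python
import base64
import json

INTOTO_PAYLOAD_TYPE = "application/vnd.in-toto+json"


def predicate_type(envelope: dict):
    """Return the predicateType of an in-toto statement, or None for 'raw' payloads."""
    if envelope.get("payloadType") != INTOTO_PAYLOAD_TYPE:
        return None
    statement = json.loads(base64.b64decode(envelope["payload"]))
    return statement.get("predicateType")


def to_bundle(envelopes, wanted_type=None) -> str:
    """Serialize matching in-toto envelopes as JSON Lines."""
    lines = []
    for env in envelopes:
        ptype = predicate_type(env)
        if ptype is None:
            continue  # skip non-in-toto ('raw') attestations
        if wanted_type is not None and ptype != wanted_type:
            continue
        lines.append(json.dumps(env, sort_keys=True))
    return "\n".join(lines) + ("\n" if lines else "")


def wrap(payload_type: str, body: dict) -> dict:
    # Hypothetical helper that builds an unsigned envelope for the demo.
    payload = base64.b64encode(json.dumps(body).encode()).decode()
    return {"payloadType": payload_type, "payload": payload, "signatures": []}


VSA = "https://slsa.dev/verification_summary/v1"
stored = [
    wrap(INTOTO_PAYLOAD_TYPE, {"predicateType": VSA}),
    wrap(INTOTO_PAYLOAD_TYPE, {"predicateType": "https://example.com/other/v1"}),
    wrap("application/octet-stream", {"note": "raw data"}),
]

vsa_bundle = to_bundle(stored, VSA)   # like the proposed --intoto-pred
all_intoto_bundle = to_bundle(stored)  # like the proposed --all-intoto
```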
FWIW as I was getting coffee I realized the problem with my 'raw' suggestion above (just sticking it in the DSSE) is that it doesn't allow the signature to cover the binding to the relevant git commit... 🤷
FWIW I'm not strongly tied to the existing storage setup, mainly wanted to provide some context as to why those choices were originally made. I'm open to storing the whole predicates, and I think the structured vs raw examples you gave could be a good trade-off here! +1 on providing more structure on the storage. I agree with moving to a hash-based filename, though I'm not 100% sold on encoding the predicate type into the path -
My guess is we may want a metadata file alongside each file to allow cheap-ish filtering without needing to load the entire predicate. (This could possibly be a single manifest? That would be a bottleneck for writes, but we already have a similar bottleneck for ref updates.) e.g.
where metadata.json looks something like:
{
  "mediaType": "application/vnd.dsse.envelope.v1+json",
  "digest": "sha256:def",
  "annotations": {
    "predicateType": "https://slsa.dev/verification_summary/v1",
    "foo": "bar"
  }
}
This way you get the flexibility of encoding the type, so we're not strongly tied to DSSE, with optional predicate type sub-data, without being constrained by encoding everything into the path. We'd also have similar behavior to cosign (while that isn't a strong requirement, it's a nice to have).
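A minimal sketch of the write side of this idea, assuming a layout of one directory per attestation named after the digest of the envelope bytes, with `attestation.json` and `metadata.json` inside. The directory layout, the file names, and the `store_attestation` function are hypothetical; only the metadata fields come from the example above.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def store_attestation(root: Path, envelope_bytes: bytes, annotations: dict) -> str:
    """Store an envelope under a digest-derived name and write its metadata."""
    digest = "sha256:" + hashlib.sha256(envelope_bytes).hexdigest()
    # Assumption: ':' is replaced so the name is a valid directory everywhere.
    entry = root / digest.replace(":", "-")
    entry.mkdir(parents=True, exist_ok=True)
    (entry / "attestation.json").write_bytes(envelope_bytes)
    metadata = {
        "mediaType": "application/vnd.dsse.envelope.v1+json",
        "digest": digest,
        "annotations": annotations,
    }
    (entry / "metadata.json").write_text(json.dumps(metadata, indent=2))
    return digest


annotations = {"predicateType": "https://slsa.dev/verification_summary/v1"}

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Two different attestations with the same predicate type do not collide.
    first = store_attestation(root, b'{"payload": "one"}', annotations)
    second = store_attestation(root, b'{"payload": "two"}', annotations)
    # Storing identical bytes again lands in the same entry.
    repeat = store_attestation(root, b'{"payload": "one"}', annotations)
    stored_dirs = sorted(p.name for p in root.iterdir())
```

Naming entries by content digest means nothing is overwritten when two inputs share a filename, which is the collision concern raised earlier in the thread.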
That all works for me. :)
Either way, there can certainly be multiple attestations, from the same or different tools, that have the same predicate type. So it's absolutely possible to have collisions. Using the digest of the thing as the name solves that. The solution you outline (having the predicate type in a metadata file instead of in the directory name) works almost as well. The disadvantage is that you can't just "get all the attestations under folder " and instead have to "parse all the metadata files to find matching predicate types". Either way the user can do the same thing though.
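The "parse all the metadata files" path might look roughly like the sketch below. It assumes one `metadata.json` per attestation directory, shaped like the example earlier in the thread; the layout, the `find_by_predicate_type` name, and the sample digests are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def find_by_predicate_type(root: Path, wanted: str) -> list:
    """Return digests of attestations whose metadata names the wanted predicate type."""
    matches = []
    for meta_path in sorted(root.glob("*/metadata.json")):
        meta = json.loads(meta_path.read_text())
        if meta.get("annotations", {}).get("predicateType") == wanted:
            matches.append(meta["digest"])
    return matches


VSA = "https://slsa.dev/verification_summary/v1"

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Made-up entries: two share a predicate type, one differs.
    samples = {
        "sha256-aaa": VSA,
        "sha256-bbb": "https://example.com/other/v1",
        "sha256-ccc": VSA,
    }
    for name, ptype in samples.items():
        entry = root / name
        entry.mkdir()
        meta = {
            "mediaType": "application/vnd.dsse.envelope.v1+json",
            "digest": name.replace("-", ":"),
            "annotations": {"predicateType": ptype},
        }
        (entry / "metadata.json").write_text(json.dumps(meta))

    vsa_digests = find_by_predicate_type(root, VSA)
```

This reads only the small metadata files, not the envelopes, so the cost is one small JSON parse per attestation rather than a directory listing.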
It currently stores both the raw predicate and the signed attestation (which also contains the predicate).
This seems redundant (users can get the predicate from the signed attestation) and potentially dangerous (users can get the predicate data without verifying the signature on the attestation).
So, perhaps it would make sense to only store the signed attestation, and provide an 'easy' way to verify the signed attestation and get the contents?
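One possible shape for such an 'easy' accessor is a function that hands back the payload only after a signature check succeeds, so there is no code path that reads the predicate unverified. This is a sketch under loud assumptions: HMAC stands in for real signature verification, the key IDs and keys are made up, and `verified_payload` is a hypothetical name, not a gitsign API.

```python
import base64
import hashlib
import hmac


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding of the signed bytes."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(type_bytes), type_bytes, len(payload), payload)


def verified_payload(envelope: dict, trusted_keys: dict) -> bytes:
    """Return the decoded payload only if some trusted key's signature verifies."""
    payload = base64.b64decode(envelope["payload"])
    message = pae(envelope["payloadType"], payload)
    for sig in envelope["signatures"]:
        key = trusted_keys.get(sig["keyid"])
        if key is None:
            continue  # signature from a signer we do not trust
        # Assumption: HMAC-SHA256 stands in for a real signature check.
        expected = hmac.new(key, message, hashlib.sha256).digest()
        if hmac.compare_digest(expected, base64.b64decode(sig["sig"])):
            return payload
    raise ValueError("no valid signature from a trusted key")


keys = {"alice": b"key-a"}  # hypothetical trusted signer
body = b'{"predicateType": "https://slsa.dev/verification_summary/v1"}'
payload_type = "application/vnd.in-toto+json"
good_sig = hmac.new(keys["alice"], pae(payload_type, body), hashlib.sha256).digest()
envelope = {
    "payloadType": payload_type,
    "payload": base64.b64encode(body).decode(),
    "signatures": [{"keyid": "alice", "sig": base64.b64encode(good_sig).decode()}],
}

contents = verified_payload(envelope, keys)

# A modified payload no longer matches the signature and is refused.
tampered = dict(envelope, payload=base64.b64encode(b'{"predicateType": "evil"}').decode())
try:
    verified_payload(tampered, keys)
    rejected = False
except ValueError:
    rejected = True
```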