Member

@fitzthum fitzthum commented Sep 30, 2025

Many people have run into problems using older KBS clients with current versions of Trustee. Although the KBS Protocol is not backwards compatible (prior to v1.0.0), in this case it is possible to support v0.2.0 clients with a relatively small set of changes. This does not imply that we will offer similar support for any other versions; it is unlikely that we will. In fact, this PR might not even be merged, but if not you can at least apply it yourself. If we do merge it, we might still drop support for v0.2.0 in the future.

The changes here are as follows.

  • Update the version check to support multiple versions. This is pretty straightforward.
  • Allow v0.4.0 attestation messages to be created from v0.2.0 attestation messages. This is also somewhat clean. See comments for details.
  • Tweak the runtime data generation to reconcile it with v0.2.0 clients which do not have composite evidence. This is the ugliest part of the PR imo.
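
As a rough illustration of the first two items, here is a minimal sketch assuming simplified message shapes. The names `SUPPORTED_VERSIONS`, `is_supported_version`, `AttestationV020` and `AttestationV040` are hypothetical stand-ins invented for this example; they are not the types or functions used in Trustee or kbs-types, and the real conversion in the PR may differ.

```rust
// Hypothetical sketch: none of these names come from Trustee or kbs-types.

/// Protocol versions the server is willing to accept (assumed list).
const SUPPORTED_VERSIONS: &[&str] = &["0.2.0", "0.4.0"];

/// Returns true if the version advertised by the client is in the list,
/// instead of comparing against a single hard-coded version.
fn is_supported_version(version: &str) -> bool {
    SUPPORTED_VERSIONS.contains(&version)
}

/// Simplified stand-in for a v0.2.0 attestation message: a single evidence blob.
#[derive(Debug, Clone, PartialEq)]
struct AttestationV020 {
    tee_pubkey: String,
    tee_evidence: String,
}

/// Simplified stand-in for a v0.4.0 attestation message with composite evidence.
#[derive(Debug, Clone, PartialEq)]
struct AttestationV040 {
    tee_pubkey: String,
    primary_evidence: String,
    additional_evidence: Vec<String>,
}

impl From<AttestationV020> for AttestationV040 {
    /// Assumption: the old single evidence becomes the primary evidence,
    /// and there is no additional (device) evidence.
    fn from(old: AttestationV020) -> Self {
        AttestationV040 {
            tee_pubkey: old.tee_pubkey,
            primary_evidence: old.tee_evidence,
            additional_evidence: Vec::new(),
        }
    }
}
```

With this shape, a handler could first reject unsupported versions and then call `.into()` on a v0.2.0 message so that the rest of the code only deals with the newer type.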

Overall I think this is probably clean enough to support, especially given how many people have run into problems here. That said, it may differ a little bit from how we implement backwards compatibility in the future. For instance we may want to consider adding versioned endpoints.

Some of these changes could also go in kbs-types, but because the KBS protocol itself does not say anything about backwards compatibility, I consider this a feature of Trustee and have implemented it here. This could be adjusted in the future.

It's also worth thinking about whether this creates any security issues (i.e. downgrading the protocol to skip attesting devices). I don't think it does, but you may want to ponder this independently.

@fitzthum fitzthum requested a review from a team as a code owner September 30, 2025 15:45
Allow v0.4.0 attestation reports to be created from v0.2.0 reports and
tweak our version check so that we can specify a list of supported
protocol versions rather than just one.

This does not imply that we have a policy of backwards compatibility.
This does not support v0.3.0 (which was never released) and it is very
unlikely that anything before v0.2.0 will ever be supported.

Some of these changes could also go in kbs-types, but because the KBS
protocol itself does not say anything about backwards compatibility,
I consider this a feature of Trustee and have implemented it here.
This could be adjusted in the future.

Signed-off-by: Tobin Feldman-Fitzthum <[email protected]>
@fitzthum fitzthum force-pushed the kbs-protocol-020-back branch from aea0915 to 0eb2b2e Compare September 30, 2025 15:46
Contributor

@mkulke mkulke left a comment

Can we have tests with 0.2 fixtures?

Member Author

fitzthum commented Sep 30, 2025

Can we have tests with 0.2 fixtures?

I mentioned this in #907 (comment). I am not sure of the best way to implement this. I think this is something that we should probably do for v1.0.0+. For v0.2.0 I could go either way.


gcoon151 commented Oct 2, 2025

I love this and hope we offer this up for very selfish reasons.

Member

lmilleri commented Oct 6, 2025

@fitzthum Probably there is more than one thing that can go wrong, not only the kbs_protocol.
I'm running a peer-pod environment in azure that should be compatible with protocol 0.2.0:

[2025-10-06T16:45:12Z ERROR kbs::error] AttestationError(RcarAuthFailed { source: deserialize Request
    
    Caused by:
        unknown variant `azsnpvtpm`, expected one of `az-snp-vtpm`, `az-tdx-vtpm`, `nvidia`, `sev`, `sgx`, `snp`, `tdx`, `cca`, `csv`, `se`, `hygondcu`, `tpm`, `sample`, `sampledevice` at line 1 column 36 })
[2025-10-06T16:45:12Z INFO  actix_web::middleware::logger] 100.64.0.6 "POST /kbs/v0/auth HTTP/1.1" 401 159 "-" "attestation-agent-kbs-client/0.1.0" 0.000143
[2025-10-06T16:45:13Z ERROR kbs::error] AttestationError(RcarAuthFailed { source: deserialize Request

Member Author

fitzthum commented Oct 6, 2025

Probably there is more than one thing that can go wrong, not only the kbs_protocol.
I'm running a peer-pod environment in azure that should be compatible with protocol 0.2.0:

Yes, it is somewhat debatable if this is in scope of the KBS Protocol. Even with this PR there are plenty of ways for things to break. In this case, it would be trivial to allow serde to deserialize this enum from either string. I suggested this on confidential-containers/guest-components#1115 and I would still like to see that implemented, but probably not in this PR.
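
A minimal sketch of accepting either spelling is shown here, written with the standard library's `FromStr` so that it has no dependencies. The `Tee` enum below is a trimmed, hypothetical stand-in rather than the real kbs-types definition; the only legacy spelling included is `azsnpvtpm`, which is the one that appears in the log above. With serde, the equivalent would presumably be a `#[serde(alias = "azsnpvtpm")]` attribute on the variant, but whether that fits the actual enum is an assumption.

```rust
use std::str::FromStr;

/// Trimmed, hypothetical stand-in for the TEE enum (not the kbs-types definition).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tee {
    AzSnpVtpm,
    AzTdxVtpm,
    Snp,
    Tdx,
}

impl FromStr for Tee {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            // Current spelling, plus the legacy spelling sent by older clients.
            "az-snp-vtpm" | "azsnpvtpm" => Ok(Tee::AzSnpVtpm),
            "az-tdx-vtpm" => Ok(Tee::AzTdxVtpm),
            "snp" => Ok(Tee::Snp),
            "tdx" => Ok(Tee::Tdx),
            other => Err(format!("unknown variant `{}`", other)),
        }
    }
}
```

Both `"azsnpvtpm".parse::<Tee>()` and `"az-snp-vtpm".parse::<Tee>()` then yield the same variant. This only addresses the name mismatch; it does nothing about differences in the evidence itself.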

Member Author

fitzthum commented Oct 6, 2025

That said there might be a deeper incompatibility in the Az evidence. If we fixed the deserialization of the enum I'm not sure if attestation would actually work on Az given some of the crate bumps. cc @mkulke

Contributor

mkulke commented Oct 7, 2025

@fitzthum Probably there is more than one thing that can go wrong, not only the kbs_protocol. I'm running a peer-pod environment in azure that should be compatible with protocol 0.2.0:

[2025-10-06T16:45:12Z ERROR kbs::error] AttestationError(RcarAuthFailed { source: deserialize Request
    
    Caused by:
        unknown variant `azsnpvtpm`, expected one of `az-snp-vtpm`, `az-tdx-vtpm`, `nvidia`, `sev`, `sgx`, `snp`, `tdx`, `cca`, `csv`, `se`, `hygondcu`, `tpm`, `sample`, `sampledevice` at line 1 column 36 })
[2025-10-06T16:45:12Z INFO  actix_web::middleware::logger] 100.64.0.6 "POST /kbs/v0/auth HTTP/1.1" 401 159 "-" "attestation-agent-kbs-client/0.1.0" 0.000143
[2025-10-06T16:45:13Z ERROR kbs::error] AttestationError(RcarAuthFailed { source: deserialize Request

Yes, this particular issue might be fixable by silently converting the strings for 0.2 evidence, but Tobin's right: there have been various sev crate bumps between releases that might have had an effect on how evidence is serialized. Also, at some point the vTPM's EKpub was added as part of the evidence, so we had to accommodate that in the verifier.

So far there has been a known-to-work permutation of kata+gc+trustee with each CoCo release. If we now want to bolt on backwards compatibility, that's a bit of a chore, but doable, I think.

In any case, to claim that backwards compatibility works (and keeps working in the future), we would need unit tests that use v0.2.0 evidence fixtures, no?

Member

bpradipt commented Oct 7, 2025

Would it make sense to target backwards compatibility from 0.4.0 onwards?
The reason is to include composite attestation, along with guidance on test cases and any other aspects that contributors need to keep in mind.
This PR would be the foundation, e.g. introducing a supported-versions struct, test cases to catch breakages, etc.

Contributor

mythi commented Oct 7, 2025

Would it make sense to target backwards compatibility from 0.4.0 onwards?

What that means needs to be defined first. A couple of corner cases were already discussed: e.g., if an attester changes the evidence format so that the verifier fails to understand it, or if the TEE name gets changed. These are opaque to the protocol, but users will see errors if Trustee/gc versions do not match. IOW, compatibility goes beyond just the KBS protocol version.

Contributor

mkulke commented Oct 7, 2025

Would it make sense to target backwards compatibility from 0.4.0 onwards? The reason is to include composite attestation, along with guidance on test cases and any other aspects that contributors need to keep in mind. This PR would be the foundation, e.g. introducing a supported-versions struct, test cases to catch breakages, etc.

Maybe like Tobin suggested, just promote 0.4.0 to 1.0.0 coupled with a formal schema that can be enforced and tested against. I personally wouldn't really advocate for that now, since it looks like we're still iterating on a lot of aspects (like should vTPM be a CPU or a device?), but if it's causing real pain for adopters, then maybe it's worth it.

Member

bpradipt commented Oct 7, 2025

Would it make sense to target backwards compatibility from 0.4.0 onwards? The reason is to include composite attestation, along with guidance on test cases and any other aspects that contributors need to keep in mind. This PR would be the foundation, e.g. introducing a supported-versions struct, test cases to catch breakages, etc.

Maybe like Tobin suggested, just promote 0.4.0 to 1.0.0 coupled with a formal schema that can be enforced and tested against. I personally wouldn't really advocate for that now, since it looks like we're still iterating on a lot of aspects (like should vTPM be a CPU or a device?), but if it's causing real pain for adopters, then maybe it's worth it.

For adopters, we can start publishing a version matrix of guest-components and trustee. This will avoid surprises.

As for compatibility, we can start defining what it means from the standpoint of the protocol, evidence format, attester names, new attesters, etc., as @mythi mentioned, and then have these checks in the CI to catch breakages.

Before the end of this year, we will most likely have one more release (0.5.0). With the foundation in place, we can think about making that release 1.0.0.

Contributor

mkulke commented Oct 7, 2025

For adopters, we can start publishing a version matrix of guest-components and trustee. This will avoid surprises.

that's a good idea, I think we can do this for a few releases back. I think whatever was defined in versions.yaml of kata for a given CoCo release was authoritative, since this was used in e2e tests and also picked up by peerpods.
