Generic CBOR Parsing for EUDIW

**What is your use-case and why do you need this feature?**

Generic parsing of CBOR structures, analogous to `decodeFromXXX<JsonElement>(…)`, is becoming a must-have for CBOR, because the eIDAS2 regulation (commonly referred to as _EU Digital Identity Wallet_ - EUDIW) mandates the use of ISO/IEC 18013-5:2021 (this format will also be referred to as **ISO mDL**). Note that the ISO standard is behind a paywall and not freely accessible, so this issue will only quote a very short part of it.

## Detailed Technical Write-Up based on two concrete Examples
### `IssuerSignedItem` as per ISO/IEC 18013-5:2021
This data structure is used during the issuing process in the EUDIW context. Quoting ISO/IEC 18013-5:2021 Section 8.1:

> RFC 7049, section 3.9 describes four rules for canonical CBOR. Three of those rules shall be implemented for all CBOR structures as follows:
> 
> - *integers (major types 0 and 1) shall be as small as possible;*
> - *the expression of lengths in major types 2 through 5 shall be as short as possible;*
> - *indefinite-length items shall be made into definite-length items.*
> 
> The fourth rule regarding sorting of map keys is not required.

This last bit is the culprit: some properties of the `IssuerSignedItem` (and their types) depend on another property. If the fourth canonicalisation rule were enforced, the type property would occur first and deserialisation would work: once we know the type, we can choose a serialiser. Because ISO mDL does not enforce this rule, the type could be the very last property encountered during deserialisation.

Why can't we try to parse every possible type in a cascade of `try-catch` blocks? Because the types that occur in `IssuerSignedItem` may be partially parsed before an error occurs. By the time a parsing error is thrown, some of the bytes have already been consumed and are lost, so we cannot retry parsing the property at hand as another type inside the `catch` block.
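To make the failure mode concrete, here is a minimal sketch using a hypothetical forward-only reader. `ForwardReader` and `readIntPair` are illustrative names, not part of any library mentioned in this issue. The point is only that a failed attempt leaves the read position somewhere in the middle of the item.

```kotlin
// Hypothetical forward-only reader: it can advance, but it cannot rewind.
class ForwardReader(private val bytes: ByteArray) {
    var pos = 0
        private set

    fun next(): Int = bytes[pos++].toInt() and 0xFF
}

// First candidate type: a two-element array of small unsigned integers (0..23).
fun readIntPair(r: ForwardReader): Pair<Int, Int> {
    require(r.next() == 0x82) { "not a 2-element array" }
    val a = r.next().also { require(it <= 0x17) { "first element is not a small uint" } }
    val b = r.next().also { require(it <= 0x17) { "second element is not a small uint" } }
    return a to b
}

fun positionAfterFailedAttempt(): Int {
    // 0x82 0x01 0xF5 encodes [1, true]: a valid CBOR array, but not an int pair.
    val reader = ForwardReader(byteArrayOf(0x82.toByte(), 0x01, 0xF5.toByte()))
    try {
        readIntPair(reader)
    } catch (e: IllegalArgumentException) {
        // A second candidate type would have to be tried here,
        // but the reader no longer points at the start of the item.
    }
    return reader.pos
}

fun main() {
    println(positionAfterFailedAttempt()) // 3, not 0
}
```

The second candidate would start reading at offset 3 instead of offset 0, which is why the cascade cannot work without buffering or a generic intermediate representation.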

The only possible solution to this problem is currently to rely on [Obor](https://github.com/L-Briand/obor), because it enables us to
1. `decodeFromByteArray<CborObject>`
2. iterate over all properties inside a generic `CborObject` data structure
3. extract the type property
4. choose a deserialiser based on the type
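The four steps can be sketched without any library. The following is a deliberately tiny, stdlib-only generic decoder, not Obor's API and not the API of this library. All type and function names (`Cbor`, `CborReader`, `describeItem`, `sampleItem`) are hypothetical. The key names `elementIdentifier` and `elementValue` and the identifiers `age_over_18` and `family_name` are my assumption of the ISO mDL naming. The sketch ignores tags, floats and indefinite-length items.

```kotlin
// Generic CBOR tree covering definite-length major types 0-5 plus true/false/null.
sealed interface Cbor
data class CInt(val value: Long) : Cbor
data class CBytes(val value: List<Byte>) : Cbor // List gives structural equality
data class CText(val value: String) : Cbor
data class CArray(val items: List<Cbor>) : Cbor
data class CMap(val entries: Map<Cbor, Cbor>) : Cbor
data class CBool(val value: Boolean) : Cbor
object CNull : Cbor

class CborReader(private val bytes: ByteArray) {
    private var pos = 0
    private fun u8(): Int = bytes[pos++].toInt() and 0xFF
    private fun take(n: Int): ByteArray = bytes.copyOfRange(pos, pos + n).also { pos += n }

    // Big-endian unsigned integer; values above Long.MAX_VALUE overflow in this sketch.
    private fun beUInt(n: Int): Long {
        var v = 0L
        repeat(n) { v = (v shl 8) or u8().toLong() }
        return v
    }

    // Decodes the argument selected by the low five bits of the initial byte.
    private fun argument(info: Int): Long = when (info) {
        in 0..23 -> info.toLong()
        24 -> beUInt(1)
        25 -> beUInt(2)
        26 -> beUInt(4)
        27 -> beUInt(8)
        else -> error("additional info $info is not supported in this sketch")
    }

    fun read(): Cbor {
        val initial = u8()
        val major = initial shr 5
        val info = initial and 0x1F
        return when (major) {
            0 -> CInt(argument(info))
            1 -> CInt(-1 - argument(info))
            2 -> CBytes(take(argument(info).toInt()).toList())
            3 -> CText(take(argument(info).toInt()).decodeToString())
            4 -> CArray(List(argument(info).toInt()) { read() })
            5 -> CMap(buildMap {
                repeat(argument(info).toInt()) {
                    val key = read()
                    val value = read()
                    put(key, value)
                }
            })
            7 -> when (info) {
                20 -> CBool(false)
                21 -> CBool(true)
                22 -> CNull
                else -> error("simple value or float $info is not supported in this sketch")
            }
            else -> error("major type $major is not supported in this sketch")
        }
    }
}

// Steps 3 and 4: find the discriminating property first, then interpret the dependent one.
fun describeItem(encoded: ByteArray): String {
    val map = CborReader(encoded).read() as CMap
    val id = (map.entries.getValue(CText("elementIdentifier")) as CText).value
    val value = map.entries.getValue(CText("elementValue"))
    return when (id) {
        "age_over_18" -> "age_over_18=" + (value as CBool).value
        "family_name" -> "family_name=" + (value as CText).value
        else -> "$id=<unhandled>"
    }
}

// Encodes a short ASCII text string (fewer than 24 characters).
private fun text(s: String): ByteArray = byteArrayOf((0x60 + s.length).toByte()) + s.encodeToByteArray()

// A two-entry map in which the value comes BEFORE the property that determines its type.
fun sampleItem(): ByteArray =
    byteArrayOf(0xA2.toByte()) +
        text("elementValue") + byteArrayOf(0xF5.toByte()) +
        text("elementIdentifier") + text("age_over_18")

fun main() {
    println(describeItem(sampleItem())) // age_over_18=true
}
```

Because the whole map is materialised before any typed interpretation happens, the order of keys in the input no longer matters.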

#### Why this is becoming a Must-Have
[CIR 2024/2982](https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=OJ:L_202402982) Article 5 (referencing its Annex), which is part of the eIDAS2 regulation, mandates the use of ISO/IEC 18013-5:2021.  
Why is this relevant? The eIDAS2 regulation mandates every member state to implement an identity wallet solution that must be interoperable across the whole European Union. This is relevant **right now** as large-scale pilots are being carried out and the EU-wide go-live is set for [2026](https://ec.europa.eu/digital-building-blocks/sites/display/EUDIGITALIDENTITYWALLET/EU+Digital+Identity+Wallet+Home)!

Without proper support, the default CBOR format provided here will be unfit to support digital identity wallet solutions with a target audience of hundreds of millions.

### COSE Keys
COSE key parameters as per the [IANA registry for COSE Key Type Parameters](https://www.iana.org/assignments/cose/cose.xhtml#key-type-parameters) use overlapping COSE labels for different data types (e.g. `-1` could be _k_ (`bstr`), _curve_ (`int`/`tstr`), or _n_ (`bstr`)). The problem is that even when the type of a COSE key is known (e.g. _RSA_ or _EC2_), certain parameters can have different types under the same label (e.g. `-3` could be of type `bstr` or `bool`).

With **very careful**, tedious manual try-catch parsing, it is still possible to [work around the limitations of the current COSE parser](https://github.com/a-sit-plus/signum/blob/main/indispensable-cosef/src/commonMain/kotlin/at/asitplus/signum/indispensable/cosef/CoseKey.kt#L214). The workaround exploits the fact that we have a one-byte lookahead that is not advanced if the type of the value to be parsed next does not match the current byte in the byte stream. However, a slight change in the order of the try-catch blocks, such as trying to parse a property as a `bstr` first instead of an `int` (this is a random example, it might be the other way around), will consume bytes and make it impossible to recover from an error and parse the property as another type, just as is the case for `IssuerSignedItem` (see above).
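For contrast, here is a minimal sketch of how the same disambiguation could look once a key has been decoded into a generic structure. The `Map<Long, Any>` representation and the function names are hypothetical stand-ins for whatever a generic decoder would return. The label and key type numbers follow the IANA COSE registries as I understand them (label `1` is _kty_, with `1` = OKP, `2` = EC2, `3` = RSA, `4` = Symmetric).

```kotlin
// Hypothetical output of a generic decode: COSE labels mapped to already-decoded values.
typealias GenericCoseKey = Map<Long, Any>

// Label -1 means something different depending on kty (label 1).
fun describeMinusOne(key: GenericCoseKey): String {
    val kty = key[1L] as? Long ?: error("missing or non-integer kty (label 1)")
    val v = key[-1L] ?: error("label -1 is absent")
    return when (kty) {
        1L, 2L -> when (v) { // OKP and EC2: curve, either int or tstr
            is Long -> "crv (int) = $v"
            is String -> "crv (tstr) = $v"
            else -> error("unexpected type for crv")
        }
        3L -> "n (bstr), ${(v as ByteArray).size} bytes" // RSA modulus
        4L -> "k (bstr), ${(v as ByteArray).size} bytes" // symmetric key value
        else -> error("unsupported kty $kty")
    }
}

// For EC2 keys, label -3 may be a bstr (y coordinate) or a bool (sign bit of a compressed point).
fun describeEc2Y(key: GenericCoseKey): String = when (val y = key[-3L]) {
    is ByteArray -> "y coordinate, ${y.size} bytes"
    is Boolean -> "compressed point, sign bit = $y"
    null -> "label -3 is absent"
    else -> error("unexpected type for label -3")
}

fun main() {
    val ec2Key = mapOf<Long, Any>(1L to 2L, -1L to 1L, -2L to ByteArray(32), -3L to true)
    println(describeMinusOne(ec2Key)) // crv (int) = 1
    println(describeEc2Y(ec2Key))     // compressed point, sign bit = true
}
```

Each value is inspected after it has been fully decoded, so no ordering of checks can consume bytes that a later check needs.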

#### Why this is becoming a Must-Have
COSE is mandatory for ISO mDL credentials, as it specifies the implementation details of the security layer (digital signatures of credentials, encryption, etc.). The current workaround is unsustainable and hard to maintain. It may also fail to cover all legal inputs, some of which could leave the parser in an unrecoverable state.

**Describe the solution you'd like**
Merge Obor into upstream to support generic CBOR parsing. While the specifications are to blame for this mess, because they make single-pass parsing without lookahead impossible, all of those bad decisions are here to stay and will affect a potential user base of hundreds of millions by 2026 at the very latest. We either catch up or we won't be part of what is probably the single largest use case for CBOR yet, backed by a legally binding EU regulation.

