Skip to content

WARC-Cipher-Suite field proposal #86

@acidus99

Description

@acidus99

This field was previously discussed by @ato @nlevitt and @JustAnotherArchivist on an issue in a different repository. That discussion intermixed many topics like the proposed WARC-Protocol field as well as storing X.509 certificates in metadata records. Adding this issue so the idea can be properly discussed and tracked for WARC 1.1+

Proposal

The WARC-Cipher-Suite field is the TLS cipher suite which was used to retrieve any included content. The TLS cipher suite shall be written as the IANA TLS Cipher Suites Value (e.g. TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384).

WARC-Cipher-Suite = "WARC-Cipher-Suite" ":" (cipher)
cipher          = <TLS cipher suite value per IANA's TLS Parameters>

The WARC-Cipher-Suite field may be used on ‘response’, ‘resource’, ‘request’, ‘metadata’, and ‘revisit’ records, but shall not be used on ‘warcinfo’, ‘conversion’ or ‘continuation’ records.

Motivation

Storing the TLS parmeters used to retrieve content is valuable for many use cases (research, archival/postierity, troubleshooting). For example, it could provide context why a request doesn't have a corresponding response record. The proposed WARC-Protocol field is used to record the protocol version. WARC-Cipher-Suite field augments this by including what cipher suite was used. As a bonus, the IANA already defines and standardizes the values of these cipher suites, and those values are already used internally by many tools (especially for more modern ciphers).

Background

Per this thread @nlevitt and @ato both liked the idea of recording TLS protocol and cipher info in a WARC file. @nlevitt originally proposed a single custom field that would include both the TLS protocol version and cipher suite that were negotiated. However given that the WARC-Protocol field was being planned separately @ato recommended using WARC-Protocol to record the TLS protocol version and a new field to record the cipher.

Questions

  • Should the field be named WARC-Cipher-Suite to future proof for other uses beyond TLS? The WARC-Protocol field defines what protocol is used (FTP, TLS, or even a successor). This cipher suite field is an additional/optional field, applicable only when used with a WARC-Protocol value that supports encryption, recording what cipher suite was used. Baking "TLS" into the field name may cause a problem in the future. (I can't help but think of software and standards that still use the "SSL Certificate" or "SSL connection" terminology 🤮)

Edited 2023-12-19 by @ato Renamed from WARC-TLS-Cipher-Suite to WARC-Cipher-Suite as implemented by @Arkiver2 in Wget-AT and agreed to by @acidus99

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions