Description
This field was previously discussed by @ato @nlevitt and @JustAnotherArchivist on an issue in a different repository. That discussion intermixed many topics like the proposed WARC-Protocol field as well as storing X.509 certificates in metadata records. Adding this issue so the idea can be properly discussed and tracked for WARC 1.1+
Proposal
The WARC-Cipher-Suite field is the TLS cipher suite which was used to retrieve any included content. The TLS cipher suite shall be written as the IANA TLS Cipher Suites Value (e.g. TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384).
WARC-Cipher-Suite = "WARC-Cipher-Suite" ":" (cipher)
cipher = <TLS cipher suite value per IANA's TLS Parameters>
The WARC-Cipher-Suite field may be used on ‘response’, ‘resource’, ‘request’, ‘metadata’, and ‘revisit’ records, but shall not be used on ‘warcinfo’, ‘conversion’ or ‘continuation’ records.
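As a rough illustration of the rules above, a WARC writer might emit the field along these lines. This is only a sketch: the function name `cipher_suite_header` and the regular expression are assumptions made for illustration, and the pattern is a loose syntactic check on the name, not a lookup against the IANA registry.

```python
import re

# Record types on which the proposal allows the field.
ALLOWED_RECORD_TYPES = {"response", "resource", "request", "metadata", "revisit"}

# Assumption: IANA cipher suite names start with "TLS_" and consist of
# letters, digits and underscores (e.g. TLS_DH_anon_WITH_AES_128_CBC_SHA).
# This checks only the shape of the name, not registry membership.
_CIPHER_NAME = re.compile(r"TLS_[A-Za-z0-9_]+")


def cipher_suite_header(record_type: str, cipher: str) -> str:
    """Return a WARC-Cipher-Suite header line (hypothetical helper)."""
    if record_type not in ALLOWED_RECORD_TYPES:
        raise ValueError(
            f"WARC-Cipher-Suite shall not be used on {record_type!r} records"
        )
    if not _CIPHER_NAME.fullmatch(cipher):
        raise ValueError(f"not an IANA-style cipher suite name: {cipher!r}")
    return f"WARC-Cipher-Suite: {cipher}"


if __name__ == "__main__":
    print(cipher_suite_header("response", "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384"))
```

Note that some TLS libraries report cipher suites under their own naming scheme rather than the IANA name, so a writer may need a mapping step before calling something like this.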
Motivation
Storing the TLS parameters used to retrieve content is valuable for many use cases (research, archival/posterity, troubleshooting). For example, it could provide context for why a request doesn't have a corresponding response record. The proposed WARC-Protocol field is used to record the protocol version. The WARC-Cipher-Suite field augments this by recording which cipher suite was used. As a bonus, the IANA already defines and standardizes the values of these cipher suites, and those values are already used internally by many tools (especially for more modern ciphers).
Background
Per this thread, @nlevitt and @ato both liked the idea of recording TLS protocol and cipher info in a WARC file. @nlevitt originally proposed a single custom field that would include both the TLS protocol version and the cipher suite that were negotiated. However, given that the WARC-Protocol field was being planned separately, @ato recommended using WARC-Protocol to record the TLS protocol version and a new field to record the cipher.
Questions
Should the field be named WARC-Cipher-Suite to future proof for other uses beyond TLS? The WARC-Protocol field defines what protocol is used (FTP, TLS, or even a successor). This cipher suite field is an additional/optional field, applicable only when used with a WARC-Protocol value that supports encryption, recording what cipher suite was used. Baking "TLS" into the field name may cause a problem in the future. (I can't help but think of software and standards that still use the "SSL Certificate" or "SSL connection" terminology 🤮)
Edited 2023-12-19 by @ato Renamed from WARC-TLS-Cipher-Suite to WARC-Cipher-Suite as implemented by @Arkiver2 in Wget-AT and agreed to by @acidus99