Moderator
Requirement Summary
The final CF documentation artifacts (PDF and HTML) should expose clear, consistent, and standards-friendly metadata suitable for indexing, citation, and long-term archival (e.g. Zenodo, libraries, search engines). At present, metadata is incomplete, partially inconsistent, and not fully aligned with current standards or best practices.
This does not affect the normative content of the CF conventions or the conformance documents.
Technical Proposal Summary
Introduce a minimal and consistent metadata model for the final CF artifacts (PDF and HTML), aligned with widely recognised standards (e.g. Dublin Core, modern HTML metadata, PDF 2.0 / ISO 32000), without changing any normative content of the CF specification.
Benefits
- Better discoverability and indexing of CF documentation.
- Cleaner and more accurate metadata for Zenodo and other archives.
- Clear separation between publication metadata and build/tooling metadata.
- Improved consistency between PDF and HTML artifacts.
Status Quo
Currently:
- PDF `Producer` contains incorrect information. According to the PDF standard, `Producer` and `Creator` are intended to describe the software used to create and produce the PDF, not the document authors. At present, authors are incorrectly listed in `Producer`.
- `Creator` reflects tooling, which is appropriate, but is not clearly distinguished from content authorship.
- PDF metadata does not include `Description`, `Keywords`, or explicit CF version information.
- HTML metadata is minimal, tool-generated, and not aligned with standard vocabularies.
- Metadata is not consistently aligned between PDF, HTML, and conformance documents.
- There is no explicit alignment with newer PDF metadata practices (PDF 2.0 / ISO 32000).
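
The PDF problems listed above can be checked mechanically. The following is a minimal sketch, assuming the Info dictionary of a built PDF has already been read into a plain Python dict by some PDF library, and assuming that the Description field corresponds to the standard `Subject` key. The function name `audit_pdf_info`, the sample values, and the author names are hypothetical and are not taken from the CF build.

```python
def audit_pdf_info(info, author_names):
    """Return a list of human-readable problems found in a PDF Info dictionary.

    `info` maps standard Info keys (Title, Author, Subject, Keywords,
    Creator, Producer) to strings; `author_names` lists the document authors.
    """
    problems = []
    # Producer and Creator should name software, never people.
    for key in ("Producer", "Creator"):
        value = info.get(key, "")
        if any(name.lower() in value.lower() for name in author_names):
            problems.append(f"{key} lists document authors: {value!r}")
    # Descriptive fields that this issue reports as missing.
    for key in ("Subject", "Keywords"):
        if not info.get(key, "").strip():
            problems.append(f"{key} is missing or empty")
    return problems


# Hypothetical values illustrating the situation described above.
sample_info = {
    "Title": "Example Conventions",
    "Author": "A. Author, B. Author",
    "Creator": "Example toolchain 1.0",
    "Producer": "A. Author, B. Author",
}
for problem in audit_pdf_info(sample_info, ["A. Author", "B. Author"]):
    print(problem)
```

A check of this kind could run after each build, so that a regression in `Producer` or a missing descriptive field is noticed before an artifact is archived.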
Associated pull request
None at present.
Detailed Proposal
This issue proposes to:
- Define a small, explicit set of metadata fields for CF artifacts:
  - Title
  - CF version
  - Publication date
  - Authors and affiliations
  - Description
  - Keywords
  - Persistent identifier (DOI)
  - Build timestamp (clearly distinct from publication date)
- For PDF artifacts:
  - Correct the semantics of `Producer` and `Creator`, ensuring they identify the software toolchain used to generate the PDF, as intended by the PDF standard.
  - Add missing descriptive metadata (e.g. Description, Keywords).
  - Move towards metadata structures compatible with PDF 2.0 / ISO 32000 where feasible.
- For HTML artifacts:
  - Improve and standardise `<meta>` entries.
  - Align HTML metadata with PDF metadata.
  - Use widely recognised, community-standard metadata conventions.
- Apply the same principles consistently to both the CF specification and the conformance document.
- Evaluate and implement these improvements using the existing Asciidoctor-based toolchain where possible. The tools already in use appear to support most of the required metadata without major plumbing changes, but this should be confirmed carefully to avoid unnecessary complexity or over-engineering.
All changes should be incremental, tooling-compatible, and non-normative.
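
To make the proposed field set concrete, here is a minimal sketch of a single metadata record rendered to both HTML `<meta>` entries and PDF Info entries, so that the two artifacts stay aligned. It is an illustration only, not a proposal for how the Asciidoctor toolchain should be wired. The class `ArtifactMetadata`, the `<meta>` names `cf-version` and `cf-build-timestamp`, and all sample values (including the DOI) are hypothetical. The `DCTERMS.*` names follow the Dublin Core convention for HTML `<meta>` elements, and Description is assumed to map to the PDF `Subject` key.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class ArtifactMetadata:
    """Publication metadata plus build/tooling metadata for one artifact."""
    title: str
    cf_version: str
    publication_date: str  # ISO 8601 date of the release
    authors: list
    description: str
    keywords: list
    doi: str
    build_timestamp: str   # when this file was built; not the publication date
    toolchain: str         # software that generated the artifact

    def html_meta(self):
        """Return HTML head lines using standard names and Dublin Core terms."""
        pairs = [
            ("author", ", ".join(self.authors)),
            ("description", self.description),
            ("keywords", ", ".join(self.keywords)),
            ("DCTERMS.title", self.title),
            ("DCTERMS.issued", self.publication_date),
            ("DCTERMS.identifier", self.doi),
            ("cf-version", self.cf_version),               # hypothetical name
            ("cf-build-timestamp", self.build_timestamp),  # hypothetical name
        ]
        lines = ['<link rel="schema.DCTERMS" href="http://purl.org/dc/terms/">']
        lines += [
            f'<meta name="{name}" content="{escape(value)}">'
            for name, value in pairs
        ]
        return lines

    def pdf_info(self):
        """Return Info entries; tooling appears only in Creator and Producer."""
        return {
            "Title": self.title,
            "Author": ", ".join(self.authors),
            "Subject": self.description,
            "Keywords": ", ".join(self.keywords),
            "Creator": self.toolchain,
            "Producer": self.toolchain,
        }


# Placeholder values only; none of these describe a real CF release.
meta = ArtifactMetadata(
    title="Example Conventions",
    cf_version="0.0-example",
    publication_date="2000-01-01",
    authors=["A. Author", "B. Author"],
    description="Placeholder description & summary.",
    keywords=["metadata", "conventions"],
    doi="10.0000/placeholder",
    build_timestamp="2000-01-02T03:04:05Z",
    toolchain="Example toolchain 1.0",
)
print("\n".join(meta.html_meta()))
print(meta.pdf_info())
```

Keeping one record as the source for both renderings is one way to get the consistency between PDF and HTML that this issue asks for; in practice the same effect may be achievable with shared Asciidoctor document attributes, which should be confirmed against the toolchain.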