HTML Writer: add a way to decide how data- attributes are written #11212

@Delapouite

Hi

Recently I was converting this kind of Org-Mode document to HTML:

Org input:

* Heading
:properties:
:foo: bar
:manifest: qux
:end:

Current HTML output:

<h1 data-foo="bar" manifest="qux" id="heading">Heading</h1>

Software further down my data-processing pipeline expected a data-manifest attribute and failed when it was missing. The reason is that manifest belongs to the list of html5Attributes (which I was not aware of), so it is written without the data- prefix.

It's not the first time this "surprise" has happened to my documents (I think last time it was a "collision" on the property name size).

The problem is that it's hard (or even impossible, when you're not in control of the original document) to anticipate which properties of an Org-Mode drawer belong to the html5Attributes list and will therefore be written without the data- prefix. What is written in Org drawers usually belongs to the business semantics of the document's main topic, and most of the time it should not affect what a web browser automatically does with the resulting element.

I would really appreciate an option that gives more control over this behavior, something like:

  • "default" - keep the current behavior
  • "never" - never write a bare attribute: turn all attributes into data- attributes, which would produce this result from my sample above:
<h1 data-foo="bar" data-manifest="qux" id="heading">Heading</h1>
  • "both" - turn all attributes into data- attributes and also keep the legit HTML5 ones, which would produce something like:
<h1 data-foo="bar" data-manifest="qux" manifest="qux" id="heading">Heading</h1>
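To make the three proposed modes concrete, here is a minimal sketch in Python (not Haskell, and not pandoc's actual code). Everything in it is hypothetical: the names write_attributes, render and KNOWN_HTML5_ATTRIBUTES, the mode strings, and the tiny attribute set that stands in for the much longer html5Attributes list. It only illustrates the outputs requested above, assuming the element id is handled separately from the drawer properties.

```python
from html import escape

# Hypothetical stand-in for pandoc's html5Attributes list (the real one is much longer).
KNOWN_HTML5_ATTRIBUTES = {"manifest", "size", "title", "lang"}


def write_attributes(pairs, mode="default"):
    """Map (key, value) properties to HTML attribute pairs under a given mode.

    mode is one of the three values proposed in this issue:
    "default", "never" or "both" (hypothetical option values).
    """
    if mode not in ("default", "never", "both"):
        raise ValueError(f"unknown mode: {mode!r}")
    out = []
    for key, value in pairs:
        if key.startswith("data-"):
            # Already prefixed: pass through unchanged in every mode.
            out.append((key, value))
            continue
        is_known = key in KNOWN_HTML5_ATTRIBUTES
        if mode == "default":
            # Behavior described in this issue: known attributes stay bare.
            out.append((key, value) if is_known else ("data-" + key, value))
        elif mode == "never":
            # Never emit a bare attribute.
            out.append(("data-" + key, value))
        else:  # "both"
            out.append(("data-" + key, value))
            if is_known:
                out.append((key, value))
    return out


def render(tag, ident, pairs, text, mode="default"):
    """Render one element; the id is written last, as in the sample output."""
    attrs = write_attributes(pairs, mode) + [("id", ident)]
    rendered = "".join(f' {k}="{escape(v, quote=True)}"' for k, v in attrs)
    return f"<{tag}{rendered}>{escape(text)}</{tag}>"


properties = [("foo", "bar"), ("manifest", "qux")]
for mode in ("default", "never", "both"):
    print(mode, render("h1", "heading", properties, "Heading", mode))
```

With the sample drawer from this issue, the three modes print the current output, the "never" output and the "both" output shown above, in that order.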

Here's the relevant part of the HTML Writer: https://github.com/jgm/pandoc/blob/main/src/Text/Pandoc/Writers/HTML.hs#L689-L696

Thanks!
