
Handling JSON-LD with duplicate names #459


Open
phochste opened this issue May 6, 2025 · 5 comments

Comments

@phochste

phochste commented May 6, 2025

I was recently reading this blog post: https://alexwlchan.net/2025/duplicate-names-in-json/ which points out that the JSON syntax does not require name strings to be unique, although such semantics may be defined by other specifications that use JSON for data exchange.

As far as I know, the JSON-LD specs say little about the handling of duplicate names.

Can the editor clarify how duplicated names should be handled?

{
  "@context": "http://schema.org/",
  "@type": "Person",
  "name": "Jane Doe",
  "jobTitle": "Professor",
  "jobTitle": "Dean",
  "url": "http://www.janedoe.com"
}
@davidlehn
Contributor

I'm pretty sure issues like that are considered to be out of scope. JSON-LD doesn't define how JSON is parsed. I suspect almost every JSON parser will pick one value or, in rare cases, throw an error by default. For systems where it's a concern, I'd suggest using a parser that fails on such constructs.
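To illustrate the "parser that fails on such constructs" suggestion, here is a minimal sketch assuming Python's standard `json` module. By default `json.loads` keeps the last value for a repeated name; an `object_pairs_hook` can turn duplicates into an error instead. The helper names `reject_duplicates` and `strict_loads` are hypothetical and not part of any JSON-LD library.

```python
import json

DOC = '''{
  "@context": "http://schema.org/",
  "@type": "Person",
  "name": "Jane Doe",
  "jobTitle": "Professor",
  "jobTitle": "Dean",
  "url": "http://www.janedoe.com"
}'''


def reject_duplicates(pairs):
    """object_pairs_hook that raises when one object repeats a name."""
    obj = {}
    for name, value in pairs:
        if name in obj:
            raise ValueError(f"duplicate name in JSON object: {name!r}")
        obj[name] = value
    return obj


def strict_loads(text):
    """Parse JSON, failing on duplicate names (hypothetical helper)."""
    return json.loads(text, object_pairs_hook=reject_duplicates)


# Default behaviour of Python's json module: the last value wins.
lenient = json.loads(DOC)
print(lenient["jobTitle"])

# The strict variant refuses the same document.
try:
    strict_loads(DOC)
except ValueError as err:
    print("rejected:", err)
```

Because the hook is called for every object in the document, nested objects are checked as well.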

Was there an expectation that JSON-LD in particular would behave differently from all other JSON-based systems?

@phochste
Author

phochste commented May 7, 2025

I don't think it is purely a parsing problem but an interpretation problem. As I understand it, the JSON spec does not require JSON processors to handle non-unique names in a particular way. Of course, there exists a de facto way, based on what the majority of processors do.

Without some clarification this JSON-LD:

{
  "@context": "http://schema.org/",
  "@type": "Person",
  "name": "Jane Doe",
  "jobTitle": "Professor",
  "jobTitle": "Dean",
  "url": "http://www.janedoe.com"
}

could potentially be parsed by different JSON processors and lead to different data models:

_:x a schema:Person ;
    schema:name "Jane Doe" ;
    schema:jobTitle "Professor" , "Dean" ;
    schema:url <http://www.janedoe.com/> .

or

_:x a schema:Person ;
    schema:name "Jane Doe" ;
    schema:jobTitle "Professor" ;
    schema:url <http://www.janedoe.com/> .

or

_:x a schema:Person ;
    schema:name "Jane Doe" ;
    schema:jobTitle "Dean" ;
    schema:url <http://www.janedoe.com/> .

or

(error, no data model)

Which one you get depends on the parsing method. Is that intentionally the case for the JSON-LD spec?
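The divergence can be reproduced with a single parser by swapping the duplicate-handling policy. This is only a sketch, assuming Python's standard `json` module and a shortened version of the document above; the hook names `first_wins`, `last_wins`, and `collect_all` are hypothetical and each stands in for a different JSON processor.

```python
import json

DOC = '{"name": "Jane Doe", "jobTitle": "Professor", "jobTitle": "Dean"}'


def first_wins(pairs):
    """Keep the first value seen for each name."""
    obj = {}
    for name, value in pairs:
        obj.setdefault(name, value)
    return obj


def last_wins(pairs):
    """Keep the last value seen for each name (Python's default behaviour)."""
    return dict(pairs)


def collect_all(pairs):
    """Keep every value; a repeated name becomes a list of its values."""
    obj = {}
    repeated = set()
    for name, value in pairs:
        if name not in obj:
            obj[name] = value
        elif name in repeated:
            obj[name].append(value)
        else:
            obj[name] = [obj[name], value]
            repeated.add(name)
    return obj


# Same bytes, three different values for "jobTitle".
results = {
    hook.__name__: json.loads(DOC, object_pairs_hook=hook)["jobTitle"]
    for hook in (first_wins, last_wins, collect_all)
}
for policy, job_title in results.items():
    print(policy, job_title)
```

Each policy yields a different value for `jobTitle` before any JSON-LD processing starts, which is why the resulting graphs differ.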

My expectation was not that all four interpretations are valid ways to process JSON-LD.

I write this in the context of ODRL processing. There I would like only one way to interpret the JSON-LD, regardless of the underlying JSON processor.

@phochste
Author

phochste commented May 7, 2025

I read in https://www.w3.org/TR/json-ld11-api/#terminology:

Terms imported from ECMAScript Language Specification [ECMASCRIPT], The JavaScript Object Notation (JSON) Data Interchange Format [RFC8259], Infra Standard [INFRA], and Web IDL [WEBIDL]

And a bit further in the definition of JSON object:

"In JSON-LD the names in an object must be unique."

However, the JSON-LD definition references RFC 8259, where object names SHOULD be unique (non-unique names are technically valid but discouraged).

Is this "must" in the spec normative? And if so, MUST my previous example be a parsing error?

@TallTed
Member

TallTed commented May 7, 2025

"In JSON-LD the names in an object must be unique."

That's problematic. RDF does not place such restrictions — any given subject could have multiple objects for the same predicate, which are not considered to conflict but rather to combine.

How are these to be handled when serializing RDF as JSON-LD?

@dlongley
Contributor

dlongley commented May 7, 2025

@TallTed,

"In JSON-LD the names in an object must be unique."

That's problematic. RDF does not place such restrictions — any given subject could have multiple objects for the same predicate, which are not considered to conflict but rather to combine.

How are these to be handled when serializing RDF as JSON-LD?

In JSON-LD, multiple objects for the same predicate are expressed as elements in an array (and the array is the value of a single JSON name, aka "JSON key").
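As a rough sketch of that shape, assuming Python and a hypothetical helper `group_values` (this is not the JSON-LD serialization algorithm, only an illustration of the resulting structure), the two `jobTitle` values end up under one name:

```python
import json


def group_values(pairs):
    """Group (name, value) pairs so each name appears once.

    A name with several values maps to a list; a name with one value
    maps to that value directly.
    """
    grouped = {}
    for name, value in pairs:
        grouped.setdefault(name, []).append(value)
    return {
        name: values[0] if len(values) == 1 else values
        for name, values in grouped.items()
    }


# Statements about one subject, with two objects for the same predicate.
statements = [
    ("name", "Jane Doe"),
    ("jobTitle", "Professor"),
    ("jobTitle", "Dean"),
]

node = {
    "@context": "http://schema.org/",
    "@type": "Person",
    **group_values(statements),
}
print(json.dumps(node, indent=2))
```

The output contains a single `"jobTitle"` name whose value is an array, so every JSON parser reads it the same way.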

I presume that JSON that contains names that are not unique really only continues to be supported for historical or backwards compatibility purposes. It is not interoperable JSON, see:

https://datatracker.ietf.org/doc/html/rfc8259#section-4

In other words, if you use that kind of JSON, you get no guarantee of interoperability -- and you may do as you please in your own closed world application.

While I think the current JSON-LD algorithms all operate on an abstract expression of JSON (i.e., the syntax/concrete parsing itself is external), I have wondered in the past if it would be better for JSON-LD to reference "I-JSON" as its basis so that these sorts of corner case questions can be resolved by just pointing to that spec. That spec handles non-interoperable JSON issues such as the one in this issue (and a number of others) by defining a stricter profile:

https://datatracker.ietf.org/doc/html/rfc7493#section-2.3

In practice, I expect very few implementers to use anything other than the built-in JSON parsers in whatever platform they are using -- which is therefore the only sensible basis for wide-scale interoperability.
