YAML-LD? #389

Closed
VladimirAlexiev opened this issue Apr 1, 2022 · 11 comments

@VladimirAlexiev

VladimirAlexiev commented Apr 1, 2022

I thought I knew JSON-LD.

But then I saw this DOAP example at https://github.com/common-workflow-language/common-workflow-language/wiki/Related-ontologies. Compare to an actual Turtle of a Debian package: https://packages.qa.debian.org/b/bowtie.ttl

"@context":
  "foaf": "http://xmlns.com/foaf/0.1/"
  "doap": "http://usefulinc.com/ns/doap"
  "adms": "http://purl.org/adms/"
  "admssw": "http://purl.org/adms/sw/"

adms:Asset
  admssw:SoftwareProject
    doap:name: "STAR"
    doap:description: >
      Aligns RNA-seq reads to a reference genome using uncompressed suffix arrays.
      STAR has a potential for accurately aligning long (several kilobases) reads that are
      emerging from the third-generation sequencing technologies.
    doap:homepage: "https://github.com/alexdobin/STAR"
    doap:repository:
      - doap:GitRepository:
        doap:location: "https://github.com/alexdobin/STAR.git"
    doap:release:
      - doap:revision: "2.5.0a"
    doap:license: "GPL"
    doap:category: "commandline tool"
    doap:programming-language: "C++"
    foaf:Organization:
      - foaf:name: "Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA"
      - foaf:name: "2Pacific Biosciences, Menlo Park, CA, USA"
    foaf:publications:
      - foaf:title: "(Dobin et al., 2013) STAR: ultrafast universal RNA-seq aligner. Bioinformatics."
        foaf:homepage: "http://www.ncbi.nlm.nih.gov/pubmed/23104886"
    doap:developer:
      - foaf:Person:
        foaf:name: "Alexander Dobin"
        foaf:mbox: "mailto:dobin at cshl.edu"
        foaf:fundedBy: "This work was funded by NHGRI (NIH) grant U54HG004557"
  adms:AssetDistribution
    doap:name: "STAR.cwl"
    doap:specification: "http://common-workflow-language.github.io/draft-3/"
    doap:release: "cwl:draft-3.dev2"
    doap:homepage: "https://github.com/common-workflow-language/workflows/blob/master/tools/STAR.cwl"

And I'm like WHAT MAGIC is this?

  • Ok, a key with empty value is taken to be the rdf:type (@type)
  • But it seems to me that the properties connecting sub-objects are often missing.

@stain @mr-c can you shed some light?

Googled YAML-LD and only saw a brief discussion again by the CWL people: https://lists.w3.org/Archives/Public/public-linked-json/2015Jan/0035.html

The JSON-LD spec says

Although not discussed in this specification, parallel work using YAML could be used to map into the internal representation, allowing the JSON-LD 1.1 API to operate as if the source was a JSON document.

Despite no official support in the spec, the gazillion JSON-LD conformance tests are written in YAML: https://github.com/w3c/json-ld-syntax/tree/main/yaml .

I know that YAML can be trivially converted to JSON and from there to JSON-LD. But as per the above discussion, it would be nice to get rid of the need to write those pesky @.
@gkellogg is there some convention for that, or is the CWL example not widely adopted?

@pchampin
Contributor

pchampin commented Apr 4, 2022

I concur that this example from CWL is really strange, in many respects:

  • first of all, it is not legal YAML (colons are missing after adms:Asset, admssw:SoftwareProject and adms:AssetDistribution)
  • then, converted to JSON and interpreted as JSON-LD, it produces a graph that is probably not the expected result:
    • things that look like class names (adms:Asset, admssw:SoftwareProject and adms:AssetDistribution) end up as predicates
    • some of them (doap:GitRepository) are even ignored, because of their empty value (no, it does not magically convert them to types)

It looks like some part of the context was forgotten. For example, adding

      "adms:Asset": {"@container": "@type"}

turns admssw:SoftwareProject and adms:AssetDistribution into the types of the corresponding nodes, which seems to make sense. But the empty keys for types are a mystery to me.
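
To illustrate the effect of such a type map, here is a minimal Python sketch. It is not a JSON-LD processor: it expands no IRIs and reads no context, and the helper name apply_type_map is hypothetical. It only mimics what "@container": "@type" does to the value of adms:Asset, assuming the missing colons have been restored so that the YAML parses into nested maps.

```python
import json


def apply_type_map(type_map):
    """Mimic a JSON-LD type map: each key becomes the @type of its node(s).

    Illustration only; a real JSON-LD processor also expands IRIs
    and applies the rest of the context.
    """
    nodes = []
    for type_name, entry in type_map.items():
        # A type-map entry may hold one node or a list of nodes.
        entries = entry if isinstance(entry, list) else [entry]
        for node in entries:
            existing = node.get("@type", [])
            if isinstance(existing, str):
                existing = [existing]
            typed = dict(node)
            # The map key is added in front of any types the node already has.
            typed["@type"] = [type_name] + list(existing)
            nodes.append(typed)
    return nodes


# A trimmed-down version of the CWL snippet, as it would look after parsing.
document = {
    "adms:Asset": {
        "admssw:SoftwareProject": {"doap:name": "STAR"},
        "adms:AssetDistribution": {"doap:name": "STAR.cwl"},
    }
}

nodes = apply_type_map(document["adms:Asset"])
print(json.dumps(nodes, indent=2))
```

Both nodes remain values of adms:Asset, but the former keys now appear as their @type instead of being read as predicates.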

@mr-c

mr-c commented Apr 4, 2022

Hello all. That snippet is not part of the CWL recommended practices (the CWL standards recommend using schema.org annotations). It looks like @portah authored it, so I'll leave it to him to explain.

The CWL standards themselves are defined in the schema-salad language, created by @tetron in 2015 to be a sort of YAML-LD https://github.com/common-workflow-language/schema_salad/

@anatoly-scherbakov

I have been using something that I call YAML-LD for a while.

  • To get rid of the @, I replace it with $, which avoids the need for quotes.
  • I am heavily relying on JSON-LD contexts to reduce the size of the files.
  • I am also heavily relying on OWL RL to skip the properties which can be inferred.
  • This is a matter of taste, but I like to use Unicode characters to convey meaning in YAML-LD files.

In this example, I am defining rdfs:range and rdfs:domain properties for a handful of terms from an external ontology.

$context:
  skill: https://schema.jsonldresume.org/#
  schema: https://schema.org/
  rdfs: http://www.w3.org/2000/01/rdf-schema#
  :
    $id: rdfs:domain
    $type: $id
  :
    $id: rdfs:range
    $type: $id
  links:
    $id: skills:links
    $container: $id
links:
  skill:assesses:
    : skill:Award
    : schema:Text
  skill:award:
    : skill:Resume
    : skill:Award
  ...

Purpose: I use YAML-LD to write data files, which are used to build a semantic graph, which in turn is used to build the pages of a static website. Example: source files directory, resulting page.

The code is essentially a plugin for the MkDocs site generator; it is open source and available on GitHub.

@VladimirAlexiev
Author

@msporny @gkellogg @pchampin @mr-c @anatoly-scherbakov @ericprud

Are there enough people who'd like to work out a standardization of YAML-LD?

YAML is nicer than JSON, yet more powerful. I think that standardization of YAML-LD can be very useful.

Also cc @andimou , @pheyvaer re YARRML.

Manu, Gregg, and P-A, what's the best way to start this?

@gkellogg
Member

gkellogg commented May 15, 2022

There is certainly space to do it within the JSON-LD CG, and there has been some interest over time, but not enough to get anyone to work on it seriously. If you'd like to do so, the best place to raise issues and make progress would be the json-ld.org repository.

Ultimately, these would need to be in their own repository (perhaps https://github.com/json-ld/yaml-ld) which can use the various infrastructure for rendering specifications and hosting test suites.

Note that the premise most of us have been working against is the basic YAML serialization of JSON-LD, and as you've likely noted, the -syntax and -api repos have all of the examples in the spec automatically transformed into hypothetical yaml-ld. But, there are a number of YAML-specific details that likely need to be attended to.

Note that the JSON-LD specs generally call for parsing into an intermediate representation. I don't think we contemplated replacing @ for $ in keywords, and I'm not sure this would achieve consensus, but it could reasonably be contained to a YAML-LD parsing layer.
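
A minimal sketch of such a parsing layer in Python, under stated assumptions: the YAML has already been parsed into plain dicts and lists (the parser itself is left out), the $-for-@ convention is the informal one described earlier in this thread and not a standard, and the keyword set, the function name dollar_to_at, and the example.org data are all hypothetical.

```python
# Hypothetical, partial list of JSON-LD keywords (without the leading @).
JSONLD_KEYWORDS = {
    "base", "container", "context", "graph", "id", "included", "index",
    "json", "language", "list", "none", "reverse", "set", "type",
    "value", "vocab",
}


def dollar_to_at(data):
    """Recursively rewrite $keyword to @keyword in map keys and string values."""
    if isinstance(data, dict):
        return {dollar_to_at(key): dollar_to_at(value) for key, value in data.items()}
    if isinstance(data, list):
        return [dollar_to_at(item) for item in data]
    if isinstance(data, str) and data.startswith("$") and data[1:] in JSONLD_KEYWORDS:
        return "@" + data[1:]
    # Anything else, including strings such as "$5", is left untouched.
    return data


# Stand-in for the output of a YAML parser (assumed, not shown here).
parsed = {
    "$context": {
        "schema": "https://schema.org/",
        "knows": {"$id": "schema:knows", "$type": "$id"},
    },
    "$id": "https://example.org/alice",
    "knows": "https://example.org/bob",
    "price": "$5",
}

converted = dollar_to_at(parsed)
```

The result can then be handed to any JSON-LD processor as if it had come from a JSON document. Restricting the rewrite to known keywords keeps ordinary strings that happen to start with $ intact, although a value that is literally a keyword name with a $ prefix would still be rewritten.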

If you like, I can move this issue to json-ld.org, and we can poll to see if there's enough interest to work on this in the CG. If so, I can facilitate creating the repository, and that repo would be a good place for discussions, along with [email protected].

Edit: replaced the erroneous https://github.com/json-ld/csv-ld with https://github.com/json-ld/yaml-ld.

@anatoly-scherbakov

I would propose the following grounds for replacing @ with $: while JSON is machine readable and writable, it is not very human readable and, especially, not very human writable. YAML is much friendlier in that regard because of its much lower syntactic noise. Replacing these characters reduces that noise further, which makes the data files faster to type.

In general, I'd name manually writable semantic data the main purpose for YAML-LD.

I will be happy to participate in the standardization process if one is to be initiated, and to assist however I can.

@VladimirAlexiev
Author

@gkellogg thanks for your support! Yes, please move the issue and post a poll.
I can help in the standardization process.

need to be in their own repository (perhaps https://github.com/json-ld/csv-ld)

Is that a lapsus linguae and you meant yaml-ld?

More importantly, is there a csv-ld separate from the very excellent CSVW?

@tetron

tetron commented May 16, 2022

@VladimirAlexiev I made a couple updates to the original page to fix syntax errors:

https://github.com/common-workflow-language/common-workflow-language/wiki/Related-ontologies

Here's some more information about Schema Salad:

https://www.commonwl.org/v1.2/SchemaSalad.html

JSON-LD is a W3C standard providing a way to describe how to interpret a JSON document as Linked Data by means of a "context". JSON-LD provides a powerful solution for representing object references and namespaces in JSON based on standard web URIs, but is not itself a schema language. Without a schema providing a well defined structure, it is difficult to process an arbitrary JSON-LD document as idiomatic JSON because there are many ways to express the same data that are logically equivalent but structurally distinct.

Several schema languages exist for describing and validating JSON data, such as the Apache Avro data serialization system, however none understand linked data. As a result, to fully take advantage of JSON-LD to build the next generation of linked data applications, one must maintain separate JSON schema, JSON-LD context, RDF schema, and human documentation, despite significant overlap of content and obvious need for these documents to stay synchronized.

Schema Salad is designed to address this gap. It provides a schema language and processing rules for describing structured JSON content permitting URI resolution and strict document validation. The schema language supports linked data through annotations that describe the linked data interpretation of the content, enables generation of JSON-LD context and RDF schema, and production of RDF triples by applying the JSON-LD context. The schema language also provides for robust support of inline documentation.

Schema Salad is more complex than a 1:1 mapping of JSON-LD elements to YAML, but might be useful to people.

@gkellogg
Member

@gkellogg thanks for your support! Yes, please move the issue and post a poll. I can help in the standardization process.

need to be in their own repository (perhaps https://github.com/json-ld/csv-ld)

Is that a lapsus linguae and you meant yaml-ld?

More importantly, is there a csv-ld separate from the very excellent CSVW?

Yes, indeed. CSV-LD was an early proposal for what became CSV on the Web. We wrote up something about it here a long time ago. I do think that CSVW needs to be revisited, and there is a nascent CG for it, but it hasn't reached critical mass yet.

@gkellogg
Member

I wasn't able to transfer this issue, but created json-ld/yaml-ld#3 for further discussion. Please redirect future discussion there, and provide support for creating new CG work.

@w3c w3c locked as resolved and limited conversation to collaborators May 16, 2022
@gkellogg
Member

A repository has been created to push forward work in the CG: https://github.com/json-ld/yaml-ld.
