Capture the schema of the underlying Dataset #4

AlasdairGray · 2023-04-27T12:32:47Z

The schema of a Dataset helps a technical acquirer to understand and assess the data.

Extend the exchange model to capture the schema of the Dataset.

DCAT recommends using the dct:conformsTo property for capturing the schema of the Dataset; see §6.4.2 of DCATv3.

AlasdairGray · 2023-08-16T09:58:48Z

DCAT does not give any guidance on how to capture the schema of the underlying dataset. We will need to support a wide variety of dataset formats, including CSV, JSON, GeoJSON, and XML.

For applications such as the Data Marketplace to exploit schema-level information, it would be beneficial to have an agreed approach, although the approach is likely to differ depending on the dataset media type.

For tabular data, there is a government recommendation to share the data as CSV and a further recommendation to use CSVW (CSV on the Web) to capture the metadata. CSVW is a recommendation for describing CSV files and is capable of modelling the column headings and the relationships between them. This would allow CSVW processing tools to be used to manipulate the metadata.
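To make the CSVW option concrete, here is a minimal sketch of what a CSVW metadata document for a single CSV file could look like, built with Python's standard library only. The file URL, the column names, and the helper `build_csvw_metadata` are all hypothetical and chosen purely for illustration; only the `@context`, `url`, `tableSchema`, `columns`, and `primaryKey` keys come from the CSVW vocabulary.

```python
import json

# Context IRI defined by the CSVW vocabulary.
CSVW_CONTEXT = "http://www.w3.org/ns/csvw"


def build_csvw_metadata(csv_url, columns, primary_key=None):
    """Return a CSVW table description as a plain dict.

    `columns` is a list of (name, title, datatype) tuples.
    This helper is illustrative, not part of any CSVW library.
    """
    table_schema = {
        "columns": [
            {"name": name, "titles": title, "datatype": datatype}
            for name, title, datatype in columns
        ]
    }
    if primary_key is not None:
        table_schema["primaryKey"] = primary_key
    return {
        "@context": CSVW_CONTEXT,
        "url": csv_url,
        "tableSchema": table_schema,
    }


# Hypothetical dataset: a CSV of monitoring stations.
metadata = build_csvw_metadata(
    "https://example.org/data/stations.csv",
    [
        ("station_id", "Station ID", "integer"),
        ("name", "Station name", "string"),
        ("latitude", "Latitude", "decimal"),
    ],
    primary_key="station_id",
)

# Serialise to JSON, ready to publish alongside the CSV file.
print(json.dumps(metadata, indent=2))
```

Because the result is ordinary JSON, a catalogue could store or link to it without any CSVW-specific tooling, while still leaving it usable by CSVW processors.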

For XML and JSON, XML Schema and JSON Schema already exist, respectively. A schema can be published on the web and the dct:conformsTo property could link to the file (or we could investigate embedding it within the metadata). The schema information can then be processed using standard tooling available in multiple languages. This approach also means that the metadata publisher will not need to do additional modelling for their schema-level metadata.
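As a rough illustration of the linking option, the sketch below builds a JSON-LD description of a dataset whose dct:conformsTo points at a published JSON Schema, and then reads the link back out. The dataset and schema IRIs and the helpers `describe_dataset` and `schema_links` are assumptions made for this example; whether the property belongs on the Dataset or on each Distribution is left open here.

```python
import json

# Namespace prefixes for the DCAT and Dublin Core Terms vocabularies.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def describe_dataset(dataset_iri, title, schema_iri):
    """Return a JSON-LD dict for a dataset that links to its schema.

    Illustrative helper only; the IRIs passed in are hypothetical.
    """
    return {
        "@context": CONTEXT,
        "@id": dataset_iri,
        "@type": "dcat:Dataset",
        "dct:title": title,
        # Link to the published schema file rather than embedding it.
        "dct:conformsTo": {"@id": schema_iri},
    }


def schema_links(description):
    """Return the schema IRIs referenced by dct:conformsTo, as a list."""
    value = description.get("dct:conformsTo", [])
    # JSON-LD allows either a single node object or a list of them.
    nodes = value if isinstance(value, list) else [value]
    return [node["@id"] for node in nodes]


description = describe_dataset(
    "https://example.org/dataset/stations",
    "Monitoring stations",
    "https://example.org/schemas/stations.schema.json",
)

print(json.dumps(description, indent=2))
print(schema_links(description))
```

A consuming application could fetch each IRI returned by `schema_links` and hand it to an off-the-shelf JSON Schema or XML Schema processor, which is the main attraction of linking over bespoke modelling.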

AlasdairGray added Model MIWG labels Aug 9, 2023

RobNicholsGDS added new requirement discussion General discussion points labels Nov 8, 2023
