Fedict/dcattools

DCAT tools

Various DCAT tools for harvesting metadata from Belgian open data portals, converting metadata to DCAT-AP files and updating the Belgian data.gov.be portal.

The portal itself is a Drupal 9 website, based on Fedict's Openfed distribution.

Data

Only interested in the result? The N-Triples and XML files (DCAT-AP) used to update data.gov.be can be found in the dcat repository.

Overview of the tools

Components

Requirements

These tools require a Java runtime, version 17 or newer. They are command-line tools without a GUI, so they can run on a headless machine.

An internet connection is required, although a proxy can be used.
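As an illustration of what going through a proxy looks like in plain Java 17, here is a minimal sketch using the JDK's built-in `java.net.http.HttpClient`. It is not taken from the dcattools code base: the class name `ProxyClientSketch`, the method `viaProxy` and the default host and port are hypothetical, and the way the tools themselves accept proxy settings should be checked in their own documentation.

```java
import java.net.InetSocketAddress;
import java.net.ProxySelector;
import java.net.http.HttpClient;
import java.time.Duration;

// Hypothetical example class, not part of dcattools.
final class ProxyClientSketch {

    /** Builds an HTTP client that sends all HTTP and HTTPS requests through the given proxy. */
    static HttpClient viaProxy(String host, int port) {
        return HttpClient.newBuilder()
                .proxy(ProxySelector.of(new InetSocketAddress(host, port)))
                .connectTimeout(Duration.ofSeconds(30))
                .build();
    }

    public static void main(String[] args) {
        // Proxy host and port are placeholders; pass real values as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 3128;

        HttpClient client = viaProxy(host, port);
        System.out.println("Proxy configured: " + client.proxy().isPresent());
    }
}
```

The proxy is set per client here, which keeps the setting explicit and local instead of relying on JVM-wide configuration.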

Main parts

  • Helper classes: for storing scraped pages locally, conversion tools etc.
  • Various scrapers: getting metadata from various repositories and websites, and turning the metadata into DCAT files
  • The scrapers also include a series of SPARQL scripts that turn DCAT into DCAT-AP, e.g. by mapping site-specific themes, adding missing properties and preparing the files for updating data.gov.be
  • Data.gov.be updater: updates the data.gov.be website (a Drupal site) using the enhanced DCAT files
  • Some tools: link checker, EDP converter tool

There is also a separate, stand-alone RDF validator project, which can be used to validate DCAT metadata regardless of whether the metadata is to be published on data.gov.be.

Steps for updating data.gov.be and the EU Data Portal

  • Harvest the various portals using the scrapers (every one except all, which is only used for merging in a later step).
  • Upload the enhanced files to the data.gov.be portal using the updater.
  • Run the all enhancer to merge the files from the various portals into a single file, datagovbe.nt.
  • Convert the merged file with the EDP tool to an XML file called datagovbe_edp.xml.
  • Upload both datagovbe.nt and datagovbe_edp.xml to GitHub.
  • These two files are used as input for the European Data Portal (scheduled weekly, on Thursday morning).
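To make the merge step more concrete: N-Triples is a line-based format with one triple per line, so several files can in principle be combined line by line. The sketch below shows that idea with the standard library only. It is not the actual all enhancer, which may do more, and the class name `NTriplesMergeSketch` and its `merge` method are hypothetical. It also assumes the inputs do not reuse blank node labels across files, since a naive merge would conflate them.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical example class, not part of dcattools.
final class NTriplesMergeSketch {

    /**
     * Merges N-Triples files into one output file, skipping blank lines,
     * comment lines and exact duplicate lines.
     *
     * @return the number of distinct triple lines written
     */
    static int merge(List<Path> inputs, Path output) throws IOException {
        // LinkedHashSet removes duplicates while keeping first-seen order.
        Set<String> triples = new LinkedHashSet<>();
        for (Path input : inputs) {
            for (String line : Files.readAllLines(input, StandardCharsets.UTF_8)) {
                String trimmed = line.strip();
                if (!trimmed.isEmpty() && !trimmed.startsWith("#")) {
                    triples.add(trimmed);
                }
            }
        }
        Files.write(output, triples, StandardCharsets.UTF_8);
        return triples.size();
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("Usage: NTriplesMergeSketch <output.nt> <input.nt>...");
            return;
        }
        // First argument is the output file, the rest are input files.
        List<Path> inputs = new ArrayList<>();
        for (int i = 1; i < args.length; i++) {
            inputs.add(Path.of(args[i]));
        }
        int count = merge(inputs, Path.of(args[0]));
        System.out.println("Wrote " + count + " triples to " + args[0]);
    }
}
```

Duplicates are detected by exact string comparison only, so two lines that encode the same triple with different whitespace or escaping would both be kept. A real RDF library would be the safer choice for production use.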

See also the Notes