This code is written primarily to meet the needs of the Queerlit project, but with adaptation to similar projects in mind.
The thesaurus creation and this code repository are described in:
Matsson, A. and Kriström, O. (2023) “Building and Serving the Queerlit Thesaurus as Linked Open Data”, Digital Humanities in the Nordic and Baltic Countries Publications. Oslo, Norway, 5(1), pp. 29–39. doi: 10.5617/dhnbpub.10648.
- Goal: An RDF/SKOS ontology shall be made available online
- Assumption: The source data is a folder of manually edited Turtle (`.ttl`) files adhering to parts of the model
- Requirement: Validate source data (warn on broken references etc)
- Requirement: Complement source data (mirror relations, fix `topConceptOf`, etc)
- Requirement: HTTP server for full data
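
The validation and complementing requirements above can be illustrated with a minimal sketch. It is not the code in this repository: it uses plain Python tuples as a stand-in for an RDFLib graph, and the function names and `ex:` terms are hypothetical. Skipping broken references when mirroring is a choice made for this sketch only.

```python
# Hypothetical, simplified stand-in for an RDF graph: a set of
# (subject, predicate, object) string triples.
MIRRORS = {
    "skos:broader": "skos:narrower",
    "skos:narrower": "skos:broader",
    "skos:related": "skos:related",
}


def find_broken_references(triples, known_terms):
    """Return relation triples whose object is not a known term."""
    return sorted(
        (s, p, o) for (s, p, o) in triples
        if p in MIRRORS and o not in known_terms
    )


def add_mirror_relations(triples):
    """Return a new set where every relation also exists in reverse."""
    completed = set(triples)
    for s, p, o in triples:
        if p in MIRRORS:
            completed.add((o, MIRRORS[p], s))
    return completed


source = {
    ("ex:a", "skos:broader", "ex:b"),
    ("ex:a", "skos:related", "ex:missing"),
}
known = {"ex:a", "ex:b"}

# Validate: warn on references to terms that do not exist.
broken = find_broken_references(source, known)
for s, p, o in broken:
    print(f"Warning: {s} {p} {o} points to an unknown term")

# Complement: mirror the remaining relations (sketch-only choice).
valid = source - set(broken)
completed = add_mirror_relations(valid)
```
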
This codebase has three parts:
Dependencies can be managed with Conda; see `environment.yml`. The code is built mainly on RDFLib and Flask.
Commit changes to the `dev` branch and make sure to keep `CHANGELOG.md` updated. Data updates with `build.py` do not need to be changelogged.
To release:
1. Update `CHANGELOG.md`:
   - Determine new version number
   - Add a version heading
   - Update link hrefs at the bottom
2. Commit to `dev`
3. Push and check the GitHub Actions page to make sure that tests are passing
4. Merge `dev` into `main`
5. Tag the merge commit with the version number prefixed by `v`
6. Push `main` and the tag
7. Deploy to server
If there are only data changes, skip the changelog and tag steps (1 and 5).
The thesaurus.py module defines what we expect to be doing with the thesaurus in code. It extends the RDFLib Graph class.
The simple.py module redefines this slightly, in order to provide plain-JSON responses for use with the Queerlit GUI.
- Add to the `.env` file, one setting per line: `INDIR="/path/to/ttls"` and `THESAURUSFILE=qlit.nt`
- Run `python3 build.py`
When there are new terms in the source directory, these will be provided with new canonical ids and reported like:

    Creating new identifiers...
    New id gb58ld43 for stockholmareHBTQI
    New id om71eq87 for sånaHBTQI

The new ids are saved to `qlit.nt` but not in the source files, so the next run will generate new ids again. You must manually edit the source files and replace temporary ids with the new canonical ones.
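
To help with the manual replacement, the report can be turned into a lookup table. The following is a minimal sketch, not part of this repository: `parse_new_ids` is a hypothetical helper, and it assumes that the name after `for` in each report line is the temporary id used in the source files.

```python
import re

# Example build output, copied from the report format shown above.
build_output = """\
Creating new identifiers...
New id gb58ld43 for stockholmareHBTQI
New id om71eq87 for sånaHBTQI
"""

# Assumed line shape: "New id <canonical id> for <temporary id>".
NEW_ID_LINE = re.compile(r"^New id (?P<new_id>\S+) for (?P<temp_id>\S+)$")


def parse_new_ids(output):
    """Map each temporary id to the canonical id reported for it."""
    mapping = {}
    for line in output.splitlines():
        match = NEW_ID_LINE.match(line.strip())
        if match:
            mapping[match["temp_id"]] = match["new_id"]
    return mapping


id_map = parse_new_ids(build_output)
for temp_id, new_id in id_map.items():
    print(f"Replace {temp_id} with {new_id} in the source files")
```
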
- Add to the `.env` file, one setting per line: `THESAURUSFILE=qlit.nt` and `FLASK_DEBUG=1`
- Run `flask run` for development. On the server it is run with gunicorn.
See server.py.
| Path | Response |
|---|---|
| `/` | Full RDF data (see Formats below) |
| `/<name>` | RDF data for one term (see Formats below) |
| `/api/term/<name>` | One term as JSON |
| `/api/labels` | Labels for all terms, keyed by identifiers |
| `/api/search?s=<str>` | Terms matching a partial label |
| `/api/collections` | All collections |
| `/api/collections/<name>` | Terms within the collection `<name>` |
| `/api/roots` | All top-level terms |
| `/api/narrower?broader=<name>` | Terms narrower than the term `<name>` |
| `/api/broader?narrower=<name>` | Terms broader than the term `<name>` |
| `/api/related?other=<name>` | Terms related to `<name>` |
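
As an illustration of how a client might address the JSON routes, here is a minimal sketch using only the Python standard library. The helper names are hypothetical, and it assumes the `/api/` routes live under the same base URL as the public RDF routes (`https://queerlit.dh.gu.se/qlit/v1`).

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumption: the /api/ routes share the base URL of the public RDF routes.
BASE_URL = "https://queerlit.dh.gu.se/qlit/v1"


def api_url(route, **params):
    """Build a URL for one of the /api/ routes listed above."""
    url = f"{BASE_URL}/api/{quote(route)}"
    if params:
        url += "?" + urlencode(params)
    return url


def fetch_json(url):
    """Fetch a URL and decode the JSON body (performs a network request)."""
    with urlopen(url) as response:
        return json.load(response)


search_url = api_url("search", s="lesb")
narrower_url = api_url("narrower", broader="qd17qs25")
term_url = api_url("term/qd17qs25")
```

Calling `fetch_json(search_url)` would then return the decoded response, provided the server is reachable at the assumed base URL.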
The response format for the RDF-oriented routes (i.e. not beginning with `/api/`) can be selected with the `Accept` header or the `format` query param:

    curl 'https://queerlit.dh.gu.se/qlit/v1/qd17qs25'
    curl 'https://queerlit.dh.gu.se/qlit/v1/qd17qs25?format=jsonld'
    curl 'https://queerlit.dh.gu.se/qlit/v1/qd17qs25' \
      -H 'Accept: application/rdf+xml'
| `format` param | MIME type |
|---|---|
| `ttl` (default) | `text/turtle` |
| `jsonld` | `application/ld+json` |
| `xml` | `application/rdf+xml` |
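
The selection between these formats could be sketched as follows. This is an illustration only, not the logic in `server.py`: `choose_mimetype` is a hypothetical function, it assumes that an explicit `format` param takes precedence over the `Accept` header, and it ignores quality values in the header.

```python
# Mapping from the table above.
FORMATS = {
    "ttl": "text/turtle",
    "jsonld": "application/ld+json",
    "xml": "application/rdf+xml",
}
DEFAULT_FORMAT = "ttl"


def choose_mimetype(format_param=None, accept_header=None):
    """Pick a response MIME type from the query param or Accept header."""
    # Assumption: an explicit ?format= wins over the Accept header.
    if format_param in FORMATS:
        return FORMATS[format_param]
    if accept_header:
        # Ignore quality values; take the first supported type in order.
        for part in accept_header.split(","):
            mimetype = part.split(";")[0].strip()
            if mimetype in FORMATS.values():
                return mimetype
    # Nothing usable was requested: fall back to Turtle.
    return FORMATS[DEFAULT_FORMAT]


print(choose_mimetype(format_param="jsonld"))
print(choose_mimetype(accept_header="application/rdf+xml"))
```
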