open-kgo

Open Knowledge Graphs and Ontologies plugin for mloda: nine connector families covering the knowledge-graph landscape, from SPARQL endpoints to SBOMs to agent memory, all behind one declarative Feature interface. Every connector and demo runs offline against in-memory libraries or committed fixtures. No Docker, no network.

At a glance

| Section | What you'll find |
| --- | --- |
| Quickstart | Run a SPARQL query against a shipped sample file in under a minute |
| The nine connector families | The core of this repo: a 9-family KG connector taxonomy with two plugins each |
| Demos | Three marimo notebooks and two evaluation harnesses, all offline |
| Data and acknowledgments | Where the sample data comes from |
| Development setup | uv, tox, and the individual checks |
| Related repositories and documentation | mloda core, the plugin registry, and development guides |

Quickstart

Install the connectors and run a SPARQL query against the Turtle sample shipped in this repo:

```bash
uv sync --extra kg-all
```

```python
from pathlib import Path

from mloda.user import DataAccessCollection, Feature, Options, mloda

import open_kgo.feature_groups.kg.rdf.rdflib_sparql as rdf_mod
from open_kgo.compute_frameworks.python_dict_kg_framework import KgPythonDictFramework

# Point at any RDF file. Here: the Turtle sample shipped in this repo.
ttl = Path(rdf_mod.__file__).parent / "tests" / "fixtures" / "sample.ttl"

feature = Feature(
    "rdflib_sparql__knows",
    options=Options(context={
        "query_text": "PREFIX foaf: <http://xmlns.com/foaf/0.1/> "
                      "SELECT ?s ?o WHERE { ?s foaf:knows ?o }",
    }),
)

partitions = mloda.run_all(
    [feature],
    compute_frameworks={KgPythonDictFramework},
    data_access_collection=DataAccessCollection(
        credentials=[{"rdflib_sparql": {"locator": str(ttl), "result_limit": 100}}],
    ),
)

for partition in partitions:
    for row in partition:
        print(row[feature.name])
```

Swap rdflib_sparql for a plugin from any of the nine connector families below: the call shape, from Feature to mloda.run_all, stays the same and only the reader changes.

The nine connector families

open_kgo/feature_groups/kg/ ships a connector taxonomy derived from a 103-system survey. Each family is a shared reader and feature-group base plus two concrete plugins running against in-memory libraries or local file fixtures:

| Family | What it connects to | Concrete plugins |
| --- | --- | --- |
| `network_pg` | Property-graph databases with a vendor query language (Neo4j, Memgraph, Neptune, ...) | `KuzuCypherReader`, `GrandCypherReader` |
| `rdf` | RDF triple stores queried with SPARQL | `RdfLibSparqlReader`, `OxigraphSparqlReader` |
| `embedded` | In-process graph libraries with no network endpoint | `NetworkxEmbeddedReader`, `IGraphEmbeddedReader` |
| `rest_public` | Public REST (non-SPARQL) KG APIs (OpenAlex, ConceptNet, STRING, ...) | `FileFixtureRestReader`, `FileFixturePagedRestReader` |
| `lineage` | Metadata and data-lineage graphs (dbt, OpenLineage, DataHub, ...) | `DbtManifestReader`, `OpenLineageReader` |
| `code_build` | Code, build, and SBOM dependency graphs (CycloneDX, SPDX, ...) | `CycloneDxSbomReader`, `SpdxSbomReader` |
| `saas_authz` | SaaS and authorization tuple stores (OpenFGA, SpiceDB, Microsoft Graph, ...) | `InProcessTupleStoreReader`, `PaginatedTupleStoreReader` |
| `agent_memory` | LLM agent memory and GraphRAG graphs (Letta, Zep, Mem0, ...) | `NetworkxMemoryReader`, `GraphWalkMemoryReader` |
| `citation_rest` | Citation and scientific REST APIs (Reactome, OpenAlex citations, ...) | `FileFixtureCitationReader`, `PaginatedCitationReader` |

See open_kgo/feature_groups/kg/README.md for the full family map, the plugin anatomy, and what the prototype does and does not validate.
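The "shared base plus two concrete plugins" layout can be pictured with a small, self-contained sketch. Everything in it is hypothetical: `BaseToyReader`, `AdjacencyDictReader`, and `EdgeListReader` are illustrative names, not classes from open_kgo, and the real family bases presumably build on mloda's own reader and feature-group abstractions rather than a plain `ABC`.

```python
from abc import ABC, abstractmethod
from typing import Any


class BaseToyReader(ABC):
    """Hypothetical family base: owns the shared row shaping and the result limit."""

    def load(self, feature_name: str, locator: Any, result_limit: int = 100) -> list[dict[str, Any]]:
        # Shared logic lives here once; plugins only say how to read edges.
        edges = self.read_edges(locator)
        rows = [{feature_name: {"source": s, "target": t}} for s, t in edges]
        return rows[:result_limit]

    @abstractmethod
    def read_edges(self, locator: Any) -> list[tuple[str, str]]:
        """Return (source, target) pairs from a backend-specific locator."""


class AdjacencyDictReader(BaseToyReader):
    """Hypothetical plugin 1: the graph is an adjacency dict."""

    def read_edges(self, locator: dict[str, list[str]]) -> list[tuple[str, str]]:
        return [(src, dst) for src, targets in locator.items() for dst in targets]


class EdgeListReader(BaseToyReader):
    """Hypothetical plugin 2: the graph is already a list of edges."""

    def read_edges(self, locator: list[tuple[str, str]]) -> list[tuple[str, str]]:
        return list(locator)


# Two different backends, one row shape.
adjacency_rows = AdjacencyDictReader().load("toy__edges", {"alice": ["bob", "carol"]})
edge_list_rows = EdgeListReader().load("toy__edges", [("alice", "bob"), ("alice", "carol")])
print(adjacency_rows == edge_list_rows)
```

The point of the split is that callers see one row shape per family, while each plugin only has to translate its backend into that shape.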

Install all KG extras with: uv sync --extra kg-all.

One feature per call. KG readers dispatch a single feature per load: every reader rejects a multi-feature FeatureSet rather than silently labelling all rows with one feature name. Request features individually (one Feature per mloda.run_all slot) rather than batching N of them into a single reader call.
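A toy illustration of that rule, under loud assumptions: `SingleFeatureReader` and the `toy__*` feature names below are hypothetical and do not mirror the actual open_kgo reader classes or mloda's FeatureSet type. The sketch only shows the behaviour described above, which is to fail fast on a multi-feature request and have the caller issue one request per feature.

```python
class SingleFeatureReader:
    """Hypothetical reader that serves exactly one feature per load."""

    def __init__(self, triples: list[tuple[str, str, str]]) -> None:
        self._triples = list(triples)

    def load(self, requested: dict[str, str]) -> list[dict[str, tuple[str, str]]]:
        """`requested` maps a feature name to the predicate it selects."""
        if len(requested) != 1:
            # One result set cannot be honestly labelled with several names.
            raise ValueError(
                f"expected exactly one feature per load, got {len(requested)}: {sorted(requested)}"
            )
        name, predicate = next(iter(requested.items()))
        return [{name: (s, o)} for s, p, o in self._triples if p == predicate]


reader = SingleFeatureReader(
    [("alice", "knows", "bob"), ("alice", "works_with", "carol")]
)
requests = {"toy__knows": "knows", "toy__works_with": "works_with"}

# Request features individually: one load per feature.
results = {name: reader.load({name: predicate}) for name, predicate in requests.items()}

# Batching both into a single load is rejected instead of mislabelled.
try:
    reader.load(requests)
    rejected = False
except ValueError:
    rejected = True

print(results["toy__knows"], rejected)
```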

No-Docker testing policy. Every connector test runs against rdflib, networkx, kuzu (embedded), or file fixtures. No Docker, no external services, no network calls.

Demos

Three marimo notebooks plus two evaluation harnesses live under demo/:

- demo/demo_kg_connectors.py: surface tour of all 9 families against the shipped fixtures.
- demo/demo_kg_build_repo.py: builds an RDF graph from this repo (filesystem repo:contains + Python repo:imports), serializes to Turtle, and runs five SPARQL queries through RdfLibSparqlReader via mloda.run_all.
- demo/demo_kg_ontology.py: walks the ontology layer end to end.
- demo/eval_arch1_vs_arch2.py and demo/eval_qa_accuracy.py: evaluation harnesses comparing plain traversal vs. ontology-guided traversal.
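To make the repo-graph idea in demo_kg_build_repo.py concrete, here is a standard-library-only sketch of deriving repo:contains and repo:imports triples from a directory tree and writing them as Turtle. It is not the demo's code: the two `example.org` namespaces, the helper names, and the choice to store imported modules as string literals are all assumptions made for illustration, and the actual demo queries its graph through RdfLibSparqlReader.

```python
import ast
import tempfile
from pathlib import Path
from urllib.parse import quote

BASE = "http://example.org/repo/"   # hypothetical namespace for file and directory nodes
VOCAB = "http://example.org/repo#"  # hypothetical namespace behind the repo: prefix


def node(rel_path: str) -> str:
    """Turn a repo-relative POSIX path into an IRI reference."""
    return f"<{BASE}{quote(rel_path)}>"


def repo_triples(root: Path) -> list[tuple[str, str, str]]:
    triples = []
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root).as_posix()
        parent = "" if path.parent == root else path.parent.relative_to(root).as_posix()
        triples.append((node(parent), "repo:contains", node(rel)))
        if path.is_file() and path.suffix == ".py":
            tree = ast.parse(path.read_text(encoding="utf-8"))
            for stmt in ast.walk(tree):
                if isinstance(stmt, ast.Import):
                    modules = [alias.name for alias in stmt.names]
                elif isinstance(stmt, ast.ImportFrom) and stmt.module:
                    modules = [stmt.module]
                else:
                    continue
                for module in modules:
                    # Simplification: the imported module is a plain string literal.
                    triples.append((node(rel), "repo:imports", f'"{module}"'))
    return triples


def to_turtle(triples: list[tuple[str, str, str]]) -> str:
    lines = [f"@prefix repo: <{VOCAB}> .", ""]
    lines += [f"{s} {p} {o} ." for s, p, o in triples]
    return "\n".join(lines) + "\n"


# Build a throwaway two-level "repo" so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pkg").mkdir()
    (root / "pkg" / "mod.py").write_text("import os\nfrom pathlib import Path\n", encoding="utf-8")
    triples = repo_triples(root)
    turtle = to_turtle(triples)

print(turtle)
```

Resolving imports to file nodes instead of literals would let a query join repo:imports back onto repo:contains; the sketch skips that step to stay short.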

Install the demo extras and open any notebook:

```bash
uv sync --extra demo
marimo edit demo/demo_kg_connectors.py
```

Every demo runs offline against a small committed sample graph: no download, no network, no external services.

Data and acknowledgments

The ontology demo and the two evaluation harnesses run against a small hand-authored sample of public movie facts (demo/data/sample_kb.txt) written in the triple format of the MetaQA dataset (Zhang, Yuyu et al., "Variational Reasoning for Question Answering with Knowledge Graph", AAAI 2018, https://github.com/yuyuz/MetaQA). The sample is committed in this repo and is not derived from the MetaQA dataset files. The notebooks call demo.data.ensure_data() at startup, which builds the sample subgraph offline. To run against the full MetaQA benchmark (licensed under CC BY 3.0, not redistributed here), see demo/data/README.md.
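As a rough picture of what "the triple format of the MetaQA dataset" means in practice, the sketch below parses pipe-delimited `subject|relation|object` lines into tuples. It assumes that one-triple-per-line, pipe-separated layout; the three sample lines are illustrative public facts written for this example, not copied from demo/data/sample_kb.txt, and `parse_kb` and `index_by_subject` are hypothetical helpers rather than part of demo.data.

```python
from collections import defaultdict

# Illustrative lines only; not the contents of demo/data/sample_kb.txt.
SAMPLE_LINES = """\
Inception|directed_by|Christopher Nolan
Inception|release_year|2010
Inception|starred_actors|Leonardo DiCaprio
"""


def parse_kb(text: str) -> list[tuple[str, str, str]]:
    """Parse 'subject|relation|object' lines, skipping blank ones."""
    triples = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # maxsplit=2 keeps any further '|' characters inside the object.
        subject, relation, obj = line.split("|", 2)
        triples.append((subject, relation, obj))
    return triples


def index_by_subject(triples: list[tuple[str, str, str]]) -> dict[str, dict[str, list[str]]]:
    """Group objects by subject and relation for simple one-hop lookups."""
    index: dict[str, dict[str, list[str]]] = defaultdict(lambda: defaultdict(list))
    for subject, relation, obj in triples:
        index[subject][relation].append(obj)
    return index


triples = parse_kb(SAMPLE_LINES)
index = index_by_subject(triples)
print(index["Inception"]["directed_by"])
```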

Development setup

Install uv (if not already installed):

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

Create a virtual environment and install dependencies:

```bash
uv venv
source .venv/bin/activate
uv sync --all-extras
```

Run all checks with tox:

```bash
uv tool install tox --with tox-uv
tox
```

Run individual checks

```bash
pytest
ruff format --check --line-length 120 .
ruff check .
mypy --strict --ignore-missing-imports .
bandit -c pyproject.toml -r -q .
```
