Feature/named graph support #432
Conversation
Pull request overview
Adds SPARQL named graph support to the triplet store layer to enable dataset partitioning (multi-tenant / multi-env / lineage) within a single RDF store.
Changes:
- Added `graph`/`graphs` options to `TripletStore.execute_query()` and forwarded backend named-graph capability to the query engine.
- Implemented query preparation that injects `FROM`/`FROM NAMED` clauses before `WHERE` (with caching/validation applied to the prepared query).
- Added configuration/env support for default graph(s) plus tests and documentation examples.
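As a rough illustration of the clause-injection behavior, the following standalone sketch shows one way `FROM`/`FROM NAMED` clauses can be inserted before `WHERE`. This is not the project's actual `QueryEngine.prepare_query` implementation; it assumes pre-validated graph IRIs and a single top-level `WHERE` keyword.

```python
import re

def inject_dataset_clauses(query, graph=None, graphs=None):
    """Insert FROM / FROM NAMED clauses immediately before the WHERE keyword.

    Simplified sketch: assumes graph IRIs are already validated and that the
    query contains a single top-level WHERE.
    """
    clauses = []
    if graph:
        clauses.append(f"FROM <{graph}>")
    for g in graphs or []:
        clauses.append(f"FROM NAMED <{g}>")
    if not clauses:
        # Backward-compatible path: no graphs supplied, query is untouched.
        return query
    # Place the dataset clauses just before the first WHERE keyword.
    return re.sub(
        r"\bWHERE\b",
        " ".join(clauses) + " WHERE",
        query,
        count=1,
        flags=re.IGNORECASE,
    )
```

For example, `inject_dataset_clauses("SELECT ?s WHERE { ?s ?p ?o }", graph="http://example.org/g1")` yields `SELECT ?s FROM <http://example.org/g1> WHERE { ?s ?p ?o }`, while omitting both parameters leaves the query unchanged.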
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tests/triplet_store/test_triplet_store.py | Adds unit tests for forwarding graph options and query clause injection/isolation behavior. |
| semantica/triplet_store/triplet_store.py | Extends query execution API with graph parameters and named-graph capability defaults. |
| semantica/triplet_store/query_engine.py | Adds query preparation + dataset clause injection integrated into validation/caching/optimization flow. |
| semantica/triplet_store/config.py | Introduces env/default config keys for default graph(s) and enablement flag. |
| semantica/change_management/managers.py | Adjusts graph DROP query formatting during version pruning. |
| docs/reference/triplet_store.md | Documents named graph usage with write/query examples. |
```python
options.setdefault(
    "supports_named_graphs",
    self.backend_type in self.NAMED_GRAPH_CAPABLE_BACKENDS,
```
`enable_named_graphs` is added to config defaults/env parsing, but it is never used when computing `supports_named_graphs`. As a result, setting `TRIPLET_STORE_ENABLE_NAMED_GRAPHS=false` has no effect and graph clauses will still be injected/forwarded. Consider incorporating `self.config.get("enable_named_graphs", True)` into the `supports_named_graphs` default (and/or have `QueryEngine.prepare_query` respect it).
```diff
-options.setdefault(
-    "supports_named_graphs",
-    self.backend_type in self.NAMED_GRAPH_CAPABLE_BACKENDS,
+enable_named_graphs = getattr(self, "config", {}).get(
+    "enable_named_graphs", True
+)
+options.setdefault(
+    "supports_named_graphs",
+    enable_named_graphs
+    and self.backend_type in self.NAMED_GRAPH_CAPABLE_BACKENDS,
```
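A self-contained sketch of the gating behavior the suggestion asks for is shown below. The class shape, method name `_capability_options`, and the backend set are assumptions for illustration only; only the `setdefault` logic mirrors the suggested change.

```python
class TripletStore:
    """Minimal sketch of capability forwarding (names assumed from the review)."""

    # Illustrative set of backends; the real constant lives in the project.
    NAMED_GRAPH_CAPABLE_BACKENDS = {"fuseki", "graphdb", "blazegraph"}

    def __init__(self, backend_type, config=None):
        self.backend_type = backend_type
        self.config = config or {}

    def _capability_options(self, options):
        # Respect the enable_named_graphs flag when computing the default,
        # so TRIPLET_STORE_ENABLE_NAMED_GRAPHS=false actually disables injection.
        enable_named_graphs = self.config.get("enable_named_graphs", True)
        options.setdefault(
            "supports_named_graphs",
            enable_named_graphs
            and self.backend_type in self.NAMED_GRAPH_CAPABLE_BACKENDS,
        )
        return options
```

With this, a capable backend still reports `supports_named_graphs=False` when the flag is explicitly disabled, while the default (flag unset) preserves the current behavior.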
```python
resolved_graphs = [resolved_graphs]
resolved_graphs = [g for g in (resolved_graphs or []) if g]

if resolved_graph and resolved_graph not in resolved_graphs:
```
The duplicate-avoidance logic is inverted: the comment says to avoid duplicate URIs in `FROM NAMED`, but the code only filters `resolved_graph` out of `resolved_graphs` when `resolved_graph` is not in the list, which is a no-op. This can lead to injecting both `FROM <g>` and `FROM NAMED <g>` when the same URI is provided in both inputs.
```diff
-if resolved_graph and resolved_graph not in resolved_graphs:
+if resolved_graph and resolved_graph in resolved_graphs:
```
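Put together, the intended normalization and de-duplication might look like the following standalone sketch (the helper name `resolve_dataset_graphs` is hypothetical; only the variable names come from the diff above):

```python
def resolve_dataset_graphs(resolved_graph, resolved_graphs):
    """Normalize graph inputs and drop a URI from the FROM NAMED list
    when it is already the FROM graph, so the same IRI is never emitted
    as both FROM <g> and FROM NAMED <g>."""
    if isinstance(resolved_graphs, str):
        # Accept a single URI string as shorthand for a one-element list.
        resolved_graphs = [resolved_graphs]
    resolved_graphs = [g for g in (resolved_graphs or []) if g]
    if resolved_graph and resolved_graph in resolved_graphs:
        # Corrected condition: filter only when a duplicate actually exists.
        resolved_graphs = [g for g in resolved_graphs if g != resolved_graph]
    return resolved_graph, resolved_graphs
```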
```python
safe_graph_uri = str(graph_uri).strip().strip("<>")
triplet_store.execute_query(
    f"DROP SILENT GRAPH <{safe_graph_uri}>"
)
```
Graph URI sanitization here is insufficient: stripping `<`/`>` does not prevent SPARQL injection if `graph_uri` contains `>` or other delimiter/control characters. Prefer reusing the same URI sanitization/validation approach used in `QueryEngine` (percent-encode dangerous characters) or strictly validate `graph_uri` against an IRI pattern before interpolating it into the SPARQL update.
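One possible validate-before-interpolate approach is sketched below. The function name and the exact regex are illustrative assumptions, not the project's actual `QueryEngine` sanitizer; the pattern is deliberately conservative and rejects anything containing whitespace, angle brackets, quotes, or control characters.

```python
import re

# Conservative IRI check: a scheme, then characters legal in an IRI reference,
# rejecting control characters, whitespace, angle brackets, quotes, and braces.
_SAFE_IRI = re.compile(r'^[A-Za-z][A-Za-z0-9+.\-]*:[^\x00-\x20<>"{}|\\^`]*$')

def validate_graph_uri(graph_uri):
    """Return a URI safe to interpolate into a SPARQL update, or raise ValueError."""
    uri = str(graph_uri).strip().strip("<>")
    if not _SAFE_IRI.match(uri):
        raise ValueError(f"unsafe or malformed graph URI: {graph_uri!r}")
    return uri

# The pruning path could then interpolate the validated value, e.g.:
# triplet_store.execute_query(f"DROP SILENT GRAPH <{validate_graph_uri(graph_uri)}>")
```

An injection attempt such as `"http://ex.org/g> } ; DROP ALL"` fails the pattern and raises, instead of silently producing a second SPARQL statement.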
```python
env_mappings = {
    "TRIPLET_STORE_DEFAULT_STORE": "default_store",
    "TRIPLET_STORE_DEFAULT_GRAPH": "default_graph",
    "TRIPLET_STORE_DEFAULT_NAMED_GRAPHS": "default_graphs",
    "TRIPLET_STORE_BATCH_SIZE": "batch_size",
    "TRIPLET_STORE_ENABLE_CACHING": "enable_caching",
    "TRIPLET_STORE_CACHE_SIZE": "cache_size",
    "TRIPLET_STORE_ENABLE_OPTIMIZATION": "enable_optimization",
    "TRIPLET_STORE_ENABLE_NAMED_GRAPHS": "enable_named_graphs",
```
PR description mentions configuration key `default_graph_uri`, but the implementation introduces `default_graph` / `default_graphs` (and env vars `TRIPLET_STORE_DEFAULT_GRAPH`, `TRIPLET_STORE_DEFAULT_NAMED_GRAPHS`). Please align naming (either update docs/description or add backward-compatible alias keys) to avoid confusing users.
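A minimal sketch of a backward-compatible alias, assuming the canonical `default_graph` key should win over the documented `default_graph_uri` alias when both are set (the helper name and precedence are assumptions, not the merged implementation):

```python
def resolve_default_graph(config):
    """Resolve the default graph, honoring the documented default_graph_uri alias."""
    if "default_graph" in config:
        # Canonical key takes precedence when both are present.
        return config["default_graph"]
    # Fall back to the alias name used in the PR description and docs.
    return config.get("default_graph_uri")
```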
@KaifAhmad1 All checks are passing, please let me know if any changes are needed.
- honor `enable_named_graphs` flag when forwarding support
- prevent duplicate `FROM`/`FROM NAMED` clauses for same graph
- add `default_graph_uri` compatibility alias
- harden graph URI sanitization in prune `DROP GRAPH` path
- add regression tests for all fixes

Co-authored-by: Sameer6305 <sskadam6305@gmail.com>
Co-authored-by: KaifAhmad1 <kaifahmad087@gmail.com>
KaifAhmad1
left a comment
PR #432 Review Summary
Clear Feature Scope
This PR introduces SPARQL named graph support so Semantica can partition RDF data inside a single RDF store using graph URIs.
What It Improves
- Improves environment isolation: `dev`, `staging`, and `prod` can coexist safely in one store.
- Improves multi-tenant separation: tenant datasets can be logically isolated without creating separate databases.
- Improves query precision: consumers can target specific graph partitions via `FROM`/`FROM NAMED`.
- Improves operational simplicity: fewer physical stores to manage while still keeping dataset boundaries.
- Improves backward compatibility: existing flows continue unchanged when graph parameters are not provided.
Why This Matters
- Reduces infrastructure complexity and maintenance overhead.
- Enables safer enterprise deployments with clearer governance boundaries.
- Makes lineage/trust-tier modeling practical through graph-level partitioning.
- Provides a foundation for future graph-aware versioning and change management workflows.
Review Outcome
Approved after follow-up fixes.
Key Fixes Applied
- `enable_named_graphs` is now respected when forwarding named-graph capability in `TripletStore.execute_query()`.
- Duplicate dataset clauses were fixed so the same graph URI is not emitted as both `FROM` and `FROM NAMED`.
- Backward-compatible config alias support was added for `default_graph_uri`.
- Graph URI sanitization was hardened in pruning logic before issuing `DROP SILENT GRAPH` updates.
- Focused regression tests were added for all the above fixes.
Validation
- Targeted feature tests passed:
  - `tests/triplet_store/test_triplet_store.py`
  - `tests/change_management/test_managers.py`
- Result: 54 passed.
@KaifAhmad1 Thanks for the review and approval! Let me know if anything else needs to be updated from my side.
@Sameer6305 Thanks for the contribution. I've fixed the issues and everything looks good now.

Description
This PR introduces support for SPARQL named graphs to enable logical data partitioning within a single RDF store.
Problem Solved
Previously, all data was stored in a single default graph, which made it difficult to:

- isolate environments (`dev`/`staging`/`prod`) within one store
- separate tenant datasets logically
- model lineage or trust tiers at the dataset level

This limitation required maintaining separate databases for isolation, increasing operational complexity.
Solution
This PR adds native support for named graphs (graph URIs), allowing logical partitioning of data inside the same store using standard SPARQL constructs.
Type of Change
Related Issues
Closes #320
Changes Made
- Added optional `graph` and `graphs` parameters to TripletStore APIs for graph-aware read/write operations
- Extended QueryEngine to support SPARQL dataset clauses (`FROM`, `FROM NAMED`)
- Introduced configuration support for `default_graph_uri`
- Maintained backward compatibility by defaulting to existing behavior when no graph is provided
- Added graph isolation tests
- Updated documentation with usage examples
Testing
(`python -m build`)

Test Commands
Documentation
Breaking Changes
Breaking Changes: No
This change is fully backward compatible. Existing workflows continue to operate using the default graph when no graph parameter is provided.
Checklist
Additional Notes
This feature lays the foundation for:
The implementation is intentionally incremental and minimal to align with existing architecture while enabling future extensions in graph routing and change management.