
Feature/named graph support #432

Merged
KaifAhmad1 merged 7 commits into Hawksight-AI:main from Sameer6305:feature/named-graph-support
Apr 2, 2026
Conversation

Contributor

@Sameer6305 Sameer6305 commented Apr 2, 2026

Description

This PR introduces support for SPARQL named graphs to enable logical data partitioning within a single RDF store.

Problem Solved

Previously, all data was stored in a single default graph, which made it difficult to:

  • Isolate environments (dev / staging / prod)
  • Support multi-tenant datasets
  • Track data lineage and trust tiers
  • Perform scoped queries on specific datasets

This limitation required maintaining separate databases for isolation, increasing operational complexity.

Solution

This PR adds native support for named graphs (graph URIs), allowing logical partitioning of data inside the same store using standard SPARQL constructs.


Type of Change

  • New feature (non-breaking change which adds functionality)

Related Issues

Closes #320


Changes Made

  • Added optional graph and graphs parameters to TripletStore APIs for graph-aware read/write operations
  • Extended QueryEngine to support SPARQL dataset clauses (FROM, FROM NAMED)
    • Ensures clauses are injected before the WHERE block for valid SPARQL execution
  • Introduced configuration support for:
    • default_graph_uri
    • Environment-specific graph URIs (e.g., dev, staging, prod)
  • Maintained backward compatibility by defaulting to existing behavior when graph is not provided
  • Added graph isolation tests to ensure data written to one graph does not leak into another
  • Updated documentation with examples for writing to a named graph and querying specific graphs
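
The dataset-clause injection described above can be sketched as a small standalone helper. This is a minimal illustration of the behavior stated in this PR (clauses placed before the WHERE block, default behavior preserved when no graph is given, and a URI passed as both the default and a named graph emitted only once); the function and parameter names are illustrative, not the project's actual API.

```python
import re
from typing import List, Optional

def inject_dataset_clauses(query: str,
                           graph: Optional[str] = None,
                           graphs: Optional[List[str]] = None) -> str:
    """Insert FROM / FROM NAMED clauses immediately before the WHERE block.

    `graph` is queried as the default graph (FROM); each entry in `graphs`
    is exposed for GRAPH patterns (FROM NAMED). A URI passed in both inputs
    is emitted only once, as FROM.
    """
    named = [g for g in (graphs or []) if g and g != graph]
    clauses = []
    if graph:
        clauses.append(f"FROM <{graph}>")
    clauses.extend(f"FROM NAMED <{g}>" for g in named)
    if not clauses:
        return query  # backward compatible: query untouched when no graph given
    # Inject before the first WHERE keyword so the result is valid SPARQL.
    return re.sub(r"(?i)\bWHERE\b", " ".join(clauses) + " WHERE", query, count=1)

print(inject_dataset_clauses(
    "SELECT ?s ?p ?o WHERE { ?s ?p ?o }",
    graph="http://example.org/graphs/prod",
    graphs=["http://example.org/graphs/prod", "http://example.org/graphs/audit"],
))
```

Existing callers that pass no graph arguments get their query back unchanged, which is the backward-compatibility guarantee this PR claims.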

Testing

  • Tested locally
  • Added tests for new functionality
  • Package builds successfully (python -m build)

Test Commands

```shell
pip install build
python -m build

pytest tests/
```

Documentation

  • Updated relevant documentation
  • Added code examples for named graph usage
  • Updated API reference for new parameters

Breaking Changes

No.

This change is fully backward compatible. Existing workflows continue to operate using the default graph when no graph parameter is provided.


Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my code
  • My changes generate no new warnings
  • Package builds successfully

Additional Notes

This feature lays the foundation for:

  • Multi-environment deployments within a single RDF store
  • Multi-tenant architectures without database duplication
  • Improved data lineage and version tracking through graph-level isolation

The implementation is intentionally incremental and minimal to align with existing architecture while enabling future extensions in graph routing and change management.

Copilot AI review requested due to automatic review settings April 2, 2026 10:22

Copilot AI left a comment


Pull request overview

Adds SPARQL named graph support to the triplet store layer to enable dataset partitioning (multi-tenant / multi-env / lineage) within a single RDF store.

Changes:

  • Added graph / graphs options to TripletStore.execute_query() and forwarded backend named-graph capability to the query engine.
  • Implemented query preparation that injects FROM / FROM NAMED clauses before WHERE (with caching/validation applied to the prepared query).
  • Added configuration/env support for default graph(s) plus tests and documentation examples.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Summary per file:

  • tests/triplet_store/test_triplet_store.py — Adds unit tests for forwarding graph options and query clause injection/isolation behavior.
  • semantica/triplet_store/triplet_store.py — Extends query execution API with graph parameters and named-graph capability defaults.
  • semantica/triplet_store/query_engine.py — Adds query preparation and dataset clause injection integrated into the validation/caching/optimization flow.
  • semantica/triplet_store/config.py — Introduces env/default config keys for default graph(s) and an enablement flag.
  • semantica/change_management/managers.py — Adjusts graph DROP query formatting during version pruning.
  • docs/reference/triplet_store.md — Documents named graph usage with write/query examples.

πŸ’‘ Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +422 to +424

```python
options.setdefault(
    "supports_named_graphs",
    self.backend_type in self.NAMED_GRAPH_CAPABLE_BACKENDS,
```

Copilot AI Apr 2, 2026


enable_named_graphs is added to config defaults/env parsing, but it is never used when computing supports_named_graphs. As a result, setting TRIPLET_STORE_ENABLE_NAMED_GRAPHS=false has no effect and graph clauses will still be injected/forwarded. Consider incorporating self.config.get("enable_named_graphs", True) into the supports_named_graphs default (and/or have QueryEngine.prepare_query respect it).

Suggested change

```diff
-options.setdefault(
-    "supports_named_graphs",
-    self.backend_type in self.NAMED_GRAPH_CAPABLE_BACKENDS,
+enable_named_graphs = getattr(self, "config", {}).get(
+    "enable_named_graphs", True
+)
+options.setdefault(
+    "supports_named_graphs",
+    enable_named_graphs
+    and self.backend_type in self.NAMED_GRAPH_CAPABLE_BACKENDS,
```

```python
    resolved_graphs = [resolved_graphs]
resolved_graphs = [g for g in (resolved_graphs or []) if g]

if resolved_graph and resolved_graph not in resolved_graphs:
```

Copilot AI Apr 2, 2026


The duplicate-avoidance logic is inverted: the comment says to avoid duplicate URIs in FROM NAMED, but the code only filters resolved_graph out of resolved_graphs when resolved_graph is not in the list, which is a no-op. This can lead to injecting both FROM <g> and FROM NAMED <g> when the same URI is provided in both inputs.

Suggested change

```diff
-if resolved_graph and resolved_graph not in resolved_graphs:
+if resolved_graph and resolved_graph in resolved_graphs:
```

Comment on lines +391 to +394

```python
safe_graph_uri = str(graph_uri).strip().strip("<>")
triplet_store.execute_query(
    f"DROP SILENT GRAPH <{safe_graph_uri}>"
)
```

Copilot AI Apr 2, 2026


Graph URI sanitization here is insufficient: stripping </> does not prevent SPARQL injection if graph_uri contains > or other delimiter/control characters. Prefer reusing the same URI sanitization/validation approach used in QueryEngine (percent-encode dangerous characters) or strictly validate graph_uri against an IRI pattern before interpolating it into the SPARQL update.
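
As a rough illustration of that suggestion, a validator based on the SPARQL 1.1 IRIREF character restrictions (no `<`, `>`, `"`, `{`, `}`, `|`, `^`, backtick, backslash, or control/space characters) could reject unsafe input before interpolation. The helper name and pattern below are illustrative, not the project's actual code:

```python
import re

# Conservative allow-list for absolute IRIs: a scheme, then characters that
# can never terminate a SPARQL IRIREF. Sketch only, not the project's validator.
_IRI_RE = re.compile(r'^[A-Za-z][A-Za-z0-9+.\-]*:[^\x00-\x20<>"{}|^`\\]*$')

def drop_graph_update(graph_uri: str) -> str:
    """Build a DROP SILENT GRAPH update, refusing unsafe graph URIs."""
    uri = str(graph_uri).strip().strip("<>")
    if not _IRI_RE.match(uri):
        raise ValueError(f"refusing to interpolate unsafe graph URI: {graph_uri!r}")
    return f"DROP SILENT GRAPH <{uri}>"

print(drop_graph_update("http://example.org/graphs/v1"))
```

An input such as `http://x/> ; DROP ALL ; <http://y/` fails the pattern (it contains `>` and spaces) and raises instead of being spliced into the update.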

Comment on lines 110 to +118

```python
env_mappings = {
    "TRIPLET_STORE_DEFAULT_STORE": "default_store",
    "TRIPLET_STORE_DEFAULT_GRAPH": "default_graph",
    "TRIPLET_STORE_DEFAULT_NAMED_GRAPHS": "default_graphs",
    "TRIPLET_STORE_BATCH_SIZE": "batch_size",
    "TRIPLET_STORE_ENABLE_CACHING": "enable_caching",
    "TRIPLET_STORE_CACHE_SIZE": "cache_size",
    "TRIPLET_STORE_ENABLE_OPTIMIZATION": "enable_optimization",
    "TRIPLET_STORE_ENABLE_NAMED_GRAPHS": "enable_named_graphs",
```

Copilot AI Apr 2, 2026


PR description mentions configuration key default_graph_uri, but the implementation introduces default_graph / default_graphs (and env vars TRIPLET_STORE_DEFAULT_GRAPH, TRIPLET_STORE_DEFAULT_NAMED_GRAPHS). Please align naming (either update docs/description or add backward-compatible alias keys) to avoid confusing users.
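
A backward-compatible normalization along these lines could be sketched as follows. The key names come from this thread (`default_graph`, `default_graphs`, the legacy `default_graph_uri` alias, and `enable_named_graphs`); the helper itself is hypothetical:

```python
def resolve_graph_config(raw: dict) -> dict:
    """Normalize graph configuration, accepting the legacy `default_graph_uri`
    alias alongside `default_graph` / `default_graphs`. Sketch only."""
    graphs = raw.get("default_graphs") or []
    if isinstance(graphs, str):
        # Allow an env-style comma-separated value for the named-graph list.
        graphs = [g.strip() for g in graphs.split(",") if g.strip()]
    return {
        "default_graph": raw.get("default_graph") or raw.get("default_graph_uri"),
        "default_graphs": list(graphs),
        "enable_named_graphs": bool(raw.get("enable_named_graphs", True)),
    }

print(resolve_graph_config({"default_graph_uri": "http://example.org/graphs/prod"}))
```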

@Sameer6305
Contributor Author

@KaifAhmad1 All checks are passing; please let me know if any changes are needed.


- honor enable_named_graphs flag when forwarding support

- prevent duplicate FROM/FROM NAMED clauses for same graph

- add default_graph_uri compatibility alias

- harden graph URI sanitization in prune DROP GRAPH path

- add regression tests for all fixes

Co-authored-by: Sameer6305 <sskadam6305@gmail.com>

Co-authored-by: KaifAhmad1 <kaifahmad087@gmail.com>
Copilot AI review requested due to automatic review settings April 2, 2026 12:46

Copilot AI left a comment


Copilot was unable to review this pull request because the user who requested the review is ineligible. To be eligible to request a review, you need a paid Copilot license, or your organization must enable Copilot code review.

Contributor

@KaifAhmad1 KaifAhmad1 left a comment


PR #432 Review Summary

Clear Feature Scope

This PR introduces SPARQL named graph support so Semantica can partition RDF data inside a single RDF store using graph URIs.

What It Improves

  • Improves environment isolation: dev, staging, and prod can coexist safely in one store.
  • Improves multi-tenant separation: tenant datasets can be logically isolated without creating separate databases.
  • Improves query precision: consumers can target specific graph partitions via FROM / FROM NAMED.
  • Improves operational simplicity: fewer physical stores to manage while still keeping dataset boundaries.
  • Improves backward compatibility: existing flows continue unchanged when graph parameters are not provided.

Why This Matters

  • Reduces infrastructure complexity and maintenance overhead.
  • Enables safer enterprise deployments with clearer governance boundaries.
  • Makes lineage/trust-tier modeling practical through graph-level partitioning.
  • Provides a foundation for future graph-aware versioning and change management workflows.

Author: @Sameer6305

Reviewer: @KaifAhmad1

Review Outcome

Approved after follow-up fixes.

Key Fixes Applied

  1. enable_named_graphs is now respected when forwarding named-graph capability in TripletStore.execute_query().
  2. Duplicate dataset clauses were fixed so the same graph URI is not emitted as both FROM and FROM NAMED.
  3. Backward-compatible config alias support was added for default_graph_uri.
  4. Graph URI sanitization was hardened in pruning logic before issuing DROP SILENT GRAPH updates.
  5. Focused regression tests were added for all the above fixes.
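
Fix 1 can be illustrated with a minimal sketch: the capability default now consults the `enable_named_graphs` flag in addition to the backend type. The backend names and the helper below are illustrative; the project keeps its own `NAMED_GRAPH_CAPABLE_BACKENDS` set:

```python
def supports_named_graphs(backend_type: str, config: dict) -> bool:
    """Capability default that honors the enable_named_graphs flag.

    Backend names are illustrative placeholders, not the project's list.
    """
    capable_backends = {"jena", "virtuoso", "graphdb", "blazegraph"}
    return bool(config.get("enable_named_graphs", True)
                and backend_type in capable_backends)

print(supports_named_graphs("jena", {"enable_named_graphs": False}))
```

With this shape, setting TRIPLET_STORE_ENABLE_NAMED_GRAPHS=false (parsed into the config) actually suppresses clause injection, which was the gap the first review comment identified.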

Validation

  • Targeted feature tests passed:
    • tests/triplet_store/test_triplet_store.py
    • tests/change_management/test_managers.py
  • Result: 54 passed.

@Sameer6305
Copy link
Copy Markdown
Contributor Author

@KaifAhmad1 Thanks for the review and approval! πŸ™Œ

Let me know if anything else needs to be updated from my side πŸ‘

@KaifAhmad1
Copy link
Copy Markdown
Contributor

> @KaifAhmad1 Thanks for the review and approval! 🙌
>
> Let me know if anything else needs to be updated from my side 👍

@Sameer6305 Thanks for the contribution. I’ve fixed the issues and everything looks good now. πŸš€

@KaifAhmad1 KaifAhmad1 merged commit 907f0e8 into Hawksight-AI:main Apr 2, 2026
5 checks passed


Development

Successfully merging this pull request may close these issues.

[FEATURE] Named Graph Support

3 participants