priya-gitTest/DFDP2

🧠 DFDP2 – DICOM to RDF Processing and Visualization Demo

This project is a self-contained web application that demonstrates a complete pipeline for:

  • Processing DICOM files
  • Extracting key metadata
  • Mapping it to semantic ontologies (ROO, SNOMED CT, FOAF, etc.)
  • Generating an in-memory RDF knowledge graph
  • Providing a web-based interface for dataset discovery, SPARQL querying, and graph visualization

Built with Python, the app uses:

  • FastAPI for the web server
  • pydicom for handling DICOM files
  • rdflib for RDF generation and SPARQL querying
  • D3.js for frontend graph visualization

🚀 Features

| Feature | Description |
| --- | --- |
| 🏥 DICOM Processing | Processes DICOM files sourced from The Cancer Imaging Archive (TCIA) |
| 📑 Metadata Extraction | Extracts Patient ID, Study Date, Modality, Accession Number, etc. |
| 📚 Semantic Mapping | Maps values to ROO, SNOMED CT, and FOAF ontologies |
| 🔗 RDF Generation | Builds triples and populates an in-memory knowledge graph |
| 🔍 SPARQL Endpoint | Supports SPARQL 1.1 queries via a web form |
| 📂 Metadata Catalog | Web interface styled after FAIR Data Platforms (Health DCAT-AP) |
| 🕸️ Knowledge Graph Visualization | In-browser graph using force-directed layout |

📦 Installation

✅ Prerequisites

  • Python 3.7+
  • pip (Python package manager)

🐙 GitHub Codespaces (Recommended)

  1. Open the repository in GitHub Codespaces.

🔧 Install Dependencies


```bash
pip install "fastapi[all]" uvicorn pydicom rdflib requests python-multipart Jinja2
```

▢️ Running the Application

# Load the DICOM images from the Hugging Face repository and convert them to DICOM files,
# or download them from The Cancer Imaging Archive (TCIA). The files are stored in the dicom_files/ folder.
python fetch_dicom.py          # generates dicom_metadata.json
python map_dicom_complete.py   # generates dicom_mapped_with_catalog.ttl
# Then start the FastAPI app
uvicorn main:app --reload

If you get an error such as "ERROR: [Errno 98] Address already in use", find and stop the process that is holding port 8000:

lsof -i :8000   # sample output: uvicorn 18407 codespace    3u  IPv4 257296      0t0  TCP localhost:8000 (LISTEN)
kill -9 XXXX    # replace XXXX with the PID shown next to uvicorn (18407 in the sample above)

Output: [1]+ Killed uvicorn main:app --reload
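
As an alternative to inspecting `lsof` output by hand, the port can be checked from Python before starting the server. This is a minimal sketch using only the standard library; the helper name `port_in_use` is hypothetical and not part of this repository.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a process is already accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds, i.e. a listener exists
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    if port_in_use(8000):
        print("Port 8000 is busy: stop the old process or start uvicorn with --port 8001")
    else:
        print("Port 8000 is free")
```

If the port is busy and the old process should keep running, starting the app on another port with uvicorn's `--port` option avoids the conflict.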

Once the server is running, visit http://127.0.0.1:8000.

The terminal shows startup logs like:

INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     Started server process [xxxxx]
INFO:     Application startup complete.

🌐 Web Interface Overview

| Page | URL | Description |
| --- | --- | --- |
| 🏠 Home | / | Upload DICOMs, view summary |
| 📚 Catalog | /catalog | View/search processed DICOM datasets |
| 🌐 SPARQL | / (form) | Query the RDF graph using SPARQL |
| 🧬 Visualize | /visualize | Interactive graph of datasets and relationships |

⚙️ How It Works

The scripts run the following steps:

  • Read the DICOM files sourced from TCIA and stored in dicom_files/
  • Extract metadata such as PatientID, StudyDate, and Modality
  • Map the extracted values to SNOMED CT and ROO terms
  • Generate RDF triples
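
The last two steps can be illustrated with a small standard-library sketch that turns one extracted metadata record into N-Triples lines. It is not the code in `map_dicom_complete.py` (which uses rdflib); the function names, the example subject URI, and the choice to type dates as `xsd:date` are assumptions made for illustration. The `dicom:` namespace is the one used in the SPARQL queries of this README, and DICOM date values are written as YYYYMMDD.

```python
from datetime import datetime
from typing import Dict, List

DICOM_NS = "http://dicom.nema.org/resources/ontology/DCM#"  # prefix used in this README's queries
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"


def dicom_date_to_iso(value: str) -> str:
    """Convert a DICOM date (YYYYMMDD) to ISO 8601 (YYYY-MM-DD)."""
    return datetime.strptime(value, "%Y%m%d").date().isoformat()


def record_to_ntriples(subject_uri: str, record: Dict[str, str]) -> List[str]:
    """Build one N-Triples line per non-empty metadata field (hypothetical helper)."""
    lines = []
    for tag, value in record.items():
        if value in (None, ""):
            continue  # skip tags that were missing in the DICOM header
        predicate = f"<{DICOM_NS}{tag}>"
        if tag.endswith("Date"):
            # Assumption: dates are stored as typed xsd:date literals
            obj = f'"{dicom_date_to_iso(value)}"^^<{XSD_DATE}>'
        else:
            escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{subject_uri}> {predicate} {obj} .")
    return lines


if __name__ == "__main__":
    sample = {"PatientID": "P001", "StudyDate": "20200115", "Modality": "CT", "AccessionNumber": ""}
    for line in record_to_ntriples("http://example.org/dicom/file1", sample):
        print(line)
```

In the actual pipeline, rdflib handles term construction and serialization, so a helper like this would only be useful for understanding the shape of the generated triples.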

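🧪 SPARQL Query Examples

The queries below can be pasted into the SPARQL form. Query 1 lists up to 10 DICOM files with their patient ID, series date, modality, and accession number, ordered by date. As written it does not filter by modality; restricting it to CT would need an added condition on ?modality.

# Query 1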
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dicom: <http://dicom.nema.org/resources/ontology/DCM#>

SELECT ?file ?patientId ?studyDate ?modality ?accessionNumber
WHERE {
  ?dataset a dcat:Dataset ;
           dcat:distribution ?file .
  
  ?file dicom:PatientID ?patientId ;
        dicom:SeriesDate ?studyDate ;
        dicom:Modality ?modality ;
        dicom:AccessionNumber ?accessionNumber .
}
ORDER BY ?studyDate
LIMIT 10

# Query 2: distinct patient IDs for files whose manufacturer is "GE MEDICAL SYSTEMS"
PREFIX dicom: <http://dicom.nema.org/resources/ontology/DCM#>

SELECT DISTINCT ?patientId
WHERE {
  ?subject dicom:Manufacturer "GE MEDICAL SYSTEMS" .
  ?subject dicom:PatientID ?patientId .
}

# Query 3: distinct manufacturer and model name pairs
PREFIX dicom: <http://dicom.nema.org/resources/ontology/DCM#>

SELECT DISTINCT ?manufacturer ?modelName
WHERE {
  ?file dicom:Manufacturer ?manufacturer .
  ?file dicom:ManufacturerModelName ?modelName .
}
ORDER BY ?manufacturer ?modelName

# Query 4: patient ID and body part, with age, sex, and reason for study where available
PREFIX dicom: <http://dicom.nema.org/resources/ontology/DCM#>
PREFIX roo: <http://www.cancerdata.org/roo/>

SELECT DISTINCT ?patientId ?bodyPart ?age ?sex ?reasonForStudy
WHERE {
  ?subject dicom:PatientID ?patientId .
  ?subject dicom:BodyPartExamined ?bodyPart .

  OPTIONAL { ?subject roo:hasAge ?age . }
  OPTIONAL { ?subject roo:hasSex ?sex . }
  OPTIONAL { ?subject roo:hasReasonForStudy ?reasonForStudy . }
}

# Query 5: number of files per patient
PREFIX dicom: <http://dicom.nema.org/resources/ontology/DCM#>
PREFIX dcat: <http://www.w3.org/ns/dcat#>

SELECT ?patientId (COUNT(?file) AS ?numberOfScans)
WHERE {
  ?dataset a dcat:Dataset ;
           dcat:distribution ?file .
  ?file dicom:PatientID ?patientId .
}
GROUP BY ?patientId
ORDER BY ?patientId
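
When a query such as Query 2 needs a different manufacturer name each time, the query text can be generated rather than edited by hand. The sketch below uses only the standard library; both function names are hypothetical and not part of this repository. It escapes backslashes, double quotes, and line breaks so that the value stays a single SPARQL string literal.

```python
def sparql_string_literal(text: str) -> str:
    """Wrap text in double quotes, escaping characters that would end the literal."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def patients_by_manufacturer_query(manufacturer: str) -> str:
    """Return the text of Query 2 for an arbitrary manufacturer name."""
    return (
        "PREFIX dicom: <http://dicom.nema.org/resources/ontology/DCM#>\n"
        "\n"
        "SELECT DISTINCT ?patientId\n"
        "WHERE {\n"
        f"  ?subject dicom:Manufacturer {sparql_string_literal(manufacturer)} .\n"
        "  ?subject dicom:PatientID ?patientId .\n"
        "}\n"
    )


if __name__ == "__main__":
    print(patients_by_manufacturer_query("GE MEDICAL SYSTEMS"))
```

The printed text can be pasted into the SPARQL form as is. The match is exact, so the name must be spelled the same way as in the DICOM headers.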

📚 Ontologies Used

| Prefix | URI |
| --- | --- |
| ROO | https://www.cancerdata.org/roo-information |
| SNOMED | https://bioportal.bioontology.org/ontologies/SNOMEDCT |
| DCAT | http://www.w3.org/ns/dcat# |
| FOAF | http://xmlns.com/foaf/0.1/ |
| dicom | http://dicom.nema.org/resources/ontology/DCM# |
| DCTERMS | http://purl.org/dc/terms/ |

📁 Directory Structure

.
├── main.py             # FastAPI application
├── templates/          # HTML templates (auto-generated)
├── static/             # Static assets (CSS/JS)
├── dicom_files/        # DICOM files downloaded via script

📈 Visualization

At /visualize, you'll find a D3.js-based graph of the RDF data:

  • Nodes are color-coded by type (e.g., Patient, Modality)
  • Drag nodes to explore relationships
  • Hover over nodes and edges to view URIs and labels
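
A force-directed D3.js graph is typically fed a JSON object with a list of nodes and a list of links. The sketch below shows one way triples could be reshaped into that form. It is an illustration under assumptions, not the code behind /visualize: the function name and the `label` field are hypothetical, and triples are represented as plain string tuples rather than rdflib terms.

```python
import json
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]


def triples_to_d3(triples: Iterable[Triple]) -> Dict[str, List[dict]]:
    """Turn (subject, predicate, object) tuples into a nodes/links structure."""
    nodes: Dict[str, dict] = {}
    links: List[dict] = []
    for subject, predicate, obj in triples:
        # Each distinct subject or object becomes one node, keyed by its string value
        for term in (subject, obj):
            nodes.setdefault(term, {"id": term})
        # The predicate becomes the edge label between the two nodes
        links.append({"source": subject, "target": obj, "label": predicate})
    return {"nodes": list(nodes.values()), "links": links}


if __name__ == "__main__":
    sample = [
        ("ex:file1", "dicom:Modality", "CT"),
        ("ex:file1", "dicom:PatientID", "P001"),
    ]
    print(json.dumps(triples_to_d3(sample), indent=2))
```

On the D3 side, `d3.forceLink` can resolve `source` and `target` against node ids when it is configured with an id accessor. Keying nodes by their string value means that a literal and a URI with identical text would collapse into one node, which a fuller implementation would need to handle.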

📌 To-Do / Ideas for Future

  • Persistent RDF store (e.g., Blazegraph, Fuseki)
  • Support for real-world DICOM tags and vocabularies
  • Authentication for upload and SPARQL features
  • Multi-user catalog and permission system


📄 License

MIT License
Free to use, modify, and distribute with proper attribution.
