prinskumar-tigergraph (Collaborator) commented Dec 22, 2025

User description

Graph Lock Utility:
- Added common/utils/graph_locks.py for centralized rebuild lock management.

  • Upload, create_ingest, and ingest are blocked while another user is working on the same graph, but are allowed for a user working on a different graph
  • Rebuild is locked so that only one graph can be rebuilt at a time, until the rebuild is done
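
The two rules above could be sketched roughly as follows. This is not the contents of common/utils/graph_locks.py, which is not shown in this conversation; every function and variable name here is hypothetical, and the sketch assumes a single-process, in-memory lock table.

```python
import threading
from typing import Optional, Set

# Hypothetical in-memory state; the real graph_locks.py may differ.
_state_lock = threading.Lock()            # guards the two structures below
_busy_graphs: Set[str] = set()            # graphs with an upload/ingest in flight
_rebuilding_graph: Optional[str] = None   # at most one rebuild at a time


def acquire_graph_lock(graphname: str) -> bool:
    """Try to reserve a graph for upload / create_ingest / ingest."""
    with _state_lock:
        if graphname in _busy_graphs:
            return False                  # same graph is busy: reject
        _busy_graphs.add(graphname)
        return True                       # other graphs are unaffected


def release_graph_lock(graphname: str) -> None:
    """Release a per-graph lock; a no-op if it is not held."""
    with _state_lock:
        _busy_graphs.discard(graphname)


def acquire_rebuild_lock(graphname: str) -> bool:
    """Try to take the global rebuild slot for one graph."""
    global _rebuilding_graph
    with _state_lock:
        if _rebuilding_graph is not None:
            return False                  # some graph is already rebuilding
        _rebuilding_graph = graphname
        return True


def release_rebuild_lock(graphname: str) -> None:
    """Free the rebuild slot, but only if this graph owns it."""
    global _rebuilding_graph
    with _state_lock:
        if _rebuilding_graph == graphname:
            _rebuilding_graph = None
```

A caller would typically return a conflict message to the UI when an acquire function returns False, and release the lock in a finally block once the operation ends.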

PR Type

Enhancement, Bug fix


Description

  • One JSONL per file, temp storage

  • Add per-graph and global rebuild locks

  • PDF extraction via pymupdf4llm with image LLM

  • UI auto-processes uploads/downloads, preserves warnings


Diagram Walkthrough

flowchart LR
  U["Upload/Download files"] --> CI["create_ingest"]
  CI --> TE["TextExtractor: per-file JSONL in ingestion_temp"]
  TE --> IN["ingest: runLoadingJobWithFile"]
  IN --> TG["TigerGraph loaded"]
  FU["forceupdate"] -- "trigger" --> ECC["ECC service"]
  ECC -- "status polling" --> RL["release rebuild lock"]
  GL["Graph locks"] -- "guard" --> CI
  GL -- "guard" --> IN
  GL -- "guard" --> U

File Walkthrough

Relevant files

Enhancement (6 files)
  • text_extractors.py: Per-file JSONL output and PDF/image pipeline (+255/-146)
  • ui.py: Graph locks and rebuild monitoring added (+184/-14)
  • supportai.py: Ingest from temp JSONL; config fixes (+74/-39)
  • image_data_extractor.py: Simplify image LLM call; remove legacy (+32/-134)
  • graph_locks.py: Introduce per-graph and global rebuild locks (+132/-0)
  • Setup.tsx: UI auto-processing, new ingest flow and messages (+286/-81)

Documentation (1 file)
  • pymupdf4llm-AGPL-3.0.txt: Add pymupdf4llm AGPL-3.0 license (+661/-0)

Dependencies (1 file)
  • requirements.txt: Add pymupdf4llm; bump PyMuPDF version (+2/-1)

- Removed temp_session_id UUID generation from supportai.py
- Temp folders now use consistent path: base_dir/ingestion_temp/graphname
- Fixed delete endpoints to remove corresponding JSONL files when raw files are deleted
- Add logo identification instruction to image LLM prompt
- Use runLoadingJobWithFile for memory-efficient data loading
- Preserve temp JSONL files after ingestion for faster re-ingestion
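
A minimal sketch of the path convention and the delete behavior listed above, under stated assumptions: the helper names are hypothetical, and the rule that each JSONL file is named after the stem of its raw file is a guess, since the PR only says there is one JSONL per file under base_dir/ingestion_temp/graphname.

```python
from pathlib import Path
from typing import List


def temp_dir_for(base_dir: str, graphname: str) -> Path:
    """Temp folder per graph: base_dir/ingestion_temp/graphname."""
    return Path(base_dir) / "ingestion_temp" / graphname


def jsonl_path_for(base_dir: str, graphname: str, raw_filename: str) -> Path:
    # Assumption: the JSONL file reuses the raw file's stem, e.g. report.pdf -> report.jsonl
    return temp_dir_for(base_dir, graphname) / (Path(raw_filename).stem + ".jsonl")


def delete_raw_and_jsonl(raw_path: Path, base_dir: str, graphname: str) -> List[Path]:
    """Delete a raw file together with its JSONL counterpart, if present."""
    removed: List[Path] = []
    for path in (raw_path, jsonl_path_for(base_dir, graphname, raw_path.name)):
        if path.exists():
            path.unlink()
            removed.append(path)
    return removed
```

Deriving the temp path from the graph name alone, rather than from a per-session UUID, is what lets a later ingest or delete call find the same files again.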
… files

- Added graph_locks.py utility for managing rebuild lock state
- Fixed lock conflict message vanishing immediately in Setup.tsx by preventing polling from clearing warning messages
- Enabled re-ingestion of already processed files by not clearing ingestJobData after successful ingestion
@prinskumar-tigergraph prinskumar-tigergraph changed the title Gml 2011 graph memory lock GML - 2011 graph memory lock Dec 22, 2025
@prinskumar-tigergraph prinskumar-tigergraph changed the title GML - 2011 graph memory lock GML-2011 graph memory lock Dec 22, 2025

tg-pr-agent bot commented Dec 22, 2025

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Missing Import

The new rebuild monitoring code uses time.sleep but there is no corresponding import time in this module. This will raise a NameError at runtime when forceupdate is called.

while elapsed < max_wait_time:
    time.sleep(poll_interval)
    elapsed += poll_interval

    try:
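
For illustration, a self-contained version of such a polling loop with the import in place might look like the sketch below. The body of the try block is not visible in the excerpt, so the status check here is an assumption, passed in as a hypothetical check_done callable; the function name and the injectable sleep parameter are also made up for this example.

```python
import time
from typing import Callable


def wait_for_rebuild(
    check_done: Callable[[], bool],
    max_wait_time: float,
    poll_interval: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until check_done() returns True or max_wait_time elapses."""
    elapsed = 0.0
    while elapsed < max_wait_time:
        sleep(poll_interval)
        elapsed += poll_interval
        try:
            if check_done():
                return True
        except Exception:
            # Assumed policy: treat a failed status check as transient
            # and keep polling until the timeout is reached.
            continue
    return False
```

Passing sleep as a parameter keeps the loop testable without real delays.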
Ingest Path Mismatch

The UI stores and later sends data_path as the original source folder (uploads/ or downloaded_files_cloud/) instead of the temp JSONL folder created server-side, which the backend ingest now expects. This likely causes ingest to read the wrong location. Align the path used by handleRunIngest/ingest with the temp folder returned by create_ingest.

}

const createData = await createResponse.json();
console.log("Create ingest response:", createData);

// Store ingest job data for later use (store folderPath as source_data_path for temp folder deletion)
setIngestJobData({
  load_job_id: createData.load_job_id,
  data_source_id: createData.data_source_id,
  data_path: folderPath,  // Use the source folderPath, not the backend's "in_temp_storage"
});

Behavior Change

Small-image and logo filtering were removed for standalone images, which may introduce noise and unnecessary storage and processing. Consider reintroducing the filtering, or adding thresholds or config flags to control this behavior.

pil_image = PILImage.open(file_path)
if pil_image.width < 100 or pil_image.height < 100:
    pass

description = describe_image_with_llm(str(Path(file_path).absolute()))

buffer = io.BytesIO()
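
One way to make the size check configurable, sketched under assumptions: the function name and both parameters are hypothetical, and only the 100-pixel threshold comes from the excerpt above. Logo detection is not covered, since it cannot be decided from dimensions alone.

```python
def should_describe_image(
    width: int,
    height: int,
    min_side: int = 100,
    filter_small: bool = True,
) -> bool:
    """Return False for images that should be skipped before the LLM call."""
    if filter_small and (width < min_side or height < min_side):
        return False  # too small on at least one side
    return True
```

In the excerpt, the branch for small images currently ends in pass, so a check like this would need to skip the describe_image_with_llm call when it returns False.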

chengbiao-jin (Collaborator) left a comment


Please keep forceupdate() a non-blocking call.
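
One possible way to address this, as a sketch only: hand the status polling to a background thread so the request handler returns immediately, and release the rebuild lock when the monitor finishes. The helper name and both callables are hypothetical; the PR's actual forceupdate handler and its framework are not shown here.

```python
import threading
from typing import Callable


def start_rebuild_monitor(
    monitor: Callable[[], None],
    release_lock: Callable[[], None],
) -> threading.Thread:
    """Run monitor() in the background and always release the lock afterwards."""

    def _run() -> None:
        try:
            monitor()        # e.g. the status polling loop
        finally:
            release_lock()   # runs even if monitor() raises

    thread = threading.Thread(target=_run, daemon=True, name="rebuild-monitor")
    thread.start()
    return thread            # the caller does not join, so it is not blocked
```

With this shape, forceupdate() would trigger the rebuild, call start_rebuild_monitor, and return right away.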
