Skip to content

Conversation

@RaviTeja799
Copy link

@RaviTeja799 RaviTeja799 commented Jan 5, 2026

Description

This PR addresses the thread-safety issue in Job.get_root() (Issue #3098). When multiple concurrent requests access the job tree structure, a MultipleObjectsReturned exception can occur. This fix replaces the previous silent workaround with a deterministic, logged fallback.

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Key Improvements

  • Deterministic Ordering: Added .order_by("pk").first() in the exception handler to ensure consistent results across all concurrent requests.
  • Explicit Logging: Replaces the silent failure with logger.error, providing the Job PK and path for better data integrity monitoring.
  • Instance Bug Fix: Corrected self.objects to type(self).objects, as model instances do not have an objects attribute.
  • Deadlock Prevention: Intentionally avoided select_for_update() to prevent potential database deadlocks.

Technical Decision: Why NOT select_for_update()?

I have intentionally avoided select_for_update() to prevent potential database deadlocks in high-concurrency environments, especially given IntelOwl's use of multiple Celery workers. The locking approach:

  1. Can cause deadlocks under high concurrency.
  2. Adds unnecessary overhead for read operations.
  3. Doesn't actually fix the root cause (which is in django-treebeard's tree modification operations).

This fix prioritizes deterministic ordering and explicit error logging to resolve the issue safely without risking infrastructure stability.

Testing

I've added four new test cases in tests/api_app/test_models.py:

  1. test_get_root_returns_self_when_is_root
  2. test_get_root_returns_parent_for_child_job
  3. test_get_root_deterministic_ordering
  4. test_get_root_handles_multiple_roots_deterministically (Simulates tree corruption to verify the fix).

How to Test

You can verify the fix by running these specific tests:

python manage.py test tests.api_app.test_models.JobTestCase.test_get_root_deterministic_ordering
python manage.py test tests.api_app.test_models.JobTestCase.test_get_root_handles_multiple_roots_deterministically

Checklist:

  • I have read and understood the rules about how to Contribute to this project
  • The pull request is for the branch develop
  • All unit tests passed locally.

Copilot AI review requested due to automatic review settings January 5, 2026 01:23
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a thread-safety issue in Job.get_root() where concurrent requests could trigger a MultipleObjectsReturned exception due to race conditions in django-treebeard's tree structure. The fix implements deterministic ordering and explicit error logging to handle corrupted tree states gracefully.

Key Changes:

  • Replaced silent exception handling with deterministic .order_by("pk").first() fallback
  • Added comprehensive error logging to track tree integrity issues
  • Fixed instance bug: corrected self.objects to type(self).objects
  • Added four new test cases covering normal operation, child job traversal, and corrupted tree scenarios

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
api_app/models.py Updated get_root() method with deterministic ordering, proper logging, and detailed comments explaining the design decision to avoid select_for_update()
tests/api_app/test_models.py Added comprehensive test coverage including edge case simulation of tree corruption with multiple roots having the same path

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 661 to 673
# Cleanup
child_job.delete()
duplicate_root.delete()
root_job.delete()
an.delete()
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cleanup order could be simplified. Since the Analyzable has CASCADE delete for Jobs (api_app/models.py line 367), deleting the Analyzable will automatically cascade delete all related Jobs. Consider deleting just the Analyzable at the end, which would handle all three jobs automatically and ensure proper cascade order.

Copilot uses AI. Check for mistakes.
@RaviTeja799
Copy link
Author

Hi @mlodic, the formatting issues are now fixed and the PR is ready for review. Could you please approve the workflow runs? @copilot

@RaviTeja799
Copy link
Author

Hi @mlodic, I am extremely sorry the failing of backend test cases This is my first PR , I am new , trying to adapt and willing to learn please give me some time I will fix the things

@RaviTeja799
Copy link
Author

RaviTeja799 commented Jan 7, 2026

I've updated the tests and fixed the Black and Flake8style issues. All tests pass locally. Could you @drosetti @mlodic please trigger the CI workflows for the latest commit? Thanks!

@mlodic
Copy link
Member

mlodic commented Jan 8, 2026

@RaviTeja799 please fix CI issue

@RaviTeja799 RaviTeja799 force-pushed the fix/issue-3098-thread-safety branch from 207cd40 to 725ee9b Compare January 8, 2026 18:51
.order_by("pk")
.first()
)
logger.error(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just raise a warning here, we don't know how much noisy this can be

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mlodic!

That is a valuable suggestion thanks for the help and support. I would definitely follow you advice to avoid creating unnecessary noise in the logs. I've updated the code to use logger.warning instead of logger.error as you suggested.

I also updated the new test case to reflect this change so the CI stays green. Thanks for the feedback!

@mlodic
Copy link
Member

mlodic commented Jan 9, 2026

asking another review from another maintainer @fgibertoni @0ssigeno

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants