Fix: Thread-safe get_root() with deterministic ordering and proper logging (Issue #3098) #3158
base: develop
Pull request overview
This PR fixes a thread-safety issue in Job.get_root() where concurrent requests could trigger a MultipleObjectsReturned exception due to race conditions in django-treebeard's tree structure. The fix implements deterministic ordering and explicit error logging to handle corrupted tree states gracefully.
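The fallback described above can be sketched without Django. This is only an illustration under stated assumptions: `FakeManager`, its `first_by_pk` helper, and the local `MultipleObjectsReturned` class are hypothetical stand-ins for the ORM, and the 4-character root path assumes treebeard's default materialized-path step length. It is not the code from the PR.

```python
import logging

logger = logging.getLogger(__name__)


class MultipleObjectsReturned(Exception):
    """Local stand-in for Django's exception of the same name."""


class FakeManager:
    """Hypothetical in-memory substitute for a Django model manager."""

    def __init__(self, rows):
        self.rows = rows

    def get(self, path):
        matches = [row for row in self.rows if row.path == path]
        if not matches:
            raise LookupError(path)
        if len(matches) > 1:
            raise MultipleObjectsReturned(path)
        return matches[0]

    def first_by_pk(self, path):
        # Plays the role of .filter(path=path).order_by("pk").first()
        matches = sorted(
            (row for row in self.rows if row.path == path), key=lambda row: row.pk
        )
        return matches[0] if matches else None


class Job:
    objects = None  # assigned once the fake rows exist
    steplen = 4  # assumption: treebeard's default path step length

    def __init__(self, pk, path):
        self.pk = pk
        self.path = path

    def get_root(self):
        root_path = self.path[: self.steplen]
        manager = type(self).objects  # class-level access, not self.objects
        try:
            return manager.get(root_path)
        except MultipleObjectsReturned:
            # Corrupted tree: warn, then pick the same row on every call.
            logger.warning(
                "Multiple roots for job %s with path %s", self.pk, self.path
            )
            return manager.first_by_pk(root_path)


# Simulated corruption: two roots share the path "0001".
jobs = [Job(7, "0001"), Job(3, "0001"), Job(9, "00010001")]
Job.objects = FakeManager(jobs)
root = jobs[2].get_root()  # resolves to the duplicate with the lowest pk
```

Ordering by primary key does not repair the duplicate roots; it only makes every caller agree on the same one while the warning surfaces the problem.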
Key Changes:
- Replaced silent exception handling with deterministic `.order_by("pk").first()` fallback
- Added comprehensive error logging to track tree integrity issues
- Fixed instance bug: corrected `self.objects` to `type(self).objects`
- Added four new test cases covering normal operation, child job traversal, and corrupted tree scenarios
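The `self.objects` correction matters because Django exposes managers on the model class only and raises `AttributeError` when one is accessed through an instance. The sketch below imitates that guard with a plain Python descriptor; the `ManagerDescriptor` class here is a simplified imitation written for illustration, and the string used as the manager is a placeholder.

```python
class ManagerDescriptor:
    """Simplified imitation of how Django guards manager access."""

    def __init__(self, manager):
        self.manager = manager

    def __get__(self, instance, owner):
        # Access through an instance is rejected; access through the class works.
        if instance is not None:
            raise AttributeError(
                f"Manager isn't accessible via {owner.__name__} instances"
            )
        return self.manager


class Job:
    objects = ManagerDescriptor(manager="fake-manager")  # placeholder manager


job = Job()
via_class = type(job).objects  # the pattern adopted in the fix

try:
    job.objects  # the pattern that was replaced
    message = None
except AttributeError as exc:
    message = str(exc)
```

Using `type(self)` rather than hard-coding the class name also keeps the lookup correct for subclasses.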
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| api_app/models.py | Updated get_root() method with deterministic ordering, proper logging, and detailed comments explaining the design decision to avoid select_for_update() |
| tests/api_app/test_models.py | Added comprehensive test coverage including edge case simulation of tree corruption with multiple roots having the same path |
    # Cleanup
    child_job.delete()
    duplicate_root.delete()
    root_job.delete()
    an.delete()
Copilot AI, Jan 5, 2026
The cleanup order could be simplified. Since the Analyzable has CASCADE delete for Jobs (api_app/models.py line 367), deleting the Analyzable will automatically cascade delete all related Jobs. Consider deleting just the Analyzable at the end, which would handle all three jobs automatically and ensure proper cascade order.
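The cascade effect mentioned in this comment can be illustrated with the standard-library `sqlite3` module. This is a sketch only: the table and column names are hypothetical, and Django emulates `on_delete=CASCADE` inside the ORM rather than relying on a database-level constraint, so the snippet shows the outcome rather than Django's mechanism.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE analyzable (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE job ("
    "id INTEGER PRIMARY KEY, "
    "analyzable_id INTEGER NOT NULL "
    "REFERENCES analyzable(id) ON DELETE CASCADE)"
)

# One analyzable with three jobs, mirroring root, duplicate root, and child.
conn.execute("INSERT INTO analyzable (id) VALUES (1)")
conn.executemany(
    "INSERT INTO job (id, analyzable_id) VALUES (?, 1)", [(1,), (2,), (3,)]
)
jobs_before = conn.execute("SELECT COUNT(*) FROM job").fetchone()[0]

# Deleting the parent row removes every dependent job in one step.
conn.execute("DELETE FROM analyzable WHERE id = 1")
jobs_after = conn.execute("SELECT COUNT(*) FROM job").fetchone()[0]
```

Whether a single delete is safe for a deliberately corrupted treebeard tree in the actual test is something the test run would have to confirm.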
Hi @mlodic, the formatting issues are now fixed and the PR is ready for review. Could you please approve the workflow runs? @copilot
Hi @mlodic, I am extremely sorry about the failing backend test cases. This is my first PR; I am new, trying to adapt, and willing to learn. Please give me some time and I will fix things.
@RaviTeja799 please fix the CI issue.
207cd40 to 725ee9b
api_app/models.py
Outdated
        .order_by("pk")
        .first()
    )
    logger.error(
I would just log a warning here; we don't know how noisy this can be.
Hi @mlodic!
That is a valuable suggestion; thanks for the help and support. I will definitely follow your advice to avoid creating unnecessary noise in the logs. I've updated the code to use logger.warning instead of logger.error, as you suggested.
I also updated the new test case to reflect this change so the CI stays green. Thanks for the feedback!
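A test can pin the log level with `assertLogs`, so a later change from warning back to error would fail CI. The following is a minimal sketch of that idea, not the test from this PR: the `pick_root` helper and the logger name `api_app.models` are assumptions made for the example.

```python
import logging
import unittest

logger = logging.getLogger("api_app.models")  # assumed logger name


def pick_root(candidate_pks):
    """Hypothetical helper: return the lowest pk, warning when ambiguous."""
    if len(candidate_pks) > 1:
        logger.warning("Multiple roots found: %s", sorted(candidate_pks))
    return min(candidate_pks)


class PickRootLoggingTest(unittest.TestCase):
    def test_warns_on_multiple_roots(self):
        # assertLogs fails the test if nothing at WARNING or above is logged.
        with self.assertLogs("api_app.models", level="WARNING") as captured:
            root = pick_root([7, 3])
        self.assertEqual(root, 3)
        self.assertEqual(captured.records[0].levelname, "WARNING")

    def test_single_root_is_returned(self):
        # A healthy tree has exactly one candidate.
        self.assertEqual(pick_root([5]), 5)
```

Checking `levelname` explicitly is what ties the test to the warning level, since `assertLogs(level="WARNING")` alone would also accept an error record.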
Asking for another review from another maintainer: @fgibertoni @0ssigeno
Description
This PR addresses the thread-safety issue in `Job.get_root()` (Issue #3098). When multiple concurrent requests access the job tree structure, a `MultipleObjectsReturned` exception can occur. This fix replaces the previous silent workaround with a deterministic, logged fallback.
Type of change
Key Improvements
- Uses `.order_by("pk").first()` in the exception handler to ensure consistent results across all concurrent requests.
- Logs with `logger.error`, providing the Job PK and path for better data integrity monitoring.
- Corrects `self.objects` to `type(self).objects`, as model instances do not have an `objects` attribute.
- Avoids `select_for_update()` to prevent potential database deadlocks.
Technical Decision: Why NOT select_for_update()?
I have intentionally avoided `select_for_update()` to prevent potential database deadlocks in high-concurrency environments, especially given IntelOwl's use of multiple Celery workers. The locking approach:
- [...] `django-treebeard`'s tree modification operations).
This fix prioritizes deterministic ordering and explicit error logging to resolve the issue safely without risking infrastructure stability.
Testing
I've added four new test cases in `tests/api_app/test_models.py`:
- `test_get_root_returns_self_when_is_root`
- `test_get_root_returns_parent_for_child_job`
- `test_get_root_deterministic_ordering`
- `test_get_root_handles_multiple_roots_deterministically` (Simulates tree corruption to verify the fix).
How to Test
You can verify the fix by running these specific tests:
Checklist:
develop