Fix: prevent duplicate image overwriting when exporting projects (#8076) #9929
base: develop
names[(subset, name_base)] += 1
return osp.extsep.join([name_base, ext])

if task_id is not None:
This seems unnecessarily complex. Why not just append the task id if it exists and keep the original logic?
Sure, I'll update the codebase.
This PR combined with the description seems to be fully written by an LLM. If you write things like that, you should be reviewing it before forcing others to do so.

Sorry, it won't happen again. I'll write the code manually and send a PR.
zhiltsov-max left a comment
Thank you for sending the PR, please check the comments.
Please avoid changing the file directly; there is an explanatory comment at the beginning of the file.
def mangle_image_name(name: str, subset: str, names: defaultdict[tuple[str, str], int]) -> str:
name, ext = name.rsplit(osp.extsep, maxsplit=1)
def mangle_image_name(name: str, subset: str, names: defaultdict[tuple[str, str], int], task_id: Optional[int] = None) -> str:
def mangle_image_name(name: str, subset: str, names: defaultdict[tuple[str, str], int], task_id: Optional[int] = None) -> str:
def mangle_image_name(name: str, subset: str, names: defaultdict[tuple[str, str], int], *, task_id: int) -> str:
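The second signature puts task_id after a bare *, which makes it keyword-only and, with no default, required. A minimal sketch of what that means for callers; the function body here is a stand-in written for illustration, not the code from this PR:

```python
from collections import defaultdict


def mangle_image_name(
    name: str,
    subset: str,
    names: defaultdict[tuple[str, str], int],
    *,
    task_id: int,
) -> str:
    # Stand-in body (assumption): only the call signature matters here.
    return f"{subset}/{name} (task {task_id})"


names: defaultdict[tuple[str, str], int] = defaultdict(int)

# Passing task_id by keyword is accepted.
result = mangle_image_name("a.png", "train", names, task_id=7)

# Passing it positionally is rejected when the call is made.
try:
    mangle_image_name("a.png", "train", names, 7)
    positional_ok = True
except TypeError:
    positional_ok = False
```

Every call site then has to spell out task_id=..., so the value cannot be mixed up with the other positional arguments.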
if not names[(subset, mangled_base)]:
    names[(subset, name_base)] += 1
    names[(subset, mangled_base)] += 1
    return osp.extsep.join([mangled_base, ext])
This block can probably be merged with the while body below.
if not names[(subset, mangled_base)]:
    names[(subset, name_base)] += 1
    names[(subset, mangled_base)] += 1
It should probably be done in the while block below as well.
i = 1
while i < sys.maxsize:
    mangled_base = f"{name_base}_{names[(subset, name_base)]}"
This variable can discard the mangled_base defined above, effectively restoring the old behavior. Consider doing mangled_name = f"{mangled_base}_{names[(subset, name_base)]}" instead.
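To make the concern concrete, the sketch below compares the two ways of building the candidate name once a task suffix has been applied. The sample values (subset, name_base, task_id, and the counter state) are made up for illustration, and the -task_{task_id} suffix format is taken from the PR description:

```python
from collections import defaultdict

subset = "train"
name_base = "frame_0001"
task_id = 42

# Counter keyed by (subset, base name); assume the plain name was seen once.
names: defaultdict[tuple[str, str], int] = defaultdict(int)
names[(subset, name_base)] = 1

# Base name carrying the task suffix.
mangled_base = f"{name_base}-task_{task_id}"

# As in the diff: rebuilt from name_base, so the task part is lost.
overwritten = f"{name_base}_{names[(subset, name_base)]}"

# As suggested in the review: built on mangled_base, so the task part stays.
mangled_name = f"{mangled_base}_{names[(subset, name_base)]}"

print(overwritten)   # frame_0001_1
print(mangled_name)  # frame_0001-task_42_1
```

With the first form, two tasks that contain the same file name can still end up with the same exported name.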
Consider adding a test for this problem with export. It should probably be added in tests/python/rest_api/test_tasks.py.



Motivation and context
The issue addressed is #8076, which describes a problem where image names could conflict during dataset export when multiple images share the same name across tasks or jobs.
This caused overwriting or inconsistent naming during export.
This PR modifies the mangle_image_name function to handle duplicate image names more gracefully:
- If task_id is provided, the filename now includes -task_{task_id}.
- If the name is still a duplicate, a numeric suffix (_1, _2, etc.) is appended.
This ensures consistent and conflict-free exports, especially in multi-task environments.
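The naming behavior described here could look roughly like the sketch below. It is a simplified illustration under stated assumptions, not the code from this PR: the function name mangle_image_name_sketch is hypothetical, the task suffix is assumed to be applied before any numeric suffix, and the duplicate check is folded into a single loop.

```python
import os.path as osp
from collections import defaultdict
from typing import Optional


def mangle_image_name_sketch(
    name: str,
    subset: str,
    names: defaultdict[tuple[str, str], int],
    *,
    task_id: Optional[int] = None,
) -> str:
    # Like the original code, this assumes the name has an extension.
    name_base, ext = name.rsplit(osp.extsep, maxsplit=1)

    # Assumption: the task suffix is added first, whenever a task id is known.
    if task_id is not None:
        name_base = f"{name_base}-task_{task_id}"

    # Try the base name, then base_1, base_2, ... until one is unused.
    candidate = name_base
    i = 0
    while names[(subset, candidate)]:
        i += 1
        candidate = f"{name_base}_{i}"

    # Record the chosen name so later calls see it as taken.
    names[(subset, candidate)] += 1
    return osp.extsep.join([candidate, ext])


names: defaultdict[tuple[str, str], int] = defaultdict(int)
first = mangle_image_name_sketch("img.png", "train", names, task_id=1)
second = mangle_image_name_sketch("img.png", "train", names, task_id=2)
third = mangle_image_name_sketch("img.png", "train", names, task_id=1)

print(first)   # img-task_1.png
print(second)  # img-task_2.png
print(third)   # img-task_1_1.png
```

Same-named images from different tasks get distinct names from the task suffix alone; the numeric suffix only appears when the same name repeats within one task.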
How has this been tested?
Checklist
- develop branch
- I have updated the documentation accordingly
- I have added tests to cover my changes

License
Feel free to contact the maintainers if that's a concern.