Commit a53c47b

HF_HUB_KEEP_LOCK_FILES to prevent rare concurrency issue
The lock file could be removed just as the 2nd process gets the lock. The 3rd process then locks on a different lock file handle, although the lock path stays the same:

1. 1st proc gets the lock
2. 2nd proc waits for the lock
3. 1st proc releases the lock
4. 1st proc removes the lock file
5. 2nd proc gets the lock
6. 3rd proc creates a new lock file and gets the lock

Add ENV 'HF_HUB_KEEP_LOCK_FILES' to give users the option to keep the lock files and prevent this concurrency issue.
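The sequence above can be reproduced in a single process on a POSIX system. This is a minimal sketch, assuming the lock behaves like an advisory `fcntl.flock` on an open file. It is not the library's actual locking code, and the names `try_lock`, `simulate`, and `blob.lock` are hypothetical. Each `os.open` call stands in for a separate process.

```python
import fcntl
import os
import tempfile
from typing import Tuple


def try_lock(fd: int) -> bool:
    """Try to take an exclusive, non-blocking advisory lock on fd."""
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False


def simulate(keep_lock_file: bool) -> Tuple[bool, bool]:
    """Replay the commit message's sequence.

    Returns (second_got_lock, third_got_lock).
    """
    with tempfile.TemporaryDirectory() as tmp:
        lock_path = os.path.join(tmp, "blob.lock")  # hypothetical lock file name
        flags = os.O_RDWR | os.O_CREAT

        first = os.open(lock_path, flags)
        assert try_lock(first)               # 1st proc gets the lock
        second = os.open(lock_path, flags)   # 2nd proc opens the file and waits
        fcntl.flock(first, fcntl.LOCK_UN)    # 1st proc releases the lock
        os.close(first)
        if not keep_lock_file:
            os.remove(lock_path)             # 1st proc removes the lock file

        second_locked = try_lock(second)     # 2nd proc gets the lock
        third = os.open(lock_path, flags)    # 3rd proc opens (or recreates) the path
        third_locked = try_lock(third)       # succeeds only if it is a new file

        os.close(second)
        os.close(third)
        return second_locked, third_locked
```

When the file is removed, the 2nd and 3rd handles refer to different files behind the same path, so both exclusive locks succeed. When the file is kept, the 3rd handle refers to the file the 2nd handle already locked, and the non-blocking attempt fails.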
1 parent adef26d commit a53c47b

3 files changed, +17 -4 lines changed

docs/source/en/package_reference/environment_variables.md

Lines changed: 6 additions & 0 deletions
@@ -74,6 +74,12 @@ small files will be duplicated to ease user experience while bigger files are sy

 For more details, see the [download guide](../guides/download#download-files-to-local-folder).

+### HF_HUB_KEEP_LOCK_FILES
+
+Lock files are used to ensure the downloading in parallel won't overwrite. By default, the lock file will be deleted
+automatically. However, the deletion would cause concurrency issues in rare cases. Set this to `True` to prevent this
+kind of concurrency issue.
+
 ## Boolean values

 The following environment variables expect a boolean value. The variable will be considered

src/huggingface_hub/constants.py

Lines changed: 5 additions & 0 deletions
@@ -130,6 +130,11 @@ def _as_int(value: Optional[str]) -> Optional[int]:
     _as_int(os.environ.get("HF_HUB_LOCAL_DIR_AUTO_SYMLINK_THRESHOLD")) or 5 * 1024 * 1024
 )

+# Lock files are used to ensure the downloading in parallel won't overwrite. By default, the lock file will be deleted
+# automatically. However, the deletion would cause concurrency issues in rare cases. Set this to TRUE to prevent this
+# kind of concurrency issue.
+HF_HUB_KEEP_LOCK_FILES: bool = _is_true(os.environ.get("HF_HUB_KEEP_LOCK_FILES"))
+
 # List frameworks that are handled by the InferenceAPI service. Useful to scan endpoints and check which models are
 # deployed and running. Since 95% of the models are using the top 4 frameworks listed below, we scan only those by
 # default. We still keep the full list of supported frameworks in case we want to scan all of them.
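The diff does not show the body of `_is_true`, so the following is only a sketch of how a boolean environment variable of this kind might be parsed. The helper names `parse_bool_env` and `read_keep_lock_files` and the set of accepted values are assumptions for illustration, not the library's confirmed implementation.

```python
import os
from typing import Mapping, Optional

# Assumed truthy strings, compared case-insensitively. The library's real list may differ.
TRUE_VALUES = {"1", "ON", "YES", "TRUE"}


def parse_bool_env(value: Optional[str]) -> bool:
    """Interpret an environment variable string as a boolean (unset means False)."""
    if value is None:
        return False
    return value.upper() in TRUE_VALUES


def read_keep_lock_files(environ: Mapping[str, str] = os.environ) -> bool:
    """Read the flag once, the way a module-level constant would at import time."""
    return parse_bool_env(environ.get("HF_HUB_KEEP_LOCK_FILES"))
```

Because the constant in the diff is assigned at module level, the variable would need to be set before `huggingface_hub` is imported for the new value to take effect.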

src/huggingface_hub/file_download.py

Lines changed: 6 additions & 4 deletions
@@ -29,6 +29,7 @@
     ENDPOINT,
     HF_HUB_DISABLE_SYMLINKS_WARNING,
     HF_HUB_ENABLE_HF_TRANSFER,
+    HF_HUB_KEEP_LOCK_FILES,
     HUGGINGFACE_CO_URL_TEMPLATE,
     HUGGINGFACE_HEADER_X_LINKED_ETAG,
     HUGGINGFACE_HEADER_X_LINKED_SIZE,
@@ -1475,10 +1476,11 @@ def _resumable_file_manager() -> Generator[io.BufferedWriter, None, None]:
                     _chmod_and_replace(temp_file.name, local_dir_filepath)
                 pointer_path = local_dir_filepath  # for return value

-    try:
-        os.remove(lock_path)
-    except OSError:
-        pass
+    if not HF_HUB_KEEP_LOCK_FILES:
+        try:
+            os.remove(lock_path)
+        except OSError:
+            pass

     return pointer_path
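The guarded cleanup in this hunk can be read in isolation as a small helper. This is a sketch only: `cleanup_lock_file` is a hypothetical name, and the flag is passed as a parameter here rather than read from the module-level constant.

```python
import os


def cleanup_lock_file(lock_path: str, keep_lock_files: bool) -> bool:
    """Remove lock_path unless asked to keep it. Return True if a file was removed."""
    if keep_lock_files:
        return False
    try:
        os.remove(lock_path)
    except OSError:
        # The file may already be gone (for example, removed by another process).
        return False
    return True
```

Swallowing `OSError` mirrors the original code: a missing lock file is not treated as an error, because another process may have removed it first.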
14841486
