🚀 feature request
Relevant Rules
py_binary, py_test (from @rules_python)
Description
When building Python binaries or tests with Bazel (specifically with py_binary and py_test), we encounter an "Argument list too long" error during the build.
This happens when a target depends on a very large number of files, particularly files from large Python libraries managed by pip_parse (e.g., boto3, msgraph-sdk-python).
The root cause seems to be that Bazel invokes the zipper tool with every file path to be packaged passed directly as a command-line argument.
rules_python/python/private/py_executable.bzl, lines 891 to 957 in 9429ae6:

```starlark
def _create_zip_file(ctx, *, output, original_nonzip_executable, zip_main, runfiles):
    """Create a Python zipapp (zip with __main__.py entry point)."""
    workspace_name = ctx.workspace_name
    legacy_external_runfiles = _py_builtins.get_legacy_external_runfiles(ctx)

    manifest = ctx.actions.args()
    manifest.use_param_file("@%s", use_always = True)
    manifest.set_param_file_format("multiline")

    manifest.add("__main__.py={}".format(zip_main.path))
    manifest.add("__init__.py=")
    manifest.add(
        "{}=".format(
            _get_zip_runfiles_path("__init__.py", workspace_name, legacy_external_runfiles),
        ),
    )
    for path in runfiles.empty_filenames.to_list():
        manifest.add("{}=".format(_get_zip_runfiles_path(path, workspace_name, legacy_external_runfiles)))

    def map_zip_runfiles(file):
        if file != original_nonzip_executable and file != output:
            return "{}={}".format(
                _get_zip_runfiles_path(file.short_path, workspace_name, legacy_external_runfiles),
                file.path,
            )
        else:
            return None

    manifest.add_all(runfiles.files, map_each = map_zip_runfiles, allow_closure = True)

    inputs = [zip_main]
    if _py_builtins.is_bzlmod_enabled(ctx):
        zip_repo_mapping_manifest = ctx.actions.declare_file(
            output.basename + ".repo_mapping",
            sibling = output,
        )
        _py_builtins.create_repo_mapping_manifest(
            ctx = ctx,
            runfiles = runfiles,
            output = zip_repo_mapping_manifest,
        )
        manifest.add("{}/_repo_mapping={}".format(
            _ZIP_RUNFILES_DIRECTORY_NAME,
            zip_repo_mapping_manifest.path,
        ))
        inputs.append(zip_repo_mapping_manifest)

    for artifact in runfiles.files.to_list():
        # Don't include the original executable because it isn't used by the
        # zip file, so no need to build it for the action.
        # Don't include the zipfile itself because it's an output.
        if artifact != original_nonzip_executable and artifact != output:
            inputs.append(artifact)

    zip_cli_args = ctx.actions.args()
    zip_cli_args.add("cC")
    zip_cli_args.add(output)

    ctx.actions.run(
        executable = ctx.executable._zipper,
        arguments = [zip_cli_args, manifest],
        inputs = depset(inputs),
        outputs = [output],
        use_default_shell_env = True,
        mnemonic = "PythonZipper",
        progress_message = "Building Python zip: %{label}",
    )
```
This leads to the argument list exceeding the operating system's ARG_MAX limit.
This error makes our Bazel builds unstable and undermines the reliability of our CI/CD pipelines for large Python projects.
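To get a feel for how quickly a large manifest can approach ARG_MAX, here is a rough Python sketch that estimates the space a command line would need and compares it with the limit reported by the OS. The helper names (`estimated_exec_bytes`, `exceeds_arg_max`), the 8-byte pointer size, and the generated paths are assumptions made for illustration; the kernel's real accounting differs between platforms, so treat the result as an approximation only.

```python
import os

POINTER_SIZE = 8  # assumption: 64-bit platform


def estimated_exec_bytes(args, env):
    """Roughly estimate the bytes needed to hold argv and envp for an exec call."""
    # Each argument costs its encoded length, a NUL terminator, and a pointer.
    arg_bytes = sum(len(os.fsencode(a)) + 1 + POINTER_SIZE for a in args)
    # Each environment entry is stored as "KEY=VALUE\0" plus a pointer.
    env_bytes = sum(
        len(os.fsencode(k)) + len(os.fsencode(v)) + 2 + POINTER_SIZE
        for k, v in env.items()
    )
    return arg_bytes + env_bytes


def exceeds_arg_max(args, env, arg_max=None):
    """Return True if the estimate is larger than ARG_MAX (queried via sysconf on POSIX)."""
    if arg_max is None:
        arg_max = os.sysconf("SC_ARG_MAX")
    return estimated_exec_bytes(args, env) > arg_max


if __name__ == "__main__":
    # Hypothetical file list standing in for a large pip-managed dependency tree.
    paths = [f"external/pip_deps/site-packages/pkg_{i}/module_{i}.py" for i in range(100_000)]
    # Entries use the "zip_path=file_path" shape seen in the manifest code above.
    manifest = [f"runfiles/{p}={p}" for p in paths]
    print("estimated bytes:", estimated_exec_bytes(manifest, dict(os.environ)))
    print("exceeds ARG_MAX:", exceeds_arg_max(manifest, dict(os.environ)))
```

Running a check like this against the real runfiles list of a failing target could help confirm whether the command line is actually what exceeds the limit.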
Describe the solution you'd like
We propose enhancing rules_python to always pass arguments to the zipper tool via a temporary response file, rather than directly on the command line.
Upon inspecting zipper's zip_main.cc source code, it appears to support reading arguments from a file using the @ syntax (as indicated by logic to process arguments starting with @).
By consistently utilizing this response file capability, rules_python can entirely bypass OS ARG_MAX limitations.
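The response-file idea can be sketched in a few lines of Python. This is not zipper's actual implementation, only an illustration of the general `@file` convention under stated assumptions: the function names `write_param_file` and `expand_response_files` are hypothetical, entries are written one per line (similar in spirit to Bazel's "multiline" param file format), and no quoting or escaping is handled.

```python
import tempfile
from pathlib import Path


def write_param_file(entries, path):
    """Write one entry per line so the command line only needs '@path'."""
    Path(path).write_text("".join(entry + "\n" for entry in entries), encoding="utf-8")


def expand_response_files(argv):
    """Replace each '@path' argument with the non-empty lines of that file."""
    expanded = []
    for arg in argv:
        if arg.startswith("@") and len(arg) > 1:
            lines = Path(arg[1:]).read_text(encoding="utf-8").splitlines()
            expanded.extend(line for line in lines if line)
        else:
            expanded.append(arg)
    return expanded


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        param_file = Path(tmp) / "manifest.params"  # hypothetical file name
        entries = [f"runfiles/pkg/module_{i}.py=pkg/module_{i}.py" for i in range(50_000)]
        write_param_file(entries, param_file)
        # The command line stays three arguments long regardless of manifest size.
        argv = ["cC", "app.zip", f"@{param_file}"]
        print(len(argv), "arguments on the command line")
        print(len(expand_response_files(argv)), "arguments after expansion")
```

The point of the design is that the number of bytes passed to the OS at process start no longer grows with the number of packaged files; only the file on disk does.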