Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support path references external to sources for Docker builds in requires and requirement_installer_args #2081

Open
sarayourfriend opened this issue Dec 11, 2024 · 5 comments
Labels
enhancement New features, or improvements to existing features. linux The issue relates Linux support.

Comments

@sarayourfriend
Copy link
Contributor

sarayourfriend commented Dec 11, 2024

What is the problem or limitation you are having?

Once #2059 is merged and released, the requirement_installer_args option will be available. Local, relative paths used in the option (e.g., as an argument to --find-links) are automatically transformed into absolute paths, which enables seamless use for all build targets that build on the host. Docker, however, does not build on the host, so these path references are meaningless to it.

This also affects usages of local paths in requires, for which Briefcase handles local paths in a manner similar to requirement_installer_args.

Describe the solution you'd like

Add support for these by mounting referenced local paths into the container. Any solution for this will require tracking any local references found in requires or requirement_installer_args and exposing them in such a way that the Docker integration is able to include those mounts in calls to commands in the build container.

I can see two ways of actually mounting them:

  1. Mount the host paths to a location like /opt/{mount-name} in the container, and re-write any usage of those paths in the command arguments to reference the location in the container. Re-writing the arguments could be risky, but Briefcase already does this when it converts relative paths to absolute paths anyway, so it doesn't seem like this would really be too unique.
  2. Mount the host paths to the same location in the container, which avoids needing to re-write the actual command that is executed. It will appear as though the host's path is explicitly available to the container. However, this might also cause a subtle confusion in error messages. For example, if a command ends with an error about not being able to find a particular referenced path, and that path appears to be a host path rather than one obviously in the container's context, the path could present as a red herring, when the underlying issue is actually that the path doesn't exist anywhere at all.

The first option seems better to me, even though it's yet another level of argument re-writing. If the re-writing is implemented in DockerAppContext._dockerize_args itself, then it could work automatically for any path referenced on the host. In other words, it may be possible to implement this in such a way that it solves it for any command running on the host that happens to use a local path reference, rather than only for requires and requirement_installer_args. On the other hand, if more care is desired, a solution scoped specifically to just those two arguments should be explored. This is perhaps best decided in PR review, particularly if an answer isn't immediately obvious.

Describe alternatives you've considered

See above.

Additional context

A temporary workaround for requirement_installer_args is to avoid the path transformation in that option by using the = style for the argument (e.g., --find-links=./wheels) and then adding wheels to sources and cleanup_paths. This is just a workaround and shouldn't be relied on for long-term usage.

@sarayourfriend sarayourfriend added the enhancement New features, or improvements to existing features. label Dec 11, 2024
@freakboy3742 freakboy3742 added the linux The issue relates Linux support. label Dec 11, 2024
@sarayourfriend
Copy link
Contributor Author

The approach I suggest of transforming paths in dockerize_args is non-viable. Details in this comment: #2082 (comment)

I'll explore a different approach that constrains the transformation to only inputs from requires and requirement_installer_args.

@sarayourfriend
Copy link
Contributor Author

Exploring this a bit more today and wanted to share some observations/notes that will inform whatever solution gets implemented.

I've been revisiting the approach of doing this in dockerize_args, but there's really no where it's safe to do. Even if doing it in DockerAppContext and only mapping arguments that match relative local references in requires or requirement_installer_args is unsafe, because the potential for a collision still exists. The only way I can think to avoid that collision in dockerize_args is to introspect on args and see whether the call is a pip install, in which case Briefcase can say that the paths should be mapped, otherwise, don't try to do any mapping. I really don't like that approach. It's too implicit, and would be pretty easy to miss even reading the code... and further spreads the concern of pip install across Briefcase, when ideally it is becoming more constrained (#596).

Considering something more severe, then, I wonder if making "run the requirement installer" part of the app_context's concern altogether would be better. The default app_context would just call pip install... the way CreateCommand does today. DockerAppContext would build the mount map, transform arguments as needed, and then call through to app_context.run. Another app context could be introduced to handle the Flatpak and Android cases where requirements.txt and pip-options.txt need to be written. That would decouple all these concerns from CreateCommand. It doesn't help DevCommand much, but it currently doesn't deal with local paths at all (it probably shouldn't need to), so I don't think it's a blocker to extraction of "run the installer" into app context.

@freakboy3742 @mhsmith do y'all have any opinions on this or helpful insights?

@freakboy3742
Copy link
Member

I've been revisiting the approach of doing this in dockerize_args, but there's really no where it's safe to do. Even if doing it in DockerAppContext and only mapping arguments that match relative local references in requires or requirement_installer_args is unsafe, because the potential for a collision still exists.

I'm not sure I fully understand the risk of collision that you're envisaging. If there's a local file reference to /foo/bar/whiz in requires or requirement_installer_args, that can be mapped to <some_safe_path>/requirements_mounts/path1 - i.e., make a specific "mount point" where all "requires" mounts go inside the container, with the name of the mount point being uniquely assigned/generated as part of the mapping process. We can get fancy with the mapping name (using something more meaningful than path1), but even if it's a randomly assigned hash, it should work and be unique.

Is there a failure mode I'm not seeing here?

The only way I can think to avoid that collision in dockerize_args is to introspect on args and see whether the call is a pip install,

Yeah - completely agreed that's a change in the wrong direction.

Considering something more severe, then, I wonder if making "run the requirement installer" part of the app_context's concern altogether would be better. The default app_context would just call pip install... the way CreateCommand does today. DockerAppContext would build the mount map, transform arguments as needed, and then call through to app_context.run. Another app context could be introduced to handle the Flatpak and Android cases where requirements.txt and pip-options.txt need to be written. That would decouple all these concerns from CreateCommand. It doesn't help DevCommand much, but it currently doesn't deal with local paths at all (it probably shouldn't need to), so I don't think it's a blocker to extraction of "run the installer" into app context.

I think understand what you're driving at here... I'm not completely sure I see how this helps this problem with Docker path rewriting, though.

However, I can see how it would help with the "use conda"/"use uv". Those use cases will pretty much require factoring out the "run the requirement installer" part into a standalone interface that can then be selected based on a setting, instantiated as required, then passed into whatever command needs to "do the installing". So - this is likely a refactor that is worthwhile; if you can see a way that it improves the docker path mapping case, then all the better.

@sarayourfriend
Copy link
Contributor Author

The trouble with collision is if a relative local path that is intended to refer to something in the container (i.e., relative to workdir) but also happens to exist relative to the app's base path, and probably for some command other than pip install. I'm not entirely sure what that would be, but maybe something like an output path to a command to generate static assets that runs in the container.

The way the refactor I've proposed helps is by making explicit when the requirement installer is running, so that the app context can safely map paths, compared to any generic command passed to app_context.run where a relative local path might be intended to refer to a path in the container, rather than on the host. So path mapping would only happen for requirement installer, and other commands would be left alone.

Maybe this use case doesn't exist? If not, then mapping any relative local path ever passed to dockerize_args does technically work and is a relatively small uplift compared to the refactor.

If there's a local file reference to /foo/bar/whiz

I've realised I was assuming that only relative local paths should be mapped. But actually, if a local absolute path in requires should be mapped, then the risk of collision is even more likely and would happen with pip install itself. If /foo/bar/whiz exists in the docker container and the host, should the container's version be used? Is there a way for a user to indicate that intention? For example, if a user is always building in Docker to target a reproducible build, and is using a custom base image that comes pre-loaded with a dependency at a path in the container, but which also exists on their host, which version is correct to use? I'd think the container's version, but I could see it going either way depending on the user's intention.

Only mapping local relative paths seems to disambiguate this a bit.

Do these use cases even exist, though? Or am I worrying about edge cases that wouldn't ever actually happen?

@freakboy3742
Copy link
Member

The trouble with collision is if a relative local path that is intended to refer to something in the container (i.e., relative to workdir) but also happens to exist relative to the app's base path, and probably for some command other than pip install. I'm not entirely sure what that would be, but maybe something like an output path to a command to generate static assets that runs in the container.

I guess that's possible - but it also seems like something we can mitigate with documentation. If we explicitly document that certain locations inside the container are quarantined (e.g., /<some path>/mounts), then no command inside the container should be referencing that directory.

The way the refactor I've proposed helps is by making explicit when the requirement installer is running, so that the app context can safely map paths, compared to any generic command passed to app_context.run where a relative local path might be intended to refer to a path in the container, rather than on the host. So path mapping would only happen for requirement installer, and other commands would be left alone.

Ok - that makes sense. In terms of ordering the work, it almost sounds like it that refactor could be performed independently of this "fix the Docker paths" task.

Maybe this use case doesn't exist? If not, then mapping any relative local path ever passed to dockerize_args does technically work and is a relatively small uplift compared to the refactor.

If /foo/bar/whiz exists in the docker container and the host, should the container's version be used? Is there a way for a user to indicate that intention?

I'd argue this is the case we explicitly document as a not-supported case. If /<some path>/mounts is documented as being required be a mounted path, then there's no possibility of collision. And if that path is something inside the briefcase project directory, then we can be reasonably safe that the mounted path won't exist anyway.

Do these use cases even exist, though? Or am I worrying about edge cases that wouldn't ever actually happen?

Never say never :-) but I think I'd be comfortable putting this in the "known constraints of reality" bucket.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New features, or improvements to existing features. linux The issue relates Linux support.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants