symlinking of crates causes issues with macos sandbox #482
Comments
Hi @j-baker thanks for the report! As of today, downloading and unpacking crates happens in two derivations which is undoubtedly contributing to the increase in total derivation count. Sadly we cannot fold these into a single step because unpacking the tarball would result in a different hash (i.e. not the one in One thing we could do, is change Possible workaround ideas in the meantime:
Hi, thanks for the reply. I realised I was a little unclear in my previous message. Here is what my understanding is. In step 1, Crane downloads crates. Each crate gets a derivation. The input store paths for this derivation are ~=
In step 2, Crane extracts these crates:
In step 3, Crane 'vendors' these crates per registry.
In step 4, Crane builds the inputs cargo dir.
And from this point on, actual cargo commands run.

When the sandbox for each build is constructed, it is granted access to all transitive dependencies, as these are the totality of what might be depended on. On Linux this means bind mounting the paths into the sandbox. I don't believe the two-phase download and extract contributes to the problem, because no transitive dependency is passed on. The problem I believe I'm facing is that, with sufficiently many input crates from a registry, the number of store paths that the output depends on becomes much too large for MacOS.

One brute force 'fix' to this problem is j-baker@2087e8b. This is not cost free (it converts symlinking of directories into a directory traversal), but it is a one-liner, so it has that going for it. This only works because, while my total sandbox size is too large, the sandbox size contributed by any single registry is not over the size limit for me, right now.

There are many levels of sophistication one could apply to reduce the likelihood of hitting this problem without adding runtime cost; however, many of them likely lead to unnecessary complexity. My sense is that one workable solution (probably a few lines of code on top of what currently exists) is:
This would therefore kind of combine the extraction and vendoring steps. It would reduce the number of inputs to any one derivation.
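To make the counting argument concrete, here is a minimal sketch in Python that models store-path references as a graph and compares the transitive closure of a cargo build when the vendor directory symlinks to per-crate store paths versus when it copies them. The node names (`crate-N`, `vendor-registry`, `cargo-vendor-dir`, `cargo-build`) are hypothetical stand-ins for illustration, not Crane's actual derivation names, and the model assumes a single registry.

```python
from typing import Dict, List, Set


def closure(refs: Dict[str, List[str]], root: str) -> Set[str]:
    """Return every path reachable from root, including root itself."""
    seen: Set[str] = set()
    stack = [root]
    while stack:
        path = stack.pop()
        if path in seen:
            continue
        seen.add(path)
        stack.extend(refs.get(path, []))
    return seen


def build_graph(num_crates: int, symlink: bool) -> Dict[str, List[str]]:
    """Hypothetical reference graph for one registry."""
    crates = [f"crate-{i}" for i in range(num_crates)]
    refs: Dict[str, List[str]] = {c: [] for c in crates}
    # With symlinks, the vendor output refers to every extracted crate path.
    # With copies, the crate contents live inside the vendor output itself.
    refs["vendor-registry"] = crates if symlink else []
    refs["cargo-vendor-dir"] = ["vendor-registry"]
    refs["cargo-build"] = ["cargo-vendor-dir"]
    return refs


symlinked = closure(build_graph(700, symlink=True), "cargo-build")
copied = closure(build_graph(700, symlink=False), "cargo-build")
print(len(symlinked), len(copied))  # 703 3
```

In this toy model the symlinked layout pulls one path per crate into the closure of every downstream build, while the copied layout keeps the closure constant at the cost of duplicating crate contents on disk.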
Fixes ipetkov#482.

MacOS has trouble with derivations which (directly or transitively) have many buildInputs. Crane at present creates a build structure in which a given cargo command will transitively depend on numCrates nix store paths. This means that Crane fails to build projects with over about 600 crate dependencies on MacOS if the sandbox is enabled.

This MR utilises a tiering approach to improve this. Within each registry, every crate is assigned to a shard based on the hash of the crate name. If there are <32 crates in a registry there is one shard, if <2048 there are 16 shards, otherwise 256. Crates are extracted directly into these shard derivations rather than being symlinked.

What this means is:

1. Crane will not create a vendoring derivation with many inputs unless a project has a truly crazy number of dependencies.
2. No downstream cargo derivation will have many inputs either.
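The tiering rule above can be sketched in Python as follows. This is an illustration only: the thresholds come from the commit message, but the choice of SHA-256 and of taking the first digest byte modulo the shard count is an assumption made for the example, and the function names are hypothetical. The actual change is implemented in Nix and may hash differently.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def shard_count(num_crates: int) -> int:
    """Pick the number of shards for a registry from its crate count."""
    if num_crates < 32:
        return 1
    if num_crates < 2048:
        return 16
    return 256


def shard_for(crate_name: str, shards: int) -> int:
    """Map a crate name to a shard index (assumed hash: SHA-256, first byte)."""
    digest = hashlib.sha256(crate_name.encode("utf-8")).digest()
    # 1, 16 and 256 all divide 256, so one byte spreads evenly over the shards.
    return digest[0] % shards


def assign(crates: List[str]) -> Dict[int, List[str]]:
    """Group the crates of one registry by shard index."""
    shards = shard_count(len(crates))
    layout: Dict[int, List[str]] = defaultdict(list)
    for name in crates:
        layout[shard_for(name, shards)].append(name)
    return dict(layout)


crates = [f"crate-{i}" for i in range(700)]
layout = assign(crates)
# Each shard derivation takes only its own crates as inputs, and the vendor
# derivation takes at most shard_count(...) shard outputs as inputs.
print(len(layout), max(len(names) for names in layout.values()))
```

With 700 crates this yields at most 16 shard derivations, so both the per-shard input count and the number of paths visible to downstream cargo derivations stay well below the crate count.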
Hi!
I build on MacOS with the Nix sandbox enabled. This is because I run a MacOS build worker which pushes into a company-shared cache; I want to isolate builds so as to make it as hard as possible for one malicious user to poison the cache.
The MacOS sandbox definition that Nix uses contains all the store paths, and has a relatively low max size which clearly is somewhere in the 500-800 nix store paths region.
I have a project with around 700 crate dependencies. Because the cargo vendor process symlinks crates, every cargo build derivation depends on more store paths than there are dependency crates, which places a limit on the number of crates one can depend on.
I'm wondering if this project would consider copying crates instead of symlinking? Happy to make an MR which makes the change.
The downside would be slightly greater disk usage, the benefit would be that bigger projects can be built on Mac with sandboxing!