Rebuild for pytorch 2.8 #125
Conversation
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR (
This is running into issues arising from conda-forge/cudnn-feedstock#124
@h-vetinari isn't that cudnn reference about segfaults? These seem to be failing because deps fail to install.
Yes, because the pytorch builds have a cudnn constraint to avoid those segfaults, which is causing the resolution errors here. |
@conda-forge-admin please restart ci |
The cudnn pin was removed in conda-forge/pytorch-cpu-feedstock#425, so now the failures are unrelated. Nevertheless, could it make sense to limit the
Force-pushed from 9c7a1c5 to 000d3de
Added a step to remove conflicting Torch headers during cross-compilation.
recipe/build.sh
Outdated
# remove build prefix headers that conflict with cross compilation
rm -rf $BUILD_PREFIX/venv/lib/python3.*/site-packages/torch/include/torch/csrc/api/include/
As crazy as this sounds, I think this is actually a great solution.
I took a crack at solving this issue, and every path ended up blocked...
- Not having libtorch in build deps (still in host deps):
  OSError: $BUILD_PREFIX/venv/lib/python3.12/site-packages/torch/lib/libtorch_global_deps.so: cannot open shared object file: No such file or directory
- Not having libtorch in host deps (still in build deps):
  $BUILD_PREFIX/bin/../lib/gcc/aarch64-conda-linux-gnu/14.3.0/../../../../aarch64-conda-linux-gnu/bin/ld: cannot find -ltorch_cuda: No such file or directory
  $BUILD_PREFIX/bin/../lib/gcc/aarch64-conda-linux-gnu/14.3.0/../../../../aarch64-conda-linux-gnu/bin/ld: skipping incompatible $BUILD_PREFIX/venv/lib/python3.12/site-packages/torch/lib/libtorch_cuda.so when searching for -ltorch_cuda
  $BUILD_PREFIX/bin/../lib/gcc/aarch64-conda-linux-gnu/14.3.0/../../../../aarch64-conda-linux-gnu/bin/ld: skipping incompatible $BUILD_PREFIX/lib/libtorch_cuda.so when searching for -ltorch_cuda
- Having libtorch in both:
  error: redefinition of 'class torch::OrderedDict<Key, Value>'
Unless we split libtorch into libtorch and libtorch-dev (with the latter only including headers), I'm not sure how else we can solve this...
@h-vetinari are you ok with this?
I was travelling and didn't see this before it got merged. I think it's not a lasting solution TBH, for a couple of reasons:
- include/torch/csrc/api/include/ only has headers, no implementation files; those headers should be cross-platform
- they may not have correct guards against re-inclusion

Whatever or whoever is looking up those includes needs to check in $PREFIX/include first; then the problem should go away AFAIU.
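The "check $PREFIX/include first" idea could be sketched as a small build-script tweak that shadows the duplicate headers instead of deleting them. This is a hypothetical illustration, not the feedstock's actual fix: $PREFIX and $BUILD_PREFIX follow conda-build's conventions, and the specific flag choice (-isystem) is my assumption.

```shell
# Hypothetical sketch: put the host prefix's include directory ahead of
# anything else in the compiler's header search, so the stray copies
# under $BUILD_PREFIX's vendored torch tree never get picked up.
# Flag choice (-isystem) is an assumption, not the merged solution.
PREFIX="${PREFIX:-/opt/conda-host}"   # host env, per conda-build convention
export CXXFLAGS="-isystem ${PREFIX}/include ${CXXFLAGS:-}"
echo "${CXXFLAGS}"
```

Whether this actually resolves the redefinition depends on how pytorch's build generates its include paths; if it hard-codes the vendored tree, shadowing alone won't help.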
Ok. Happy to look into it. But it may take until the new year. Things are really starting to pile up for me.
python build-locally.py --debug
[...]
rattler-build currently doesn't support debug mode
This is sadly impossible to debug now....
I think your analysis is OK, but it ignores the fact that someone is adding both paths.
Thus there will be one definition from the ${PREFIX} inclusion and the other from the ${CONDA_PREFIX}.
I'm also not ignoring the fact that many do actively decide to make certain headers NOT cross-platform, and not compatible across compilation options (HDF5, for example).
The header in question has the right #pragma guard:
https://github.com/pytorch/pytorch/blob/v2.8.0/torch/csrc/api/include/torch/ordered_dict.h
It's a little tiring to debug pytorch's lack of willingness to accommodate cross-platform builds.
A one-line patch, pytorch/pytorch#137084, took about 8 months to merge, even after it was agreed that it was a good idea.
I think rm -rf for certain header files is acceptable in many situations.
This package also blocks other migrations, so let's use this as a small boost to lessen our immediate backlog.
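For reference, the header-removal workaround being defended here can be sketched with a cross-compilation gate. The CONDA_BUILD_CROSS_COMPILATION check is my assumption (the merged snippet runs unconditionally), and a throwaway directory stands in for the real $BUILD_PREFIX so the sketch is safe to run anywhere.

```shell
# Sketch of the workaround: drop the duplicate C++ API headers that
# libtorch vendors into the build prefix, so only the host copy is seen.
# Gating on CONDA_BUILD_CROSS_COMPILATION is an assumption on my part;
# the merged recipe/build.sh runs the removal unconditionally.
BUILD_PREFIX="${BUILD_PREFIX:-$(mktemp -d)}"   # demo stand-in for conda-build's build env
mkdir -p "${BUILD_PREFIX}"/venv/lib/python3.12/site-packages/torch/include/torch/csrc/api/include
if [ "${CONDA_BUILD_CROSS_COMPILATION:-1}" = "1" ]; then
  # glob matches whichever python3.x minor version the build env uses
  rm -rf "${BUILD_PREFIX}"/venv/lib/python3.*/site-packages/torch/include/torch/csrc/api/include/
fi
```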
Hi! This is the friendly conda-forge automerge bot! I considered the following status checks when analyzing this PR:
Thus the PR was passing and merged! Have a great day!
* updated v2.2.6.post3
* MNT: Re-rendered with conda-smithy 3.52.3 and conda-forge-pinning 2025.10.08.21.44.2
* Re-enable import checks
  Builds now use pytorch=2.8.0 which pin to triton=3.4.0 at https://github.com/conda-forge/pytorch-cpu-feedstock/blob/034ea6435e4a6c60add135a388b65ee43ef11165/recipe/meta.yaml#L19, so imports should work without a GPU.
* Rebuild for pytorch 2.8
* Remove build prefix headers that conflict with cross compilation
  Hack adapted from conda-forge/torchvision-feedstock#125 (comment)
* Remove .postX suffix from recipe version name
  Adapted from conda-forge/flash-attn-feedstock@9a8aaac#diff-ad6aed43f0cf479348ccccec8e7185c4b70582192329f11908dd22a731771702R8-R10

Co-authored-by: Wei Ji <23487320+weiji14@users.noreply.github.com>
This PR has been triggered in an effort to update pytorch28.
Notes and instructions for merging this PR:
Please note that if you close this PR we presume that the feedstock has been rebuilt, so if you are going to perform the rebuild yourself, don't close this PR until your rebuild has been merged.
If this PR was opened in error or needs to be updated, please add the bot-rerun label to this PR. The bot will close this PR and schedule another one. If you do not have permissions to add this label, you can use the phrase @conda-forge-admin, please rerun bot in a PR comment to have the conda-forge-admin add it for you.

This PR was created by the regro-cf-autotick-bot. The regro-cf-autotick-bot is a service to automatically track the dependency graph, migrate packages, and propose package version updates for conda-forge. Feel free to drop us a line if there are any issues!

This PR was generated by https://github.com/regro/cf-scripts/actions/runs/18191818574 - please use this URL for debugging.