DKMS 3.1.4 breaks the no-autoinstall behaviour and returns non-functional parameters during compilation. #491

Open
wildtruc opened this issue Feb 8, 2025 · 11 comments

Comments

@wildtruc

wildtruc commented Feb 8, 2025

In this example, we are talking about Nvidia drivers, with both the open and proprietary modules precompiled in the DKMS tree under /var/lib.
For this to work, both drivers must be installed in different directories, with the open version renamed:

  • /var/lib/dkms/nvidia
  • /var/lib/dkms/open-nvidia

To prevent the proprietary driver from being auto-installed at kernel update and overwriting open-nvidia, the empty config file /etc/dkms/no-autoinstall is mandatory, and the option AUTOINSTALL=yes is set in the dkms conf file of the driver currently in use and unset in the unused one.
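The toggling described above could be sketched as follows. This is only an illustration under stated assumptions: the helper name set_autoinstall and the scratch files are hypothetical, and it is not taken from the author's actual script. It works on copies in a temporary directory rather than on real dkms.conf files.

```shell
#!/bin/sh
# Hypothetical helper: set or unset AUTOINSTALL in one dkms.conf-style file.
# $1 = path to the conf file, $2 = "yes" to set the option, anything else to unset it.
set_autoinstall() {
    conf=$1
    tmp=$(mktemp) || return 1
    # Drop any existing AUTOINSTALL line; grep exits non-zero when nothing is left.
    grep -v '^AUTOINSTALL=' "$conf" > "$tmp" || true
    if [ "$2" = "yes" ]; then
        echo 'AUTOINSTALL="yes"' >> "$tmp"
    fi
    # Overwrite in place so the original file keeps its permissions.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Demonstration on scratch files (not the real conf files).
demo=$(mktemp -d)
printf 'PACKAGE_NAME="nvidia"\nAUTOINSTALL="yes"\n' > "$demo/nvidia.conf"
printf 'PACKAGE_NAME="open-nvidia"\n' > "$demo/open-nvidia.conf"

# Switch the active driver from nvidia to open-nvidia.
set_autoinstall "$demo/open-nvidia.conf" yes
set_autoinstall "$demo/nvidia.conf" no
```

After the two calls, only the open-nvidia scratch file carries an AUTOINSTALL line.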

DKMS 3.1.4 doesn't recognize or honor the no-autoinstall file, returns weird variable values like 'standard_output' during compilation, and finally breaks with a "fatal error", ending in a crash of the X server.

Downgrading to 3.0.13 solved the problem.

@anbe42
Collaborator

anbe42 commented Feb 8, 2025

Please try again with 3.1.5.
What distribution are you using?
Please paste the commands you used and full verbatim output showing the errors you are talking about.

/etc/dkms/no-autoinstall is not intended to be used by any dkms module. It's an option intended for CI purposes and for the local admin to disable dkms entirely.

Where do you get the dkms modules from? Which script manages the AUTOINSTALL=yes setting?

@wildtruc
Author

wildtruc commented Feb 9, 2025 via email

@wildtruc
Author

wildtruc commented Feb 9, 2025

Edit:
Fedora has just updated dkms to 3.1.5, so I tested compilation first from the command line and then at boot time.

  • The command line does not report any issues. No more unknown variables appear.
  • The boot-time update failed with the no-autoinstall file present, then succeeded correctly with the file removed.

So, where the no-autoinstall file was needed with 3.0.x for the script's process to work, it is no longer needed with the fixed 3.1.5.
Good job :)
I will update my script to take care of this change.

@scaronni
Collaborator

scaronni commented Feb 9, 2025

Hi @wildtruc, I guess you are not using the packages provided in the CUDA repository? Because there kmod-nvidia-latest-dkms conflicts with kmod-nvidia-open-dkms and vice versa. Why are you installing both kernel module packages and not just one of the two, open or closed? They're definitely not meant to be installed together.

I've substantially changed the packaging in the CUDA repository for 570, please give it a go on Fedora 41 if you can (as Fedora is not supported).

If you prefer a different packaging with both open and closed modules in one package there's always my private repository which still has Fedora 40 support:

@wildtruc
Author

wildtruc commented Feb 9, 2025

No, @scaronni, I don't. :)
I like to have both because I like being able to switch whenever I want, to test and compare progress between the two, essentially for gaming.
I have used my own script, Zenvidia, for years, since the time when there was not even kmod, etc. It is the only one that lets me switch with a simple session restart. :)
Plus, it allows me to not depend on any repo's support.

@anbe42
Collaborator

anbe42 commented Feb 9, 2025

If you use dkms install, both /etc/dkms/no-autoinstall and AUTOINSTALL are irrelevant.

So if your script manages the module (with dkms install) and you want to prevent the dkms autoinstall job at boot time from interfering with that, you should have AUTOINSTALL="" (empty string means "no") in both dkms.conf and you shouldn't have /etc/dkms/no-autoinstall.

You should also be able to override the dkms.conf coming with the module source by placing AUTOINSTALL="" in /etc/dkms/{nvidia,open-nvidia}.conf
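The override described above could be sketched like this. It is a minimal illustration, not a tested recipe: the OVERRIDE_DIR variable is hypothetical and defaults to a scratch directory so the snippet is harmless to run, while on a real system the directory named in this thread is /etc/dkms (which requires root).

```shell
#!/bin/sh
# Write AUTOINSTALL="" override files for both modules.
# OVERRIDE_DIR is a hypothetical knob; it defaults to a scratch directory.
# On a real system the thread names /etc/dkms as the location (requires root).
OVERRIDE_DIR=${OVERRIDE_DIR:-$(mktemp -d)}

for module in nvidia open-nvidia; do
    # An empty string means "no" for AUTOINSTALL, as stated in the comment above.
    printf 'AUTOINSTALL=""\n' > "$OVERRIDE_DIR/$module.conf"
done

# Show what was written.
ls "$OVERRIDE_DIR"
```

With this approach the /etc/dkms/no-autoinstall file would not be present at all, and the script would keep managing the modules itself with dkms install.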

@wildtruc
Author

wildtruc commented Feb 10, 2025 via email

@wildtruc
Author

wildtruc commented Feb 10, 2025

Test done.
AUTOINSTALL="yes"/"" works, and having no option at all in the conf file works too.
On the other hand, the empty value is not mentioned in the man page.
Here is the paragraph about it:
"dkms_autoinstaller
This boot-time service automatically installs any module which has AUTOINSTALL="yes" set in its dkms.conf file. The service works quite simply and if multiple versions of a module are in your system's DKMS tree, it will not do anything and instead explain that manual intervention is required."
Nothing else.

@wildtruc
Author

The DKMS man page needs an update, but I couldn't find who maintains it. Any pointers are welcome.
The conclusion is that disabling AUTOINSTALL works both ways:

  • setting AUTOINSTALL=""
  • not mentioning it at all.

Case closed.
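Why the two cases behave the same can be illustrated in plain shell. The sketch below assumes the conf file is read as shell and that the option is tested for non-emptiness; that check is an assumption for illustration, not taken from the DKMS source. The function name check_conf and the scratch files are hypothetical.

```shell
#!/bin/sh
# Hypothetical check: source a conf file in a subshell and report whether
# AUTOINSTALL ends up non-empty. Unset and empty are indistinguishable here.
check_conf() {
    (
        unset AUTOINSTALL
        . "$1"
        if [ -n "${AUTOINSTALL:-}" ]; then
            echo enabled
        else
            echo disabled
        fi
    )
}

# Scratch conf files covering the three cases from the test.
demo=$(mktemp -d)
printf 'AUTOINSTALL="yes"\n' > "$demo/yes.conf"
printf 'AUTOINSTALL=""\n' > "$demo/empty.conf"
printf 'PACKAGE_NAME="x"\n' > "$demo/absent.conf"

for f in yes empty absent; do
    printf '%s: %s\n' "$f" "$(check_conf "$demo/$f.conf")"
done
```

Under that assumption, the empty and absent cases both report disabled and only the "yes" case reports enabled.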

@anbe42
Collaborator

anbe42 commented Feb 16, 2025

So the intuitive way would be supporting AUTOINSTALL="no" (in addition to an empty value) to disable autoinstall.
And to error out on unknown AUTOINSTALL values.

@wildtruc
Author

Yes, I think so too.

  • No mention of AUTOINSTALL for the already existing dkms.conf.
  • AUTOINSTALL="" or "no" for disabling.
  • AUTOINSTALL="yes" for enabling.
  • AUTOINSTALL="poopoo" is rejected.
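The four rules proposed above could be sketched as a small validator. This is a hypothetical illustration of the proposal, not the implementation in DKMS; the function name parse_autoinstall is made up for the example, and an absent option is treated as disabled, matching the test result reported earlier in the thread.

```shell
#!/bin/sh
# Hypothetical validator for the semantics proposed in this thread.
# $1 is the AUTOINSTALL value; call with no argument when the option is absent.
# Prints "enabled" or "disabled"; returns 1 on an unknown value.
parse_autoinstall() {
    case "${1-}" in
        yes)
            echo enabled
            ;;
        ""|no)
            # Absent, empty, or "no" all disable autoinstall.
            echo disabled
            ;;
        *)
            echo "unknown AUTOINSTALL value: $1" >&2
            return 1
            ;;
    esac
}

parse_autoinstall yes
parse_autoinstall no
parse_autoinstall ""
parse_autoinstall poopoo || echo "rejected"
```

Keeping the absent and empty cases valid means existing dkms.conf files would not need to change, while a typo in the value fails loudly instead of being silently interpreted.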


3 participants