We did a big update, as I mentioned previously, and afterwards I found out that some of our graphics cards were no longer supported by the newest NVIDIA drivers for CentOS x64 (at this moment NVIDIA-Linux-x86_64-418.43.run). This makes some sense, since the Quadro 4000 was released in November 2010, but on the other hand it is a perfectly fine graphics card, able to do 3D and hook up to 3 monitors. In principle I don’t like to throw away working hardware… unless requested to do so 🙂
I experienced all of these symptoms, depending on what I did:
- the drivers seem to be running, but nvidia-smi gives no output
- nvidia-smi gives an output telling you that there is no compatible device
- nvidia-smi gives an output, but GDM crashes
A sample GDM crash on my client “tiny” looks like this:
    systemctl status gdm
    ● gdm.service - GNOME Display Manager
       Loaded: loaded (/usr/lib/systemd/system/gdm.service; enabled; vendor preset: enabled)
       Active: active (running) since XXX; 52s ago
      Process: 24578 ExecStartPost=/bin/bash -c TERM=linux /usr/bin/clear > /dev/tty1 (code=exited, status=0/SUCCESS)
     Main PID: 24575 (gdm)
       CGroup: /system.slice/gdm.service
               └─24575 /usr/sbin/gdm

    XXX tiny systemd: Starting GNOME Display Manager...
    XXX tiny systemd: Started GNOME Display Manager.
    XXX tiny gdm: GdmDisplay: display lasted 0.093784 seconds
    XXX tiny gdm: GdmDisplay: display lasted 0.031349 seconds
    XXX tiny gdm: GdmDisplay: display lasted 0.017635 seconds
    XXX tiny gdm: GdmDisplay: display lasted 0.016253 seconds
    XXX tiny gdm: GdmDisplay: display lasted 0.016001 seconds
    XXX tiny gdm: GdmDisplay: display lasted 0.017770 seconds
    XXX tiny gdm: GdmLocalDisplayFactory: maximum number of X display failures reached: check X server log
Above, XXX stands in for the date. We check the X server log, as the last line suggests. It reads:
    root@tiny ~ ## > tail /var/log/Xorg.0.log
    [ 344.316] ==== WARNING WARNING WARNING WARNING ================
    [ 344.316] This server has a video driver ABI version of 24.0 that this driver does not officially support. Please check http://www.nvidia.com/ for driver updates or downgrade to an X server with a supported driver ABI.
    [ 344.316] =====================================================
    [ 344.316] (EE) NVIDIA: Use the -ignoreABI option to override this check.
    [ 344.316] (II) UnloadModule: "nvidia"
    [ 344.316] (II) Unloading nvidia
    [ 344.316] (EE) Failed to load module "nvidia" (unknown error, 0)
    [ 344.316] (EE) No drivers available.
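If you have several machines to check, the two lookups above (GDM status first, then the X server log) can be scripted. What follows is only a rough sketch: the function names `gdm_state` and `xorg_nvidia_state` and the labels they print are my own invention for illustration, and the patterns are simply the strings that appear in the outputs shown above, so they may need adjusting for other driver or GDM versions.

```shell
# Hypothetical helpers; names and labels are illustrative only.

# Reads `systemctl status gdm` (or similar log) text on stdin.
# Prints "crash-loop" if GDM gave up restarting X, otherwise "ok".
gdm_state() {
    if grep -q 'maximum number of X display failures reached'; then
        echo "crash-loop"
    else
        echo "ok"
    fi
}

# Takes the path to an Xorg log (default: /var/log/Xorg.0.log) and
# prints one label describing whether the nvidia module loaded.
xorg_nvidia_state() {
    log="${1:-/var/log/Xorg.0.log}"
    if [ ! -r "$log" ]; then
        echo "unreadable"
    elif grep -q 'video driver ABI version' "$log"; then
        # The driver refused to load against this X server ABI.
        echo "abi-mismatch"
    elif grep -qF '(EE) Failed to load module "nvidia"' "$log"; then
        echo "load-failed"
    else
        echo "ok"
    fi
}
```

On a client you would run something like `systemctl status gdm | gdm_state` followed by `xorg_nvidia_state /var/log/Xorg.0.log`. On "tiny", given the outputs above, I would expect "crash-loop" and "abi-mismatch".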
To get GDM and a desktop environment back with 418.43 on the Quadro 4000, I tried uninstalling and reinstalling the 418.43 drivers, and installing lightdm to use instead of gdm. Neither solution worked. When I installed the previous drivers on the new kernel, I ended up with the message “Unable to load the kernel module nvidia.ko”, obviously because of the new kernel.
What next? Maybe downgrade to avoid the Xorg crash? I downloaded the latest legacy driver from NVIDIA, NVIDIA-Linux-x86_64-390.87.run, installed it, and got my desktop back. Yes, you can ask: “why didn’t you do that to start with?” The answer is also very simple: I want homogeneous installations, not one machine on driver version 390.87 and another on 418.43. But I need to live with the fact that we are not all the same, unfortunately 😦
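Since the installations cannot be fully homogeneous, the choice of installer can at least be made the same way everywhere. Here is a minimal sketch, assuming the only legacy card around is the Quadro 4000; the function name `pick_nvidia_installer` is hypothetical, and the pattern would need one more line for every other legacy model you still have in service.

```shell
# Hypothetical helper: choose the installer for a GPU description string.
# Assumption: only the Quadro 4000 needs the legacy 390.87 driver here.
pick_nvidia_installer() {
    case "$1" in
        *"Quadro 4000"*) echo "NVIDIA-Linux-x86_64-390.87.run" ;;
        *)               echo "NVIDIA-Linux-x86_64-418.43.run" ;;
    esac
}

# Possible usage on a client (not run here):
#   pick_nvidia_installer "$(lspci | grep -i vga)"
```

The pattern is a plain substring match, so newer cards with a letter in front of the number, such as the Quadro K4000, fall through to the current driver.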