Install CUDA and Nvidia Driver in Linux

Yasen Hu

Sep 7, 2020 2 min read Misc

Introduction

This post records my experience installing CUDA and the Nvidia driver on Linux. The Nvidia driver can easily break if you accidentally upgrade or downgrade your CUDA version.

Recently I updated my PyTorch version with the following command:

conda install pytorch torchvision cudatoolkit=10.2 -c pytorch

After that I could no longer log in to my Linux system: it got stuck in a login loop, which suggests that something was wrong with the Nvidia driver. The issue was resolved by reinstalling CUDA and the Nvidia driver.

Prerequisite

CUDA-Capable GPU
Supported version of Linux
GCC installed

More details can be found in the official Nvidia documentation.
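
The three prerequisites above can be checked from a terminal before downloading anything. The following is a minimal sketch, not an official procedure: `check_tool` is a hypothetical helper introduced here for illustration, and the script assumes `lspci` (from pciutils) is available for the GPU check.

```shell
#!/usr/bin/env bash
# Sketch: report whether each prerequisite listed above appears to be met.
# check_tool is a hypothetical helper, not part of any NVIDIA tooling.

check_tool() {
    # Print "<name>: found" or "<name>: missing" depending on whether it is on PATH.
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# 1. CUDA-capable GPU: list NVIDIA devices on the PCI bus (needs pciutils).
if command -v lspci >/dev/null 2>&1; then
    lspci | grep -i nvidia || echo "no NVIDIA device reported by lspci"
fi

# 2. Supported Linux version: print architecture and distribution details.
uname -m
cat /etc/*release 2>/dev/null || true

# 3. GCC installed.
check_tool gcc
```

The script only reports what it finds. Whether the reported distribution and GCC version are actually supported still has to be compared against the official documentation for the CUDA release being installed.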

Install CUDA and Nvidia driver

Download the CUDA toolkit from https://developer.nvidia.com/cuda-downloads
Switch to a virtual console by pressing “Ctrl+Alt+F1”

Disabling Nouveau

# The Nouveau drivers are loaded if the following command prints anything:
$ lsmod | grep nouveau

# Create a file at /etc/modprobe.d/blacklist-nouveau.conf with the following contents:
blacklist nouveau
options nouveau modeset=0

# Regenerate the kernel initramfs:
$ sudo update-initramfs -u
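
The blacklist only takes effect once the system has been rebooted with the regenerated initramfs, so it is worth re-checking afterwards. The sketch below wraps the `lsmod` check in a small function. `nouveau_loaded` is a hypothetical helper name, and matching only the first column of the `lsmod` output is a design choice made here (it is stricter than a plain `grep nouveau`).

```shell
#!/usr/bin/env bash
# Sketch: decide from `lsmod` output whether the Nouveau module is loaded.
# nouveau_loaded is a hypothetical helper; it reads lsmod-style text on stdin.

nouveau_loaded() {
    # Match "nouveau" only as the module name in the first column.
    awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

if command -v lsmod >/dev/null 2>&1 && lsmod | nouveau_loaded; then
    echo "nouveau is still loaded: check the blacklist file and reboot"
else
    echo "nouveau is not loaded"
fi
```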

Disable the X server and run the installation file from the virtual console. Important note: add the --run-nvidia-xconfig option so that the driver installation runs nvidia-xconfig and updates the system X configuration file to use the NVIDIA X driver. Otherwise the system may get stuck in a login loop after reboot!
```
$ sudo service lightdm stop
# Remember to run Nvidia xconfig
$ sudo ./cuda_10.2.89_440.33.01_linux.run --run-nvidia-xconfig
# You may need to remove the old driver first; follow the instructions printed by the run file.
```
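
Since the login loop is tied to the X configuration, it can help to confirm before rebooting that the configuration file really selects the NVIDIA driver. This is a rough sketch under two assumptions: the configuration lives at /etc/X11/xorg.conf, and it contains a `Driver "nvidia"` line in its Device section. `uses_nvidia_driver` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: check that an X configuration file selects the NVIDIA X driver.
# uses_nvidia_driver is a hypothetical helper; the path below is an assumption.

uses_nvidia_driver() {
    # Succeeds if the file has an uncommented line of the form: Driver "nvidia"
    grep -Eq '^[[:space:]]*Driver[[:space:]]+"nvidia"' "$1"
}

conf=/etc/X11/xorg.conf   # assumed location of the X configuration file
if [ -r "$conf" ] && uses_nvidia_driver "$conf"; then
    echo "$conf selects the NVIDIA X driver"
else
    echo "$conf is missing or does not select the NVIDIA X driver"
fi
```
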
Post-installation actions: environment setup. The PATH variable needs to include /usr/local/cuda/bin. The shared libraries in /usr/local/cuda/lib64 (/usr/local/cuda/lib on a 32-bit system) must also be visible to the dynamic linker, either through the LD_LIBRARY_PATH variable or, as done here, through a file in /etc/ld.so.conf.d.
```
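# Add the following line to /etc/profile.d/myenvvars.sh:
$ sudo vim /etc/profile.d/myenvvars.sh
export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}

# Add the following line to /etc/ld.so.conf.d/mylib.conf:
$ sudo vim /etc/ld.so.conf.d/mylib.conf
/usr/local/cuda/lib64

# Rebuild the dynamic linker cache so the new library path takes effect:
$ sudo ldconfig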
```
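
After logging in again, the environment setup can be checked with a short script. This is a sketch only: `path_contains` is a hypothetical helper, and the `libcudart` lookup assumes the CUDA runtime library was installed under the directory registered above.

```shell
#!/usr/bin/env bash
# Sketch: verify that the CUDA directories are visible to the shell and linker.
# path_contains is a hypothetical helper introduced for this example.

path_contains() {
    # Succeeds if directory $2 is an exact entry of the colon-separated list $1.
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if path_contains "$PATH" /usr/local/cuda/bin; then
    echo "PATH includes /usr/local/cuda/bin"
else
    echo "PATH does not include /usr/local/cuda/bin (log in again or source the profile script)"
fi

# The dynamic linker cache should list the CUDA runtime once ldconfig has run.
if command -v ldconfig >/dev/null 2>&1; then
    ldconfig -p | grep libcudart || echo "libcudart not found in the linker cache"
fi
```

Wrapping both the list and the directory in colons avoids false positives from entries that merely start with the same prefix, such as /usr/local/cuda/bin-old.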

Verify the driver version

$ cat /proc/driver/nvidia/version
# CUDA Toolkit version
$ nvcc -V
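
For scripts that need just the version number, the driver version can be pulled out of the same file. The sketch below assumes the file contains a line starting with "NVRM version" that carries a dotted version number such as the 440.33.01 from the run file name. `driver_version` is a hypothetical helper, and the exact layout of that line may differ between driver releases.

```shell
#!/usr/bin/env bash
# Sketch: extract the driver version number from /proc/driver/nvidia/version.
# driver_version is a hypothetical helper; it reads the file contents on stdin.

driver_version() {
    # Print the first dotted version number found on the "NVRM version" line.
    grep 'NVRM version' | grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

if [ -r /proc/driver/nvidia/version ]; then
    echo "driver: $(driver_version < /proc/driver/nvidia/version)"
else
    echo "driver: /proc/driver/nvidia/version not found (kernel module not loaded?)"
fi
```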

Install Third-party Libraries

$ sudo apt-get install g++ freeglut3-dev build-essential libx11-dev \
    libxmu-dev libxi-dev libglu1-mesa libglu1-mesa-dev

Install cuDNN if needed

$ cd folder/extracted/contents
$ sudo cp include/cudnn.h /usr/local/cuda/include
$ sudo cp -a lib64/libcudnn* /usr/local/cuda/lib64

Check CUDA/cuDNN versions

# Check CUDA version
$ nvcc --version
# Check cuDNN version
$ grep CUDNN_MAJOR -A 2 /usr/local/cuda/include/cudnn.h
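
The cuDNN check can also be turned into a single version string. One caveat: starting with cuDNN 8 the version macros live in cudnn_version.h rather than cudnn.h, so the sketch below tries both headers, and cudnn_version.h would then need to be copied alongside cudnn.h. `cudnn_version` is a hypothetical helper, and the header locations assume the copy step shown earlier.

```shell
#!/usr/bin/env bash
# Sketch: print the cuDNN version as MAJOR.MINOR.PATCHLEVEL.
# cudnn_version is a hypothetical helper; header paths assume /usr/local/cuda.

cudnn_version() {
    # Read the three version macros from the #define lines of the given header.
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END {
            if (major == "") exit 1
            print major "." minor "." patch
        }
    ' "$1"
}

# Try the cuDNN 8+ header first, then fall back to the older location.
for header in /usr/local/cuda/include/cudnn_version.h /usr/local/cuda/include/cudnn.h; do
    if [ -r "$header" ] && cudnn_version "$header"; then
        break
    fi
done
```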

References