bladedragon

Exploring CUDA and cuDNN and Installation Methods

I only recently started playing with Stable Diffusion, and since I use a prepackaged installation, my understanding of much of the training knowledge is only partial. I learn as I go, and I can already create some of the works I want. But recently I learned that upgrading PyTorch and xFormers can greatly speed up LoRA training, which made me want to upgrade (how could I not fully utilize my newly acquired 4070?). So began my journey of stepping on pitfalls (actually, the "pitfalls" are just in the title).


Before upgrading PyTorch, you need to check your CUDA and cuDNN versions. Most graphics cards support CUDA 11, and the PyTorch shipped with most SD-WebUI installations also targets CUDA 11, so no changes are needed. But if you have a 40-series card, it most likely supports CUDA 12. CUDA 11 works fine, but CUDA 12 can significantly improve the card's computational efficiency, so I recommend upgrading to CUDA 12 if possible. So how do you determine whether your computer has CUDA 11 or CUDA 12? And what is cuDNN?

I had the same questions at first, and even assumed I already had CUDA 12 installed; only after some research did I find out otherwise.

Basic Concepts#

First, you need to know what CUDA and cuDNN are. Although I only partially understand the specifics, we still need a general picture: at least what they are, what they do, and how to use them. According to the official documentation:

The NVIDIA® CUDA® Toolkit provides a comprehensive development environment for C and C++ developers building GPU-accelerated applications. With the CUDA Toolkit, you can develop, optimize, and deploy your applications on GPU-accelerated embedded systems, desktop workstations, enterprise data centers, cloud-based platforms, and HPC supercomputers. The toolkit includes GPU-accelerated libraries, debugging and optimization tools, a C/C++ compiler, and a runtime library to deploy your application.
Using built-in capabilities for distributing computations across multi-GPU configurations, scientists and researchers can develop applications that scale from single GPU workstations to cloud installations with thousands of GPUs.

In simple terms, it is a development framework that helps us make better use of the GPU's computing power in higher-level development.

The official explanation of cuDNN is:

The NVIDIA® CUDA Deep Neural Network (cuDNN) library is a GPU-accelerated library of primitives for deep neural networks. cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers. Deep learning researchers and framework developers worldwide rely on cuDNN for high-performance GPU acceleration.
It allows them to focus on training neural networks and developing software applications rather than spending time on low-level GPU performance tuning. cuDNN accelerates widely used deep learning frameworks, including Chainer, Keras, MATLAB, MxNet, PaddlePaddle, PyTorch, and TensorFlow.

In simple terms, it is a dependency library for deep learning frameworks, used to accelerate neural network training.

PyTorch is a Python-based deep learning framework, and using cuDNN accelerates training. It is important to note that the PyTorch build and the CUDA version must match, otherwise compatibility issues may cause program errors, as discussed below.

Preparations#

First, determine the CUDA version supported by your computer. You can check with the following command at the command prompt (this is an NVIDIA command-line tool, so it is also available on Linux as long as the graphics driver is installed):

nvidia-smi

This shows the specific graphics card model in your computer:

image
The output includes your hardware and software information: the card's name and specifications, the driver version, running processes, and the supported CUDA version shown in the top corner of the banner (note: this is the highest CUDA version your driver supports, not the version installed on your computer!)
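If you want to grab that version programmatically rather than reading it off the banner, a small sketch like the following can pull the driver and supported CUDA version out of the `nvidia-smi` header; the sample line and version numbers here are made up for illustration:

```python
import re

def parse_smi_header(text: str):
    """Extract driver version and supported CUDA version from nvidia-smi's header line."""
    driver = re.search(r"Driver Version:\s*([\d.]+)", text)
    cuda = re.search(r"CUDA Version:\s*([\d.]+)", text)
    return (driver.group(1) if driver else None,
            cuda.group(1) if cuda else None)

# Hypothetical header line in the shape nvidia-smi prints:
sample = "| NVIDIA-SMI 531.41       Driver Version: 531.41       CUDA Version: 12.1     |"
print(parse_smi_header(sample))  # ('531.41', '12.1')
```

In practice you would feed it the captured output of `nvidia-smi` instead of the sample string.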

If you have never installed CUDA, or have never even heard of it, then it is 99% certain that it is not installed in your environment (this sounds like a redundant statement, but it did not occur to me at first). To check further whether it is installed, run the following at the command prompt:

nvcc -V

If it is installed, it will display as shown below:
image

So here is the question: since PyTorch requires CUDA to run, but I had never installed CUDA, how did my prepackaged installation work?
Answer: PyTorch bundles some basic CUDA runtime dependencies; when CUDA is not present in your environment, it falls back on them.

Also, confirm the PyTorch version currently in your environment and the version you want to upgrade to. This matters for installing the matching CUDA version later. You can enter your Python environment and run the following to check the PyTorch version:

import torch
# PyTorch version
print(torch.__version__)
# CUDA version torch was built against. Note: this prints even when CUDA is not
# installed in your environment; in that case it refers to the runtime bundled with torch.
print(torch.version.cuda)
# cuDNN version torch was built against; same caveat as above.
print(torch.backends.cudnn.version())

Installation#

First, find out which CUDA version your target PyTorch version requires. There are many resources online; one way is to look directly at the naming convention of the installation packages, through the following link.
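As an illustration of that naming convention: official wheel names carry the CUDA build as a `+cuXYZ` tag on the version (e.g. `torch-2.0.1+cu118-...`). A minimal sketch that reads the tag back out, assuming that typical filename layout:

```python
import re

def cuda_from_wheel(filename: str):
    """Read the CUDA build tag out of a torch wheel filename, e.g. '+cu118' -> '11.8'."""
    m = re.search(r"\+cu(\d+)", filename)
    if not m:
        return None  # a CPU-only build, or no local version tag at all
    digits = m.group(1)
    return f"{digits[:-1]}.{digits[-1]}"  # '118' -> '11.8', '121' -> '12.1'

# Hypothetical wheel filename following the usual convention:
print(cuda_from_wheel("torch-2.0.1+cu118-cp310-cp310-win_amd64.whl"))  # 11.8
```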

However, one very important point: CUDA 12 does not appear in the official compatibility table, but in some unofficial channels developers have stated that the latest PyTorch supports CUDA 12 and only requires replacing some runtime files. The specific method is covered below in the section on adapting PyTorch.

Updating torch itself is easy: just upgrade it with pip (e.g. `pip install --upgrade torch`).

CUDA Installation#

Next, let's talk about the installation of CUDA.

First, go to the official website and download the installer. Here we use Windows 10 as an example.

After downloading, start the installation directly:

Next comes the location for the temporary files extracted during installation. Any location with enough space is fine; note that this directory is deleted when the installation completes.
image

image

image

Note that since CUDA is a fairly large installation, and we usually already have some GPU environment set up by the time we install it, it is recommended to choose a custom installation and install only what is needed.
image

Note that the Driver Components are generally left unselected, because they involve the graphics driver, which we usually upgrade separately (I upgrade mine through GeForce Experience). If the CUDA version being installed is not the latest, there is a high chance of overwriting a newer driver with an older one, so selecting them is not recommended.
image

You can expand the CUDA entry to see the individual components, such as the Visual Studio integration; if you don't use it in your daily development, you don't need to select it.
image

Installation in progress...
image

Since my computer does not have Visual Studio installed, some dependencies fail to install, but as long as this does not affect usage, it is fine.
image

image

Verification#
In the command prompt, go to the following directory and check whether these test programs run successfully and whether the reported CUDA version is correct:

cd DIR\NVIDIA GPU Computing Toolkit\CUDA\v12.1\extras\demo_suite
bandwidthTest.exe
deviceQuery.exe
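If you want to script that check, `deviceQuery` ends its report with a `Result = PASS` line on success. A small sketch; the `subprocess` call is commented out and its path is an assumption to adjust to your install:

```python
import subprocess

def device_query_passed(output: str) -> bool:
    """deviceQuery ends its report with 'Result = PASS' when the GPU checks out."""
    return "Result = PASS" in output

# To run it for real from inside the demo_suite directory:
# out = subprocess.run(["deviceQuery.exe"], capture_output=True, text=True).stdout
# print(device_query_passed(out))

# Illustration with a hypothetical tail of the report:
print(device_query_passed("... deviceQuery results ... Result = PASS"))  # True
```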

cuDNN Installation#

Since cuDNN is just a set of library files, we only need to download the files for the specified version and copy them into the corresponding directories of the CUDA installation.
Note that you need to register an NVIDIA account, but fortunately only an email address is required...
Here is the download link

After extracting the download, you will have the following folders; just copy them into the CUDA installation directory.

image
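The copy itself can be scripted. A minimal sketch using `shutil.copytree`; the paths in the commented call are assumptions for a default CUDA 12.1 install and must be adjusted to your machine:

```python
import shutil
from pathlib import Path

def merge_cudnn(cudnn_dir: Path, cuda_dir: Path) -> None:
    """Copy cuDNN's bin/include/lib folders into the CUDA install, merging with what's there."""
    for folder in ("bin", "include", "lib"):
        src = cudnn_dir / folder
        if src.is_dir():
            # dirs_exist_ok merges into the existing CUDA folders instead of failing
            shutil.copytree(src, cuda_dir / folder, dirs_exist_ok=True)

# Hypothetical paths; adjust to your extracted archive and CUDA install location:
# merge_cudnn(Path(r"C:\Downloads\cudnn-windows-x86_64"),
#             Path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1"))
```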

Adapting PyTorch#

This mainly applies to adapting CUDA 12. To use CUDA 12 with PyTorch, you need to replace some files in PyTorch's dependency-library directory with the corresponding files from your CUDA installation.

The torch library directory is usually inside the corresponding Python lib directory (or the virtualenv's lib directory), similar to:

${venv}\Lib\site-packages\torch\lib

The files that need to be replaced are shown in the figure below:
image
The first three are CUDA files and the last seven are cuDNN files; you should be able to find them easily in the CUDA directory.
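That replacement can also be scripted. A sketch under assumptions: the directories and the DLL list in the commented call are placeholders, so substitute the actual filenames from the figure and the real paths on your machine:

```python
import shutil
from pathlib import Path

def replace_torch_dlls(cuda_bin: Path, torch_lib: Path, dll_names: list[str]) -> list[str]:
    """Overwrite the named DLLs in torch's lib directory with the CUDA/cuDNN copies.
    Returns the names that were actually found and copied."""
    copied = []
    for name in dll_names:
        src = cuda_bin / name
        if src.is_file():
            shutil.copy2(src, torch_lib / name)  # copy2 keeps file metadata
            copied.append(name)
    return copied

# Hypothetical call; use the real file names from the figure above:
# replace_torch_dlls(Path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\bin"),
#                    Path(r"venv\Lib\site-packages\torch\lib"),
#                    ["cudart64_12.dll"])
```

Returning the copied names makes it easy to spot a typo'd filename: anything missing from the result was not found in the CUDA bin directory.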

Once the replacement is done, you can enjoy the high-speed experience brought by CUDA 12!

References#

CUDA Official Documentation
cuDNN Official Documentation
