Ubuntu 20.04 + CUDA 11.2 + PyTorch 1.8.1+cu111 works on an RTX 3090 GPU!

I got my hands on a PC with an RTX 3090, so I set it up. I got stuck a few times along the way, but PyTorch ended up running on the RTX 3090 GPU, so here are the steps I took.

Installing Ubuntu 20.04

Nothing special is needed here. I followed a guide along the lines of "Ubuntu 20.04 LTSインストールガイド【スクリーンショットつき解説】" and did a normal install.

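An NVIDIA driver seemed to be installed automatically, and the X Window display came out on the monitor through the RTX 3090's DisplayPort right from the start. However, I could not figure out where that driver actually lived: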

$ sudo apt install ubuntu-drivers-common
$ ubuntu-drivers devices
$ nvidia-smi

None of these worked either, but I did not dig into it very deeply.

Installing the NVIDIA driver

At first I tried the usual approach:

$ sudo add-apt-repository ppa:graphics-drivers/ppa
$ sudo apt update
$ apt search nvidia-driver*
$ sudo apt install nvidia-driver-460

That failed with an error. Next, following the NVIDIA RTX 3090 Ubuntu 20.04 Driver Installation Guide, I tried:

$ sudo apt install nvidia-driver-460 nvidia-settings

That also threw an error, but going by the hint in its message, I ran:

$ sudo apt install nvidia-driver-460 nvidia-settings --fix-missing

This succeeded, and nvidia-smi now works too.

$ nvidia-smi    
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.67       Driver Version: 460.67       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 3090    Off  | 00000000:01:00.0  On |                  N/A |
|  0%   45C    P8    27W / 350W |    644MiB / 24234MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
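
If you would rather check the driver from a script than read the table by eye, something like the following should work. This is only a minimal sketch: it assumes nvidia-smi is on PATH and uses its --query-gpu / --format=csv,noheader options. The helper names parse_gpu_csv and query_gpus are made up for this post.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi; the order must match the parser below.
QUERY_FIELDS = ["name", "driver_version", "memory.total"]


def parse_gpu_csv(text):
    """Parse `nvidia-smi --query-gpu=... --format=csv,noheader` output into dicts."""
    gpus = []
    for line in text.strip().splitlines():
        values = [v.strip() for v in line.split(",")]
        gpus.append(dict(zip(QUERY_FIELDS, values)))
    return gpus


def query_gpus():
    """Run nvidia-smi if it is installed; return [] when it is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return parse_gpu_csv(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    for gpu in query_gpus():
        print(gpu)
```

An empty list means nvidia-smi is either missing or failed, which is the state I was in before the driver install succeeded.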

Installing CUDA 11.2

I followed the CUDA Toolkit 11.2 Update 2 Downloads page and installed it the normal way.

$ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
$ sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
$ sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
$ sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /"
$ sudo apt-get update
$ sudo apt-get -y install cuda

I also added the following to .bashrc.

export CUDA_HOME=/usr/local/cuda
export PATH=$PATH:$CUDA_HOME/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_HOME/lib64

After running source .bashrc, nvcc -V works as expected.

$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0
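
To compare the toolkit version against something else in a script (for example the CUDA version a PyTorch wheel was built for), the release number can be pulled out of the nvcc -V text. A minimal sketch, assuming the output contains a "release X.Y" fragment as in the listing above; parse_nvcc_release is a name I made up.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Return (major, minor) from `nvcc -V` output, or None if not found."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    # Only call nvcc when it is actually on PATH.
    if shutil.which("nvcc") is not None:
        result = subprocess.run(["nvcc", "-V"], capture_output=True, text=True, check=False)
        print(parse_nvcc_release(result.stdout))
    else:
        print("nvcc not found on PATH")
```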

Installing cuDNN v8.1.0

I downloaded the matching package from the cuDNN Archive, extracted it, and copied the files under /usr/local/cuda.

$ tar xzvf cudnn-11.2-linux-x64-v8.1.0.77.tgz
$ sudo cp cuda/include/cudnn*.h /usr/local/cuda/include
$ sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64
$ sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*
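
One way to confirm which cuDNN headers ended up under /usr/local/cuda is to read the version macros. This is a rough sketch: it assumes cuDNN 8's cudnn_version.h (which defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL) was copied together with the other headers, and it prints None otherwise. The function name and the path constant are my own.

```python
import re
from pathlib import Path

# Assumed install location, matching the copy commands above.
CUDNN_VERSION_HEADER = Path("/usr/local/cuda/include/cudnn_version.h")


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from cudnn_version.h text, or None."""
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+" + key + r"\s+(\d+)"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    if CUDNN_VERSION_HEADER.exists():
        print(parse_cudnn_version(CUDNN_VERSION_HEADER.read_text()))
    else:
        print(None)
```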

Setting up the PyCharm environment

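I get by with PyCharm + venv for most things (and fall back to Docker when dependencies cause trouble), so this is the groundwork for that. I will skip the details, but the steps are roughly: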

$ sudo add-apt-repository ppa:deadsnakes/ppa
$ sudo apt update
$ sudo apt install python3.9
$ sudo apt install python3.9-distutils

$ tar -xzvf pycharm-xxxx.tar.gz

With that, the setup is done.

Checking that it works

For a start, I installed the latest version of torch from the Python Interpreter page in PyCharm's Settings.

Following "PyTorchでGPU情報を確認（使用可能か、デバイス数など）", I ran a program like the one below...

import torch

print(torch.__version__)
print(torch.cuda.is_available())
print(torch.cuda.device_count())
print(torch.cuda.current_device())
print(torch.cuda.get_device_name())
print(torch.cuda.get_device_name(torch.device('cuda:0')))
print(torch.cuda.get_device_name('cuda:0'))

 1.8.1+cu102
 True
 1
 0
 GeForce RTX 3090
 GeForce RTX 3090
 GeForce RTX 3090

 /home/work/PycharmProjects/test/venv/lib/python3.9/site-packages/torch/cuda/__init__.py:104: UserWarning: 
 GeForce RTX 3090 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
 The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
 If you want to use the GeForce RTX 3090 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

 warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))

 Process finished with exit code 0
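
The warning comes down to a mismatch between the GPU's compute capability (sm_86 here) and the architectures the installed wheel was compiled for. The sketch below shows how to check that yourself with torch.cuda.get_device_capability() and torch.cuda.get_arch_list(). The helper is_capability_supported is my own simplification: it only compares major versions, and I am assuming that is close enough to what PyTorch checks internally.

```python
def is_capability_supported(capability, arch_list):
    """Return True if any compiled sm_XY arch shares the GPU's major version.

    capability: (major, minor) tuple, e.g. (8, 6) for sm_86.
    arch_list:  list of strings such as ["sm_37", "sm_50"].
    """
    major, _minor = capability
    supported_majors = {
        int(arch.split("_")[1]) // 10
        for arch in arch_list
        if arch.startswith("sm_")
    }
    return major in supported_majors


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability(0)
        archs = torch.cuda.get_arch_list()
        print(cap, archs, is_capability_supported(cap, archs))
```

With the arch list from the warning above (sm_37 sm_50 sm_60 sm_70) and an sm_86 GPU, this returns False, which matches the warning.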

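The program does what it was meant to, but PyTorch is complaining about something. As instructed, I went to https://pytorch.org/get-started/locally/ and looked at the install selector there.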

I see. Incidentally, for CUDA 10.2 the command is simply pip install torch torchvision torchaudio, which means the torch 1.8.1 you get by default is 1.8.1+cu102. I looked for a 1.8.1+cu112 build, but there does not seem to be one.

With no other option, I tried installing torch 1.8.1+cu111 as a long shot. This special build does not seem to be installable from PyCharm's Settings, so I did it from the console.

$ source ./venv/bin/activate

(venv)$ pip install torch==1.8.1+cu112 -f https://download.pytorch.org/whl/torch_stable.html
Looking in links: https://download.pytorch.org/whl/torch_stable.html
ERROR: Could not find a version that satisfies the requirement torch==1.8.1+cu112
ERROR: No matching distribution found for torch==1.8.1+cu112

(venv)$ pip install torch==1.8.1+cu111 -f https://download.pytorch.org/whl/torch_stable.html
Looking in links: https://download.pytorch.org/whl/torch_stable.html
Collecting torch==1.8.1+cu111
(omitted)
Successfully installed torch-1.8.1+cu111

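As expected, pip complains that 1.8.1+cu112 does not exist, but 1.8.1+cu111 installed fine, even though its CUDA version differs from the installed toolkit. PyCharm's Settings picks it up correctly too.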
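
My understanding of why the version mismatch is harmless: the pip wheels ship their own CUDA runtime libraries, so what matters is that the driver (460.67 here, which reports CUDA 11.2) is new enough for the wheel's CUDA 11.1 runtime. The sketch below reads the CUDA version out of the wheel's version string. It assumes the +cuXYZ suffix means "last digit is the minor version", and cuda_version_from_wheel_tag is a name I made up. torch.version.cuda reports the same information directly.

```python
import re


def cuda_version_from_wheel_tag(version):
    """Map a version like '1.8.1+cu111' to '11.1'; return None if no +cu tag."""
    match = re.search(r"\+cu(\d{2,})$", version)
    if match is None:
        return None
    digits = match.group(1)
    # Assumption: the final digit is the minor version, the rest is the major.
    return f"{digits[:-1]}.{digits[-1]}"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None:
        print(torch.__version__, cuda_version_from_wheel_tag(torch.__version__))
        print(torch.version.cuda)  # CUDA version the wheel was built with
```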

So I ran the earlier program again... Huh? The warning is gone!?

To be sure, I grabbed the test.py posted around the middle of the issue "GeForce RTX 3080 with CUDA capability sm_86 is not compatible with the current PyTorch installation. #45028" and ran it.

import time

import torch
import torch.nn as nn
import torch.backends.cudnn as cudnn

cudnn.benchmark = True
cuda = torch.device("cuda")

def sync():
  torch.cuda.synchronize()

def bench(f):
  sync()
  start = time.perf_counter()
  f()
  torch.cuda.synchronize()
  end = time.perf_counter()
  return end - start

x = torch.randn(32, 16, 512, 512, device=cuda)

class Test(nn.Module):
  def __init__(self):
    super().__init__()
    self.c = nn.Conv2d(16, 16, 3, padding=1)

  def forward(self, x):
    return self.c(x)

m = Test().to(cuda)

def run():
  o = m(x)
  return o

# warmup
for _ in range(10):
  diff = bench(run)
  print(diff)
print('warmup done')

# test
total = 0
for _ in range(1000):
  diff = bench(run)
  total += diff

print(f'average inference time (s) = {total / 1000}')

It seems to be running. I increased the iteration count of the last for loop and looked at nvidia-smi...

$ nvidia-smi      
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.67       Driver Version: 460.67       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 3090    Off  | 00000000:01:00.0  On |                  N/A |
| 53%   51C    P2   345W / 350W |   8028MiB / 24234MiB |     99%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1156      G   /usr/lib/xorg/Xorg                102MiB |
|    0   N/A  N/A      1740      G   /usr/lib/xorg/Xorg                338MiB |
|    0   N/A  N/A      2164      G   /usr/bin/gnome-shell               74MiB |
|    0   N/A  N/A      6451      G   …AAAAAAAAA= --shared-files       158MiB |
|    0   N/A  N/A     43662      C   …ects/test/venv/bin/python      7335MiB |
+-----------------------------------------------------------------------------+

The GPU is running flat out at 99% utilization. It works!
