This is a brief record of the pitfalls and problems we ran into while configuring GPU environments.
Useful toolkit
# check the CUDA version
$ cat /usr/local/cuda/version.txt
# check the cuDNN version
$ cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
# check the GPU driver version
$ cat /proc/driver/nvidia/version
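The same three checks can be scripted. The following is a minimal Python sketch, assuming the default paths used in the commands above; the helper names `parse_cudnn_version` and `read_text_if_exists` are hypothetical and exist only for this illustration. It prints "not found" when a file or the version macros are missing.

```python
import re
from pathlib import Path
from typing import Optional


def parse_cudnn_version(header_text: str) -> Optional[str]:
    """Extract 'major.minor.patch' from the text of a cuDNN header."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"^\s*#define\s+{name}\s+(\d+)", header_text, re.MULTILINE)
        if match is None:
            return None  # macro not defined in this header
        parts.append(match.group(1))
    return ".".join(parts)


def read_text_if_exists(path: str) -> Optional[str]:
    """Return the file contents, or None if the file does not exist."""
    p = Path(path)
    return p.read_text(errors="replace") if p.is_file() else None


if __name__ == "__main__":
    # Paths are the ones from the commands above; adjust them for your install.
    for label, path in [
        ("CUDA", "/usr/local/cuda/version.txt"),
        ("Driver", "/proc/driver/nvidia/version"),
    ]:
        text = read_text_if_exists(path)
        print(f"{label}: {text.strip() if text else 'not found'}")

    header = read_text_if_exists("/usr/local/cuda/include/cudnn.h")
    cudnn = parse_cudnn_version(header) if header else None
    print(f"cuDNN: {cudnn or 'not found'}")
```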
Ubuntu
Reinstall GPU driver
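# Remove old drivers (quote the pattern so the shell does not expand it)
$ sudo apt-get purge 'nvidia-*'
# Add the graphics drivers PPA and refresh the package index
$ sudo add-apt-repository ppa:graphics-drivers/ppa
$ sudo apt-get update
# If unsure about the driver version, run `ubuntu-drivers devices` to find out
$ sudo apt-get install nvidia-<driver-version>
# Reboot, then confirm that the driver is loaded
$ sudo reboot
$ nvidia-smi
# Open NVIDIA settings -> PRIME Profile
$ nvidia-settings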
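After the reboot, the `nvidia-smi` check can also be done from a script. This is a rough sketch, assuming `nvidia-smi` is on the `PATH`; the function name `driver_version` is hypothetical. It returns `None` instead of raising when the tool is missing or fails, which is typically what happens while the driver is not yet loaded.

```python
import shutil
import subprocess
from typing import Optional


def driver_version(nvidia_smi: str = "nvidia-smi") -> Optional[str]:
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    if shutil.which(nvidia_smi) is None:
        return None  # binary not on PATH: driver tools are not installed
    try:
        result = subprocess.run(
            [nvidia_smi, "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=30,
            check=True,
        )
    except (subprocess.SubprocessError, OSError):
        return None  # nvidia-smi exists but could not talk to the driver
    lines = result.stdout.strip().splitlines()
    # One line is printed per GPU; they all share the same driver version.
    return lines[0].strip() if lines else None


if __name__ == "__main__":
    print("Driver version:", driver_version() or "not available")
```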
CUDA
Ubuntu
# Download & install CUDA 10.1
$ wget https://developer.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda-repo-ubuntu1604-10-1-local-10.1.105-418.39_1.0-1_amd64.deb
JAX
$ pip install --upgrade pip
# Installs the wheel compatible with CUDA 11 and cuDNN 8.2 or newer.
$ pip install --upgrade "jax[cuda]" -f https://storage.googleapis.com/jax-releases/jax_releases.html
# Note: wheels are only available on Linux.
Check that the GPU works:
>>> import jax
>>> jax.devices()
# or
>>> from jax.lib import xla_bridge
>>> print(xla_bridge.get_backend().platform)
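For a scripted version of this check, the public `jax.default_backend()` call reports the same platform name without going through `xla_bridge`. The sketch below is only an illustration; the wrapper `jax_backend` is a hypothetical helper, and it returns `None` when JAX is not installed so that it can run on any machine.

```python
from typing import Optional


def jax_backend() -> Optional[str]:
    """Return JAX's default backend name, or None if JAX is not installed."""
    try:
        import jax
    except ImportError:
        return None
    # Typically "cpu", "gpu", or "tpu".
    return jax.default_backend()


if __name__ == "__main__":
    backend = jax_backend()
    if backend is None:
        print("JAX is not installed")
    elif backend == "cpu":
        print("JAX is using the CPU; the CUDA wheel or the driver may be missing")
    else:
        print(f"JAX is using: {backend}")
```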
TensorFlow
# Install
# This command pulls in all required packages, including compatible CUDA and cuDNN.
# The version pin in brackets is optional, e.g. tensorflow-gpu=1.15.
$ conda create --name tf_gpu tensorflow-gpu[=1.15]
# Check if tensorflow-gpu works
>>> import tensorflow as tf
# v2 only: sessions need graph mode, so turn off eager execution first
>>> tf.compat.v1.disable_eager_execution()
# Build a graph.
>>> a = tf.constant(5.0)
>>> b = tf.constant(6.0)
>>> c = a * b
# Launch the graph in a session.
>>> sess = tf.compat.v1.Session()  # v2
>>> sess = tf.Session()  # v1
# Evaluate the tensor `c`.
>>> print(sess.run(c))
# Or, only for v1: log the device each op is placed on
>>> sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
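On TensorFlow 2.x there is a shorter check that does not need a session. The sketch below assumes TensorFlow 2.1 or newer, where `tf.config.list_physical_devices` is available; it does not apply to the 1.15 install shown above. The wrapper `tf_gpu_names` is a hypothetical helper that returns `None` when TensorFlow is not installed.

```python
from typing import List, Optional


def tf_gpu_names() -> Optional[List[str]]:
    """Return the names of GPUs visible to TensorFlow, or None if TF is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # Assumes TensorFlow 2.1+; an empty list means no GPU was detected.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = tf_gpu_names()
    if gpus is None:
        print("TensorFlow is not installed")
    elif not gpus:
        print("TensorFlow is installed but sees no GPU")
    else:
        print("GPUs visible to TensorFlow:", ", ".join(gpus))
```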