Last modified May 18, 2020

Preparing a Kubernetes cluster for the use of GPUs

In order to use GPU instances running Flatcar/CoreOS, we need to follow these steps to install and configure the right libraries and drivers on the host machines.

Installing

To install the chart locally:

$ helm install helm/kubernetes-gpu-app

To provide custom values:

$ helm install helm/kubernetes-gpu-app -f values.yaml
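
As a rough sketch, a custom values.yaml could look like the following. The keys shown here are illustrative assumptions, not the chart's actual schema; check the chart's bundled values.yaml for the options it really exposes.

# values.yaml (hypothetical keys, for illustration only)
image:
  tag: "390.116"      # driver version to build; assumed to be configurable
nodeSelector:
  gpu: "true"         # assumed label to restrict the daemon sets to GPU worker nodes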

Nvidia GPU drivers

The idea here is to run a pod on every worker node that downloads, compiles, and installs the Nvidia drivers on Flatcar/CoreOS. It is a fork of the Shelman Group installer, extended with the pod security policy needed to work in hardened clusters.

It creates a daemon set which runs a series of commands on each node. At the end, it prints a success message if no problems were found.
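
For orientation, a minimal sketch of such a daemon set is shown below. The image reference and mount path are assumptions for illustration; the real manifest ships with the chart. The installer container needs privileged access to the host in order to load the compiled kernel module.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nvidia-driver-installer
spec:
  selector:
    matchLabels:
      app: nvidia-driver-installer
  template:
    metadata:
      labels:
        app: nvidia-driver-installer
    spec:
      containers:
      - name: nvidia-driver-installer
        image: nvidia-driver-installer:example   # hypothetical image reference
        securityContext:
          privileged: true                       # required to install the kernel module on the host
        volumeMounts:
        - name: nvidia-install-dir
          mountPath: /opt/nvidia                 # where the compiled libraries and binaries end up
      volumes:
      - name: nvidia-install-dir
        hostPath:
          path: /opt/nvidia

You can follow the installer logs on one of the nodes to confirm it finished: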

$ kubectl logs -f $(kubectl get pod -l app="nvidia-driver-installer" --no-headers | head -n 1 | awk '{print $1}') -c nvidia-driver-installer
...
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.116                 Driver Version: 390.116                  |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   47C    P0    43W / 300W |      0MiB / 16160MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
Finished installing the drivers.
Updating host's ld cache

Installing the device plugin

Instead of the official Nvidia device plugin, which requires a custom Docker runtime, we chose to use Google’s approach. In short, this device plugin expects that all the Nvidia libraries needed by the containers are present under a single directory on the host (/opt/nvidia).

As before, we deploy a daemon set in the cluster which mounts the /dev host path and the /var/lib/kubelet/device-plugin path to make the GPU device available to pods that request it. Note that we pass a flag to the container to indicate where the Nvidia libraries and binaries have been installed on the host.
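
A sketch of the relevant parts of that daemon set follows. The image reference and the flag name are assumptions for illustration; the chart carries the real values.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nvidia-gpu-device-plugin
spec:
  selector:
    matchLabels:
      k8s-app: nvidia-gpu-device-plugin
  template:
    metadata:
      labels:
        k8s-app: nvidia-gpu-device-plugin
    spec:
      containers:
      - name: nvidia-gpu-device-plugin
        image: nvidia-gpu-device-plugin:example   # hypothetical image reference
        args:
        - -host-path=/opt/nvidia                  # assumed flag pointing at the host's Nvidia install directory
        volumeMounts:
        - name: dev
          mountPath: /dev                         # access to the GPU device nodes
        - name: device-plugin
          mountPath: /device-plugin               # socket directory the kubelet watches
      volumes:
      - name: dev
        hostPath:
          path: /dev
      - name: device-plugin
        hostPath:
          path: /var/lib/kubelet/device-plugin

To check the device plugin logs: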

$ kubectl logs -f $(kubectl get pod -l k8s-app="nvidia-gpu-device-plugin" --no-headers | head -n 1 | awk '{print $1}')

When everything has gone as expected, you should see logs like:

device-plugin started
Found Nvidia GPU "nvidia0"
will use alpha API
starting device-plugin server at: /device-plugin/nvidiaGPU-1544782516.sock
device-plugin server started serving
falling back to v1beta1 API
device-plugin registered with the kubelet
device-plugin: ListAndWatch start
ListAndWatch: send devices &ListAndWatchResponse{Devices:[&Device{ID:nvidia0,Health:Healthy,}],}
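
At this point the GPU should also be advertised as an allocatable resource on the node. Assuming the plugin registers it under the usual nvidia.com/gpu resource name, you can check with:

$ kubectl describe node <gpu-node-name> | grep nvidia.com/gpu

The GPU should show up under the node's Capacity and Allocatable sections.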

Considerations

Depending on how your application makes use of the Nvidia driver, you may need to mount the proper volume:

  volumes:
  - name: nvidia-libs
    hostPath:
      path: /opt/nvidia/lib64
      type: Directory
  containers:
    ...
    volumeMounts:
    - mountPath: /opt/nvidia/lib64
      name: nvidia-libs

Extend the shared library path to include the Nvidia directory:

    - name: LD_LIBRARY_PATH
      value: $LD_LIBRARY_PATH:/opt/nvidia/lib64

Take into account that the default Pod Security Policy in tenant clusters, restricted, does not allow mounting host paths. You will need to extend your policy to allow the hostPath volume type:

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: your-psp-name
spec:
  ...
  volumes:
  - hostPath

Verification

To run a test, we are going to use a CUDA vector add example. It performs a simple vector addition on the GPU exposed by the device plugin installed before.

$ kubectl apply -f https://raw.githubusercontent.com/giantswarm/kubernetes-gpu/master/manifests/test-pod.yaml
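
The applied manifest boils down to a pod that requests one GPU through the device plugin's extended resource and mounts the Nvidia libraries. The following is a rough sketch, with the image reference and resource name as assumptions; the authoritative manifest is the one applied above.

apiVersion: v1
kind: Pod
metadata:
  name: cuda-vector-add
spec:
  restartPolicy: OnFailure
  containers:
  - name: cuda-vector-add
    image: cuda-vector-add:example   # hypothetical image reference
    resources:
      limits:
        nvidia.com/gpu: 1            # assumed resource name registered by the device plugin
    volumeMounts:
    - mountPath: /opt/nvidia/lib64
      name: nvidia-libs
  volumes:
  - name: nvidia-libs
    hostPath:
      path: /opt/nvidia/lib64
      type: Directory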

If we inspect the logs, we should see something like:

$ kubectl logs -f cuda-vector-add
Begin
Allocating device memory on host..
Copying to device..
Doing GPU Vector add
Doing CPU Vector add
10000000 0.000007 0.046845

Now you have successfully installed everything needed to run GPU workloads on your Kubernetes cluster.


You can collaborate on this recipe on GitHub.