This is a technical article about how to get CUDA passthrough working in a particular Linux container implementation, LXC. We will demonstrate GPU passthrough for LXC, with a short CUDA example program.

Linux containers can be used for many things. We are going to set up something which is like a light-weight virtual machine. This can then be used to help with clean builds, testing, or to help with deployment. Linux containers can also support limiting resource access, resource prioritization, resource accounting and process freezing.

Container implementation

Linux containers are built on two features in the Linux kernel, cgroups, https://en.wikipedia.org/wiki/Cgroups, and namespace isolation. There are several projects building on these kernel features in order to make them a bit easier to use. We are going to use one called LXC. You can read about it here: https://linuxcontainers.org/.

Howto

Here is how to get CUDA working in a container on Ubuntu Server 15.04. First install Ubuntu server 15.04 in the usual way.

Update, and install LXC

We want to use the latest LXC:

sudo add-apt-repository ppa:ubuntu-lxc/lxc-stable

Then we can update the system like this:

sudo apt-get update
sudo apt-get upgrade
sudo apt-get install lxc

After doing this, it is probably a good idea to reboot, to avoid the possibility of having issues connected to the systemd upgrade bug mentioned in the sidebar.

Install Nvidia CUDA driver

Next we’ll install the Nvidia driver on the host operating system. We will install from the Nvidia driver .run installer.

You will probably have an issue where the Nouveau kernel module has been loaded by Ubuntu. We don’t want this because it conflicts with the Nvidia driver kernel module.

Let’s fix this issue. Create this file, ‘/etc/modprobe.d/nvidia-installer-disable-nouveau.conf’, with these contents:

blacklist nouveau
options nouveau modeset=0

Then we should reboot so that we are running without the Nouveau module loaded.

Here is where you can get the driver from Nvidia: http://www.nvidia.com/object/unix.html.

I used Linux x86_64/AMD64/EM64T – Latest Long Lived Branch version: 352.21, which has the filename ‘NVIDIA-Linux-x86_64-352.21.run’. This driver is compatible with CUDA 7.0.

In order to install the driver from source we’ll need gcc and make.

sudo apt-get install gcc make

Then install it:

sudo sh ./NVIDIA-Linux-x86_64-352.21.run

We don’t care about the Xorg stuff on a server, when the installer asks about it just ignore it or tell the installer to do nothing.

You can check that CUDA is working on the host machine at this point, by installing the CUDA SDK and compiling and running a simple CUDA program. There is an example program at the bottom of this post. There is also a precompiled exe linked at the bottom of the post which might work on your system and you can avoid having to install the CUDA SDK on the host at this time.

Prepare for unprivileged containers

We are going to run an unprivileged container. This means our container will be created and run under our normal user and not under the root user. We need to do a little manual set up to make this work:

edit /etc/lxc/lxc-usernet and add the line:

your-username veth lxcbr0 10

Replace your-username with the user you are using. This is to support networking to the container.

Do the following:

mkdir -p ~/.config/lxc
cp /etc/lxc/default.conf ~/.config/lxc/default.conf

and add these lines to ‘~/.config/lxc/default.conf’:

lxc.id_map = u 0 100000 65536
lxc.id_map = g 0 100000 65536

These should match the numbers in /etc/subuid, /etc/subgid for your user.

Create the container

lxc-create -t download -n mycontainer -- --dist ubuntu --release vivid --arch amd64

Add the Nvidia devices to the container, edit the file ‘~/.local/share/lxc/mycontainer/config’ and add these lines to the bottom:

lxc.mount.entry = /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry = /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry = /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file

Setup the container for access via ssh, and with a normal user which can use sudo:

lxc-start -n mycontainer
lxc-attach -n mycontainer

After running lxc-attach, the console you are on is a root prompt in the container. Run

apt-get install openssh-server
adduser myuser
usermod -a -G sudo myuser

Use ctrl-d to exit the container back to the host system.

We can get the IP address of the container so we can log in with ssh:

lxc-info -n mycontainer

Use the IP address from the output of lxc-info in the two following commands.

First copy the Nvidia driver installer into the container:

scp NVIDIA-Linux-x86_64-352.21.run [email protected]:

Log into the container:

ssh 10.0.3.333 -l myuser

The basic driver setup includes adding a kernel module, and adding a bunch of .so files and a few extra bits. Inside the container we don’t want to try to add the kernel module since we are using the host kernel with the module already loaded. We can install without the kernel module like this:

sudo sh ./NVIDIA-Linux-x86_64-352.21.run --no-kernel-module

(We don’t need to install g++ or make in the container to run this because we are not installing the kernel module.)

At this point, if you have a simple CUDA test exe, you can scp it into the container and check it runs OK.

Now you can install the CUDA SDK using the .run from Nvidia inside the container, or copy your CUDA binaries into the container and you are good to go. If you install the CUDA SDK from the .run, make sure you don’t try to install/upgrade/replace the Nvidia driver during the CUDA SDK installation. You can use something like this:

# make sure we have the prerequisites to install the SDK, and g++ so
# we can use the SDK
sudo apt-get update
sudo apt-get install perl-modules g++
# install the sdk without the driver
./cuda_7.0.28_linux.run -toolkit -toolkitpath=~/cuda-7.0 -silent -override

You probably don’t want to do normal development in a container, and you definitely want to avoid leaving the CUDA SDK or g++, make, etc. installed on either the host or container for production. One good use for installing the CUDA SDK into a container is to create a convenient way to do repeatable production builds of your CUDA exes.

I have a container already

If you already have an LXC container, you can do the following:

  • make sure you have Nvidia driver installed in the host system, and the /dev files have the right permissions
  • edit the container config file to add entries for the Nvidia devices
  • restart the container to make the Nvidia devices appear in it
  • install the Nvidia driver in the container without the kernel module
  • install the CUDA SDK without the driver or use your CUDA binaries in the container

This should work for privileged containers also.

For non-LXC containers, you will need to figure out how to make the Nvidia device files on the host available in the container, and to install the Nvidia drivers in the host and install them in the container without the kernel module, or just expose these files from the host.

Notes

Maybe you want to try running the container on something other than Ubuntu 15.04.

You can install the latest stable LXC release from source on your distribution of choice, install Nvidia driver on the host system, then create a container as above. On different systems, the big difference is likely to be in the networking setup for the container. Also, on some systems you will have to add some entries to the /etc/subuid and /etc/subgid files.

One thing you have to be aware of is I think the Nvidia driver files (.so files etc.) have to match the kernel module version, so you need to make sure the versions are exactly the same in the host and the container. This might be tricky e.g. if you install Nvidia driver on the host using the host packaging system, then try to run a different Linux distribution in the container. The CUDA SDK version doesn’t need to match the Nvidia driver version, it just needs to be a compatible version. Running a CUDA program will tell you if the Nvidia driver you have is compatible with your CUDA exe or not.

The other issues are the possible problems with Nvidia permissions on the host (easily solved), and the device/permissions issues mentioned in the sidebar above.

CUDA example test executable

Here is a small CUDA test program which can be used to check if CUDA is working on a system. The expected output is:

16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46

If there is a problem, you will almost certainly get an error message, so you shouldn’t need to go through the output to make sure the numbers match!

Paste the below into hellocuda.cu and then compile with

nvcc hellocuda.cu -o hellocuda

You have to have the CUDA SDK installed to compile this.

#include <iostream>

using namespace std;

void _cudaCheck(cudaError_t err, const char *file, int line) {
   if (err != cudaSuccess) {
       cerr << "cuda error: " << cudaGetErrorString(err)
            << file << line << endl;
       exit(-1);
   }
}
#define cudaCheck(ans) { _cudaCheck((ans), __FILE__, __LINE__); }


__global__  void add(int *a, int *b, int *c)
{
   c[threadIdx.x] = a[threadIdx.x] + b[threadIdx.x];
}


int main()
{
    const int N = 16;

    int a[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
    int b[N] = {16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31};

    const size_t sz = N * sizeof(int);
    int *da;
    cudaCheck(cudaMalloc(&da, sz));
    int *db;
    cudaCheck(cudaMalloc(&db, sz));
    int *dc;
    cudaCheck(cudaMalloc(&dc, sz));

    cudaCheck(cudaMemcpy(da, a, sz, cudaMemcpyHostToDevice));
    cudaCheck(cudaMemcpy(db, b, sz, cudaMemcpyHostToDevice));

    add<<<1, N>>>(da, db, dc);
    cudaCheck(cudaGetLastError());

    int c[N];
    cudaCheck(cudaMemcpy(c, dc, sz, cudaMemcpyDeviceToHost));

    cudaCheck(cudaFree(da));
    cudaCheck(cudaFree(db));
    cudaCheck(cudaFree(dc));

    for (unsigned int i = 0 ; i < N; ++i) {
        cout << c[i] << " ";
    }
    cout << endl;
    return 0;
}