If you’ve gone down the road of building your own machine for “deep learning”, you may also have some sort of Jupyter Notebook server running on it, and you may be able to access that server when you’re not on your local network. The Jupyter team has done a good job of documenting how to do this in a fairly secure way. But a standalone Jupyter Notebook server isn’t ideal if you want more than one person to be able to access your machine.
In my case, a couple users need access to a deep learning server and I like the idea of being able to quickly deploy elsewhere if needed. I don’t really need to go full K8s, so I decided to set up a JupyterHub server using Docker.
In order to be able to use JupyterHub in a Docker container and access your NVIDIA GPU, there are three high-level steps to complete:
- Install NVIDIA display drivers and CUDA onto your system
- Install Docker and NVIDIA Docker
- Build a deep learning container and tell JupyterHub to use it
I’ll quickly go through the first two before addressing some pain points in the last one.
Installing Display Drivers and CUDA
My machine is running Ubuntu 18.04, and unfortunately that means some compatibility issues with CUDA 9.2. CUDA 10.0 installs just fine, and it requires NVIDIA driver release 410. To install the NVIDIA display drivers:
```shell
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-396
```
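If you want to sanity-check the result before going further, a couple of commands will tell you what the system sees (run these on the host; the output obviously depends on your hardware):

```shell
# confirm the kernel module loaded and see which driver version is active
nvidia-smi

# list the driver packages Ubuntu recommends for your hardware
ubuntu-drivers devices
```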
Wait a second, why didn’t we install 410 just then? Well, at the time of writing, 410 isn’t…accessible via the command line (?!), due either to some logical reason on NVIDIA’s part or to my ignorance of the `apt` package manager. However, you can install 410 by loading the Software & Updates program, clicking on Additional Drivers, and then selecting the 410 metapackage driver.
After that, it’s easy to follow the CUDA Installation Guide. Might as well install cuDNN while you’re at it.
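Once CUDA is installed, it’s worth verifying that the toolkit actually reports the release you expect (the path below assumes the default CUDA install location):

```shell
# the compiler should report release 10.0
nvcc --version

# the default install also drops a version file here
cat /usr/local/cuda/version.txt
```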
Install Docker and NVIDIA Docker
This part is straightforward. Follow the official Docker install directions and then follow the directions to install NVIDIA Docker. You might consider adding yourself to the `docker` group as well.
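As a quick smoke test that the whole chain works, add yourself to the `docker` group and run `nvidia-smi` inside a CUDA container. The image tag below is illustrative, and `--runtime=nvidia` is the nvidia-docker2-era invocation:

```shell
# run docker without sudo (log out and back in for this to take effect)
sudo usermod -aG docker $USER

# the GPU should be visible from inside the container
docker run --runtime=nvidia --rm nvidia/cuda:10.0-base nvidia-smi
```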
Build a Deep Learning Container and Tell JupyterHub to Use It
Here’s where it gets complicated. At this point, we are able to access our GPU from within a Docker container (cool!). Using JupyterHub, it’s easy to specify which container you’d like it to launch when a user logs in, but that container has a few competing requirements:
- It must have Jupyter/JupyterHub installed, and must have a suitable entrypoint in place
- The container itself must have the same version of CUDA installed as what’s on the host
- The container must also have any libraries/software you require
That doesn’t sound that difficult, but it rules out cool projects like Deepo and Darknet and complicates the usage of the NVIDIA Cloud containers.
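The CUDA-version requirement is the easiest one to get wrong, so here’s a minimal sketch of the check I do in my head: compare the host toolkit’s major.minor version against the version prefix of an `nvidia/cuda` base-image tag. The helper names are my own, not part of any tool:

```python
def cuda_major_minor(version: str) -> tuple:
    """Parse a CUDA version string like '10.0.130' into (10, 0)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def base_image_matches_host(host_cuda: str, image_tag: str) -> bool:
    """Check that an nvidia/cuda image tag (e.g. '10.0-cudnn7-devel-ubuntu18.04')
    carries the same CUDA major.minor as the host toolkit."""
    tag_version = image_tag.split("-")[0]
    return cuda_major_minor(host_cuda) == cuda_major_minor(tag_version)


print(base_image_matches_host("10.0.130", "10.0-cudnn7-devel-ubuntu18.04"))  # True
print(base_image_matches_host("9.2.148", "10.0-base-ubuntu18.04"))           # False
```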
In the middle of trying to solve this problem, I stumbled on a project called GPU-Jupyterhub. It appears to be a fork of `jupyterhub-deploy-docker` with a user container modified to inherit from the NVIDIA CUDA container images on Docker Hub. Basically, the author of GPU-Jupyterhub did the work of building a container with Jupyter/JupyterHub and lots of other goodies (TensorFlow, TensorBoard, Keras, PyTorch, and others) installed.
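To make that inheritance concrete, a stripped-down user image along those lines looks roughly like this; the base-image tag and package list are illustrative, not the exact contents of GPU-Jupyterhub:

```dockerfile
# Illustrative GPU-enabled single-user image; tags and packages are examples
FROM nvidia/cuda:10.0-cudnn7-runtime-ubuntu18.04

RUN apt-get update && \
    apt-get install -y --no-install-recommends python3 python3-pip && \
    rm -rf /var/lib/apt/lists/*

# jupyterhub here should be pinned to the same version the hub runs
RUN pip3 install jupyterhub notebook

# JupyterHub launches single-user servers through this command
CMD ["jupyterhub-singleuser"]
```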
Great! I forked that repository, removed some medical imaging libraries I won’t use, tried to clean up dependencies, made sure user containers are spawned in the NVIDIA Docker runtime, added a place for users to access common data, and tried to make the install process a little easier. Also, due to my specific configuration, using GitHub as an authenticator isn’t an option, so I’m using the PAM authenticator. Here’s how to set up JupyterHub:
- Clone my GPU JupyterHub repo
- Make a userlist file
- Edit `jupyterhub_config.py` as needed
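If you haven’t seen one before, the userlist format in `jupyterhub-deploy-docker`-style setups is, to the best of my knowledge, one username per line, with `admin` appended for admin users:

```
alice admin
bob
```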
- Add desired users and passwords to `Dockerfile.jupyterhub` (example here); this is necessary to use PAM user/pass authentication
- Generate a value for `POSTGRES_PASSWORD` in the file
- Generate an SSL certificate (through Let’s Encrypt if you’ve got some DNS A records pointing to your deep learning machine, or by signing your own) and place it in a directory called `.ssl` at the root level of the repository
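For the self-signed route, something like the following works; the file names and the CN are placeholders, so match them to whatever your `jupyterhub_config.py` expects:

```shell
mkdir -p .ssl

# self-signed cert, valid for one year; replace the CN with your hostname
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
    -keyout .ssl/jupyterhub.key \
    -out .ssl/jupyterhub.crt \
    -subj "/CN=my-deep-learning-box.example.com"
```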
- Create Docker networks and volumes for JupyterHub (examples in the repo)
- Edit the last volume in the `hub` service in the `docker-compose.yml` file to point to a directory on the host with data you’d like users to share
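For reference, the shared-data volume I mean is the kind of bind mount below; the host path and container path are examples, not the repo’s exact values:

```yaml
services:
  hub:
    volumes:
      # ...existing volumes stay as they are...
      # host directory on the left, path users will see on the right
      - /data/shared:/home/shared_data
```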
- Build the base notebook image while in its directory:

```shell
docker build -t base-notebook-gpu .
```
- Build the TensorFlow notebook image while in its directory:

```shell
docker build -t hub-deep-learning-notebook-gpu .
```
- Build and launch the JupyterHub system from the root of the repo:

```shell
docker-compose up --build
```
If everything went well (hahaha), you should be able to access your JupyterHub server at your machine’s address over HTTPS.
If there’s an install step I forgot (very likely), please let me know. If you’d like to take on cleaning up that main container, you’re my hero.
There are definitely lots of improvements yet to be made, and I know there are some weird things happening in the Dockerfiles, the conda environment, and the basic configuration. That said, this will get you up and running.