All computation actually happens on the head node #737

Closed · 13 tasks done · wrpscott opened this issue Jun 19, 2018 · 5 comments

wrpscott (Contributor) commented Jun 19, 2018

Presently, a docker run command, whether executed on a compute node or the head node, results in the container actually running on the head node (look for a docker-containerd-shim process there), because it is launched by the single dockerd running on the head node.
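A quick way to confirm this (a sketch; the alpine image is just an example, and node 0 matches the bpsh usage later in this thread):

  # Launch a container from compute node 0...
  bpsh 0 docker run --rm alpine sleep 60 &
  # ...then, on the head node, the shim process shows up locally:
  ps -ef | grep docker-containerd-shim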

  • experiment with local registry in VirtualBox
  • test local registry on Bulbasaur (registry worked, running didn't)
  • experiment with Singularity
  • create Singularity image files in CodeResources folder instead of creating a docker image
  • add removal plan for containers and their families
  • check that a container's permissions are less than its family's permissions
  • run singularity check foo.simg before accepting an image file, based on ImageField.
  • replace docker launch with Singularity launch (@wrpscott already started)
  • dockerlib.SingularityDockerHandler.docker_is_alive() assumes an existing docker and local repo installation.
  • support Docker and Singularity during transition
  • add management command to convert docker images to singularity containers
  • convert all the current docker containers to Singularity
  • set Singularity container when adding or revising a method
wrpscott added the bug label Jun 19, 2018
wrpscott (Contributor, Author) commented:

Running docker jobs on a compute node requires dockerd to run there as well. Running dockerd on a compute node is not straightforward (missing drivers, problems with iptables, etc.). I did finally succeed in starting dockerd on a compute node and defining a docker swarm with the head node and the single compute node. However, trying to run a job on the compute node currently crashes it (it needs a hardware reset).
Docker swarm essentially reproduces a lot of functionality we already have in slurm.
Since we experienced serious stability issues with dockerd on our production machine today (requiring a reboot), it might be worth finding a way to avoid dockerd altogether.
I came across the Singularity project, which can run docker images as a subprocess of the current user's shell (and as that user, not as root). It does not need the docker runtime, i.e. dockerd.

This system could give us the advantages of docker (traceability of the software in pipelines) while being more robust (no dockerd), more secure (no rights escalation, so no need for docker_wrap), and more efficient (sandbox directories are mounted directly, with no copying into docker volumes), and it works nicely with slurm (Singularity runs a container on the machine where it is invoked).
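For example, Singularity can run a docker image straight from a registry, with no dockerd involved, and the contained process runs as the invoking user (a quick sketch; the alpine image is just an example):

  # Pulls the docker layers and runs the command as the current user,
  # as an ordinary child process of this shell.
  singularity exec docker://alpine:3.8 id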

donkirkby (Member) commented:

The current plan is to run docker on each compute node and distribute the images through a local registry. Each compute node will have a local hard drive that stores docker images and data volumes. This tutorial shows how to set up a local registry.

docker service create --name registry --publish=5000:5000 \
 --constraint=node.role==manager \
 --mount=type=bind,src=/home/docker,dst=/certs \
 -e REGISTRY_HTTP_ADDR=0.0.0.0:5000 \
 -e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/registry.crt \
 -e REGISTRY_HTTP_TLS_KEY=/certs/registry.key \
 registry:latest

The documentation is also helpful.

If we run a registry on the head node as kive-int.cfenet.ubc.ca:5000, and push each image to the local registry as part of the build process, then docker can fetch images from the local registry when we launch a job. I think it could be as simple as adding the registry address before the image names:

docker_wrap.py --sudo --inputs /path/to/sandbox/names.csv \
  --output /path/to/sandbox/ -- \
  kive-int.cfenet.ubc.ca:5000/my-image:v1.0 sandbox1 \
  my_command /mnt/input/names.csv /mnt/output/greetings.csv
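Pushing each image to that registry as part of the build could then be just a tag that includes the registry address (a sketch; my-image:v1.0 is the example name from above):

  docker build -t kive-int.cfenet.ubc.ca:5000/my-image:v1.0 .
  docker push kive-int.cfenet.ubc.ca:5000/my-image:v1.0
  # Any node configured to trust the registry can then pull it by the same name:
  docker pull kive-int.cfenet.ubc.ca:5000/my-image:v1.0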

donkirkby (Member) commented Jul 12, 2018

Steps to configure a local hard drive for the compute nodes:

  1. List the current partitions:

     bpsh 0 parted -l
     bpsh 0 blkid
     bpsh 0 lsblk
    
  2. Turn off swap so you can resize the swap partition: sudo bpsh 0 swapoff /dev/sda1

  3. Unmount the local partition if you want to resize it: sudo bpsh 0 umount /dev/sda2

  4. Remove the existing partitions. This step assumes you don't need to keep any data on the local hard drive; back it up if you do. Repeat this step for each partition number given by parted -l.

     sudo bpsh 0 /usr/sbin/parted /dev/sda rm 1
    
  5. Create a swap partition, considering the Red Hat guidance for swap size.

     sudo bpsh 0 /usr/sbin/parted /dev/sda mkpart primary 'linux-swap(v1)' 1 4GB
     sudo bpsh 0 /usr/sbin/mkswap /dev/sda1
     sudo bpsh 0 /usr/sbin/swapon /dev/sda1
    
  6. Create an ext4 partition on the rest of the disk.

     sudo bpsh 0 /usr/sbin/parted /dev/sda mkpart -- primary ext4 4GB -1
     sudo bpsh 0 /usr/sbin/mkfs -t ext4 /dev/sda2
     sudo bpsh 0 mkdir -p /media/local
     sudo bpsh 0 mount /dev/sda2 /media/local
    
  7. Configure the two new partitions in /etc/beowulf/fstab. The nonfatal option lets the entry be ignored on compute nodes that don't have that partition.

     /dev/sda1		swap            swap    nonfatal        0 0
     /dev/sda2		/media/local		ext4	nonfatal		0 0
    

I sometimes had to reboot before running mkswap or mkfs, because the new partition didn't show up in /dev.
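Running partprobe after changing the partition table might avoid that reboot by asking the kernel to re-read it (untested here; assumes partprobe is installed at the same path as the other parted tools):

  sudo bpsh 0 /usr/sbin/partprobe /dev/sda
  bpsh 0 lsblk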

donkirkby (Member) commented:

Even with the local hard drive configured, docker still crashes the compute node. We opened a ticket with Penguin, and they suggested Singularity instead. If we can use a single image file for several container instances, then it seems like a good alternative to docker.

donkirkby (Member) commented:

Singularity seems to work fine on the compute nodes, and I can run several processes at the same time using a single image file.
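For example, several runs can share one image file at the same time, whether started by hand or through slurm (a sketch; the image and data paths are made up):

  # Two concurrent processes from the same .simg, no daemon involved:
  singularity exec /media/local/my-image.simg my_command /data/run1/names.csv /data/run1/greetings.csv &
  singularity exec /media/local/my-image.simg my_command /data/run2/names.csv /data/run2/greetings.csv &
  wait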

Installing Singularity on the compute nodes needed an fstab entry, as well as a script in /etc/beowulf/init.d to create an empty folder at /var/singularity/mnt/final.
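The init script could be as small as this (a hypothetical sketch; it assumes the node number is passed to the script as its first argument, the same number used with bpsh elsewhere in this thread):

  #!/bin/sh
  # /etc/beowulf/init.d/singularity (sketch): create the mount point that
  # Singularity expects on a compute node as it comes up.
  NODE="$1"
  bpsh "$NODE" mkdir -p /var/singularity/mnt/final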

donkirkby added this to the 0.12 simpler pipeline setup milestone Jul 26, 2018
donkirkby added a commit that referenced this issue Aug 1, 2018
donkirkby added a commit that referenced this issue Aug 2, 2018
donkirkby added a commit to cfe-lab/kive-default-docker that referenced this issue Aug 7, 2018
donkirkby self-assigned this Aug 8, 2018
donkirkby added a commit to cfe-lab/MiCall that referenced this issue Aug 9, 2018
donkirkby added a commit that referenced this issue Aug 9, 2018
Support both Docker and Singularity while we test conversions.
Remove methods when removing a singularity container.
donkirkby added a commit that referenced this issue Aug 10, 2018
Also fix Singularity installation on Travis.
donkirkby added a commit that referenced this issue Sep 20, 2018
Works around a problem with launching Debian Singularity images on a compute node running a CentOS host system.
Also add -n to all sudo calls, to avoid getting blocked by password prompts.
donkirkby added a commit that referenced this issue Sep 25, 2018
For now, Debian Singularity images are not supported.
Convert dump_pipeline to use environment variables for configuration.