All computation actually happens on the head node #737
Comments
Running docker jobs on a compute node requires dockerd to run there as well. Running dockerd on a compute node is not straightforward (missing drivers, problems with iptables, etc.). I did finally succeed in starting dockerd on a compute node and defining a docker swarm with the head node and the single compute node. However, trying to run a job on the compute node currently crashes it (it needs a hardware reset).

Singularity could give us the advantages of docker (traceability of software in pipelines) while being more robust (no dockerd), more secure (no rights escalation, so no need for docker_wrap), and more efficient (direct mounting of sandbox directories, with no copying into docker volumes), and it works nicely with slurm (Singularity runs a container on the machine it is invoked on).
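For reference, a minimal sketch of the swarm setup described above; the head node's address is a placeholder:

```bash
# On the head node: make this dockerd the swarm manager
# (192.168.1.1 stands in for the head node's real address)
docker swarm init --advertise-addr 192.168.1.1

# 'swarm init' prints a join command with a one-time token;
# run that command on the compute node to add it as a worker:
docker swarm join --token <token-from-init> 192.168.1.1:2377
```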
The current plan is to run docker on each compute node and distribute the images through a local registry. Each compute node will have a local hard drive that stores docker images and data volumes. This tutorial shows how to set up a local registry.
The registry documentation is also helpful. If we run a registry on the head node, the compute nodes can tag, push, and pull images through it.
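A minimal sketch of that setup, using the standard `registry:2` image; the host name `head` and the image name `my-pipeline` are placeholders:

```bash
# On the head node: start a registry container on port 5000
docker run -d -p 5000:5000 --restart=always --name registry registry:2

# Tag a local image so its name points at the registry, then push it
docker tag my-pipeline head:5000/my-pipeline
docker push head:5000/my-pipeline

# On a compute node: pull the image back out of the registry
docker pull head:5000/my-pipeline
```

Because this registry serves plain HTTP, each node's dockerd will likely need `head:5000` listed under `insecure-registries` in /etc/docker/daemon.json.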
Steps to configure a local hard drive for the compute nodes:
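A sketch of one way those steps can look, run as root; the device name /dev/sdb1, the mount point /data, and a dockerd new enough to support `data-root` are all assumptions:

```bash
# Format and mount the local drive, and make the mount survive reboots
mkfs.ext4 /dev/sdb1
mkdir -p /data
mount /dev/sdb1 /data
echo '/dev/sdb1 /data ext4 defaults 0 2' >> /etc/fstab

# Move dockerd's image and volume storage onto the local drive
mkdir -p /data/docker
cat > /etc/docker/daemon.json <<'EOF'
{ "data-root": "/data/docker" }
EOF
systemctl restart docker
```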
I sometimes had to reboot before running …
Even with the local hard drive configured, docker still crashes the compute node. We opened a ticket with Penguin, and they suggested Singularity instead. If we can use a single image file for several container instances, it seems like a good alternative to docker.
Singularity seems to work fine on the compute nodes, and I can run several processes at the same time using a single image file. Installing Singularity on the compute nodes needed an …
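As an illustration of that, a concurrency smoke test; `foo.simg` is a placeholder image name:

```bash
# Launch several container instances from the same image file in parallel
for i in 1 2 3 4; do
    singularity exec foo.simg sh -c "hostname; sleep 30" &
done
wait  # all four instances run concurrently off one .simg
```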
Support both Docker and Singularity while we test conversions. Remove methods when removing a Singularity container.
Also fix Singularity installation on Travis.
Works around a problem with launching Debian Singularity images on a compute node running a CentOS host system. Also adds -n to all sudo calls, to avoid getting blocked by password prompts.
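For reference, what the flag changes; the command here is illustrative:

```bash
# Without -n, sudo can hang indefinitely on a password prompt.
# With -n (non-interactive), it fails immediately instead, so jobs can't stall:
sudo -n singularity exec foo.simg true || echo "sudo would have prompted"
```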
For now, Debian Singularity images are not supported.

Convert dump_pipeline to use environment variables for configuration.
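A sketch of the environment-variable pattern; the variable names below are hypothetical, not necessarily the ones dump_pipeline actually reads:

```bash
# Hypothetical configuration variables picked up by the script at startup
export KIVE_SERVER=http://head:8000
export KIVE_USER=pipeline_dumper
python dump_pipeline.py
```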
Presently, a docker run command, whether executed on a compute node or the head node, results in the container actually running on the head node (look for a docker-containerd-shim process), because it is launched by the single dockerd running there.
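A quick way to see this, with `alpine` as a throwaway image:

```bash
# Start a test container from either node
docker run -d --name shim-test alpine sleep 300

# On the head node, a shim process backing the container shows up;
# running the same pgrep on the compute node finds nothing:
pgrep -af docker-containerd-shim

# Clean up
docker rm -f shim-test
```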
- Use the `CodeResources` folder instead of creating a docker image.
- Run `singularity check foo.simg` before accepting an image file, based on `ImageField`.