GPUs Basel 2018
Here we describe how to use the GPUs provided for the Basel Workshop 2018. We go through each step using a simple tutorial dataset and project as an example; the same steps apply to your own dataset/project.
The GPUs we use are located on sciCORE (https://scicore.unibas.ch), the high performance computing cluster of the University of Basel, which uses the SLURM queuing system. A queuing system coordinates access to the GPUs and is needed when many users share just a few GPUs.
The main idea is to create an alignment project locally, move it to the cluster on sciCORE and then run it there using a pre-installed Dynamo standalone version. To do that, we follow these steps:
On your local Matlab session with Dynamo loaded:
- Create the tutorial project:
dtutorial myParticles -p myProject -M 128
This creates a tutorial dataset with 128 particles in the directory myParticles and a tutorial alignment project myProject.
- Open the alignment project window:
dcp myProject
and under "computing environment" select "GPU (standalone)".
- Check and Unfold the project.
- Before moving the data to sciCORE we have to compress the project: in the dcp GUI go to Tools and then "create a tarball".
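Assuming the tarball is written to your current directory as myProject.tar (the name used when untarring further below), you can quickly confirm it exists before transferring it:
ls -lh myProject.tar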
On your local Linux terminal:
- Copy the project data (particles) to sciCORE:
rsync -avuP myParticles USERNAME@login.bc2.unibas.ch:/scicore/home/PATH/dynamo_projects
- Copy the previously created tar file of the project to sciCORE:
rsync -avuP myProject.tar USERNAME@login.bc2.unibas.ch:/scicore/home/PATH/dynamo_projects
- Login to your sciCORE account:
ssh -Y USERNAME@login.scicore.unibas.ch
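Once logged in, you can verify that both the particle folder and the project tarball arrived (using the same placeholder path as in the rsync commands above):
ls -lh /scicore/home/PATH/dynamo_projects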
While logged in to your sciCORE account:
- Activate dynamo:
source PATH/dynamo_activate_linux_shipped_MCR.sh
- Untar the Dynamo project:
dynamo dvuntar myProject.tar
- Create a blank SLURM submission script (text file) named submit_job.sh:
nano submit_job.sh
- Copy and adapt the following lines into the newly created script:
For using the K80 GPUs:
#!/bin/bash -l
#
#SBATCH --job-name=dTest
#SBATCH --qos=30min
#SBATCH --time=00:60:00
#SBATCH --mem=16G
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --partition=k80
#SBATCH --gres=gpu:1

module load CUDA/7.5.18
source PATH/dynamo_activate_linux_shipped_MCR.sh
cd PATH/dynamo_projects
echo "dvput myProject -gpu_identifier_set $CUDA_VISIBLE_DEVICES" > dcommands.sh
echo "dvunfold myProject" >> dcommands.sh
dynamo dcommands.sh
chmod u=rxw ./myProject.m
./myProject.m
For using the TitanX GPUs:
#!/bin/bash -l
#
#SBATCH --job-name=dTest
#SBATCH --qos=emgpu
#SBATCH --time=00:60:00
#SBATCH --mem=16G
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --partition=titanx
#SBATCH --gres=gpu:1

module load CUDA/7.5.18
source PATH/dynamo_activate_linux_shipped_MCR.sh
cd PATH/dynamo_projects
echo "dvput myProject -gpu_identifier_set $CUDA_VISIBLE_DEVICES" > dcommands.sh
echo "dvunfold myProject" >> dcommands.sh
dynamo dcommands.sh
chmod u=rxw ./myProject.m
./myProject.m
- Note that depending on your project you might have to adapt the project name and the time requested (time=) in the script.
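For orientation: the two echo lines in the script build a small Dynamo command file at run time. SLURM sets $CUDA_VISIBLE_DEVICES to the GPU it assigned to the job (typically 0 when a single GPU is requested), so dcommands.sh will contain something like the following, which points the project to that GPU and unfolds it again before execution:
dvput myProject -gpu_identifier_set 0
dvunfold myProject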
- You can now run your alignment project by submitting the previously created script to SLURM with:
sbatch submit_job.sh
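sbatch returns immediately and prints the job ID, for example:
Submitted batch job 45994509
This ID is used by scancel and also appears in the name of the output file (slurm-45994509.out, see below).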
- To check your status in the queue type:
squeue -u USERNAME
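squeue prints one line per job; the ST column shows PD while the job is waiting and R once it is running. A sketch of the default output (the exact columns depend on the SLURM configuration):
JOBID     PARTITION  NAME    USER      ST  TIME  NODES  NODELIST(REASON)
45994509  k80        dTest   USERNAME  PD  0:00  1      (Priority)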
To see all users in the queue for the K80 GPUs: squeue -q 30min
To see all users in the queue for the TitanX GPUs: squeue -q emgpu
To cancel a job, type scancel followed by the job ID shown by the squeue command: scancel my_job_id
Some ways to check the last output:
ls -rtl
tail -f slurm-45994509.out
less slurm-45994509.out
To check the last average: dynamo ddb myProject:a -v
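Once the project has finished, you may want to copy the results back to your local machine for inspection, again with rsync (same placeholder paths as above, and assuming the untarred project folder is named myProject):
rsync -avuP USERNAME@login.bc2.unibas.ch:/scicore/home/PATH/dynamo_projects/myProject ./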