GPUs Basel 2018
Here we describe how to use the GPUs provided for the Basel Workshop 2018. We go through each step using a simple tutorial dataset and project as an example; you can apply the same steps to your own dataset and project.
The GPUs are located on sciCORE (https://scicore.unibas.ch), the high-performance computing cluster of the University of Basel, which uses the SLURM queuing system. A queuing system coordinates access to the GPUs and is needed when many users share just a few GPUs.
The main idea is to create an alignment project locally, move it to the cluster on sciCORE and then run it there using a pre-installed standalone version of Dynamo. We follow these steps:
In your local Matlab session with Dynamo loaded:
- Create the tutorial project:
dtutorial myParticles -p myProject -M 128
We now have a tutorial dataset with 128 particles in the directory myParticles and a tutorial alignment project myProject.
- Open the alignment project window:
dcp myProject
and under computing environment select GPU (standalone).
- Check and Unfold the project.
- Before moving the data to sciCORE we have to compress the project. In the project window go to Tools and then "create a tarball".
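If you prefer the command line, you can also create the tarball from a local terminal in the directory that contains the project. This is a shell equivalent of the Tools menu entry (the GUI may name the resulting file differently):
tar -cf myProject.tar myProject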
In your local Linux terminal:
- Copy the project data (particles) to sciCORE:
rsync -avuP myParticles USERNAME@login.bc2.unibas.ch:/scicore/home/PATH/dynamo_projects
- Copy the previously created tar file of the project to sciCORE:
rsync -avuP myProject.tar USERNAME@login.bc2.unibas.ch:/scicore/home/PATH/dynamo_projects
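To verify that both transfers arrived, you can list the remote project directory (a quick check using the same placeholders as above):
ssh USERNAME@login.bc2.unibas.ch "ls -l /scicore/home/PATH/dynamo_projects"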
- Log in to your sciCORE account:
ssh -Y USERNAME@login.scicore.unibas.ch
While logged in to your sciCORE account:
- Activate Dynamo:
source PATH/dynamo_activate_linux_shipped_MCR.sh
- Untar the Dynamo project:
dynamo dvuntar myProject.tar
- Create a blank SLURM submission script (text file) named submit_job.sh:
nano submit_job.sh
- Copy and adapt the following lines into the newly created script (note the difference between K80 and TitanX GPUs):
For using the K80 GPUs:
#!/bin/bash -l
#
#SBATCH --job-name=dTest
#SBATCH --qos=30min
#SBATCH --time=00:60:00
#SBATCH --mem=16G
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --partition=k80
#SBATCH --gres=gpu:1

module load CUDA/7.5.18
source PATH/dynamo_activate_linux_shipped_MCR.sh
cd PATH/dynamo_projects
echo "dvput myProject -gpu_identifier_set $CUDA_VISIBLE_DEVICES" > dcommands.sh
echo "dvunfold myProject" >> dcommands.sh
dynamo dcommands.sh
chmod u=rxw ./myProject.m
./myProject.m
For using the TitanX GPUs:
#!/bin/bash -l
#
#SBATCH --job-name=dTest
#SBATCH --qos=emgpu
#SBATCH --time=00:60:00
#SBATCH --mem=16G
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --partition=titanx
#SBATCH --gres=gpu:1

module load CUDA/7.5.18
source PATH/dynamo_activate_linux_shipped_MCR.sh
cd PATH/dynamo_projects
echo "dvput myProject -gpu_identifier_set $CUDA_VISIBLE_DEVICES" > dcommands.sh
echo "dvunfold myProject" >> dcommands.sh
dynamo dcommands.sh
chmod u=rxw ./myProject.m
./myProject.m
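For reference, the two echo lines in the script write a small command file that tells the project to use the GPU assigned by SLURM and then re-unfolds it. Assuming SLURM exposed GPU 0 through CUDA_VISIBLE_DEVICES, dcommands.sh would contain:
dvput myProject -gpu_identifier_set 0
dvunfold myProject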
- Note that depending on your project you might have to adapt the project name and the requested time (--time=) in the script.
- You can now run your alignment project by submitting the previously created script to SLURM with:
sbatch submit_job.sh
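sbatch replies with the ID assigned to the job; this ID identifies the job in the queue and also names its log file (slurm-JOBID.out). The reply looks like this (the ID shown is just an illustration):
Submitted batch job 45994509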
- With the following commands you can check the overall status of the submitted jobs:
Check your status in the queue:
squeue -u USERNAME
See all users in queue for the K80 GPU:
squeue -q 30min
See all users in queue for the TitanX GPU:
squeue -q emgpu
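To monitor the queue continuously instead of re-running squeue by hand, you can wrap it in watch (here refreshing every 30 seconds; adjust as you like):
watch -n 30 squeue -u USERNAME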
To cancel the job, type scancel followed by the job ID shown by the squeue command:
scancel my_job_id
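If you want to cancel all of your jobs at once, scancel also accepts a user name:
scancel -u USERNAME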
Some ways to check the last output:
ls -rtl
tail -f slurm-45994509.out
less slurm-45994509.out
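Since the log file name contains the job ID, a convenient shell idiom is to follow whichever SLURM log was modified last (a sketch; it assumes only SLURM logs match the pattern):
tail -f $(ls -t slurm-*.out | head -1)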
To check the last average, load the standalone Dynamo environment by typing dynamo in the terminal and use the usual commands, e.g.:
ddb myProject:a -v
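Finally, to inspect the results on your own machine you can copy the project back with the same rsync mechanics as before (run in your local Linux terminal; paths as above):
rsync -avuP USERNAME@login.bc2.unibas.ch:/scicore/home/PATH/dynamo_projects/myProject .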