GPUs Basel 2018

Here we describe how to use the GPUs provided for the Basel Workshop 2018. We go through each step using a simple tutorial dataset and project as an example. You can apply the same steps to your own project.

The GPUs we use are located on sciCORE (https://scicore.unibas.ch), the high performance computing cluster of the University of Basel, which uses the SLURM queuing system. A queuing system coordinates access to the GPUs and is needed when many users share a limited number of GPUs. You were given the credentials needed to log in to the cluster at the beginning of the workshop.
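
If you are curious what the queue looks like once you are logged in to sciCORE (the login is described below), two standard SLURM commands are enough. They are generic SLURM commands, not specific to Dynamo or this course; pascal is the GPU partition used later in this guide:

sinfo -p pascal
squeue -u USERNAME

The first command lists the GPU nodes of the pascal partition and their state, the second lists your own queued and running jobs.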

The main idea is to create an alignment project locally, move it to the sciCORE cluster, and then run it there using a pre-installed standalone version of Dynamo. The following steps describe how to do that.


In your local Matlab session with Dynamo loaded:

  • Create a tutorial project with Dynamo:
dtutorial myParticles -p myProject -M 128

We now have a tutorial dataset with 128 particles in the directory myParticles and a tutorial alignment project myProject.

  • Open the alignment project window:
dcp myProject

In the project window, under computing environment, select GPU (standalone) as the environment.

  • Check and unfold the project.
  • Before moving the data to sciCORE, we have to compress the project: in the project window, go to Tools and then create a tarball.
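
Before transferring anything, you can sanity-check what was created. From a terminal in the same directory (these are just standard ls checks; the names follow the dataset, project and tarball names used above):

ls myParticles | head
ls -d myProject myProject.tar

The first command shows a few of the 128 particle files; the second confirms that the project folder and its tarball exist.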


In your local Linux terminal:

  • Open a local Linux terminal and navigate to the directory where you just created the tutorial dataset and project. Copy the project data (particles) to sciCORE:
rsync -avuP myParticles USERNAME@login.scicore.unibas.ch:/scicore/home/s-gpu-course/USERNAME/dynamo_projects
  • Copy the previously created tar file of the project to sciCORE:
rsync -avuP myProject.tar USERNAME@login.scicore.unibas.ch:/scicore/home/s-gpu-course/USERNAME/dynamo_projects
  • Log in to your sciCORE account:
ssh -Y USERNAME@login.scicore.unibas.ch

If asked whether to continue connecting, type "yes". Then enter the provided password.
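
Once connected, you can verify that both transfers arrived. Assuming your home directory is /scicore/home/s-gpu-course/USERNAME, as in the paths above:

ls dynamo_projects

This should list myParticles and myProject.tar.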


While logged in to your sciCORE account:

  • Activate the Dynamo standalone environment:
source /scicore/home/s-gpu-course/GROUP/dynamo_activate_linux_shipped_MCR.sh
  • Go to the location where you copied the data:
cd dynamo_projects 
  • Untar the Dynamo project (open the Dynamo console, untar the project, then leave the console):
dynamo 
dvuntar myProject.tar 
exit 
  • Create a blank SLURM submission script (text file) named submit_job.sh:
nano submit_job.sh
  • Copy (and adapt) the following lines into the newly created script. Depending on your project, you might have to adapt the project name and the requested time (time=hh:mm:ss). If your job will run longer than 30 minutes, set qos to 6hours and set the time to anything between 30 minutes and 6 hours:
#!/bin/bash -l
#
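# Resource requests: adapt job-name, qos and time to your project (see the note
# above); the remaining values correspond to a single-GPU Dynamo run.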
#SBATCH --job-name=dTest
#SBATCH --qos=30min
#SBATCH --time=00:30:00
#SBATCH --mem=16G
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --partition=pascal
#SBATCH --gres=gpu:1
#SBATCH --reservation=dynamo
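# Set up the software on the compute node: the CUDA toolkit and the
# pre-installed Dynamo standalone (the same activation script as used above).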
module load CUDA/7.5.18
source /scicore/home/s-gpu-course/GROUP/dynamo_activate_linux_shipped_MCR.sh
cd $HOME/dynamo_projects
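# Write a small Dynamo command file: point the project at the GPU id(s) SLURM
# assigned to this job (CUDA_VISIBLE_DEVICES), unfold the project again, and
# execute these commands with the Dynamo standalone.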
echo "dvput myProject -gpu_identifier_set $CUDA_VISIBLE_DEVICES" > dcommands.sh
echo "dvunfold myProject" >> dcommands.sh
dynamo dcommands.sh
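# Make the unfolded project executable and run the alignment on the GPU.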
chmod u=rwx ./myProject.exe
./myProject.exe


  • You can now run your alignment project by submitting the previously created script to SLURM with:
sbatch submit_job.sh
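
sbatch prints the ID of the submitted job, and SLURM writes the job's output to a file named slurm-<jobid>.out in the directory you submitted from (this is the file inspected further below). The confirmation typically looks like this, with a different ID for your job:

Submitted batch job 45994509
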
  • With the following commands you can check the status of submitted jobs in the queue:

Check your status in the queue:

squeue -u USERNAME

See the jobs of all users in the 30min queue:

squeue -q 30min


To cancel a job, type scancel followed by the job ID shown by the squeue command:

scancel my_job_id

Some ways to check the most recent job output (the job ID in the file names below, 45994509, is just an example):

ls -rtl
tail -f slurm-45994509.out
less slurm-45994509.out

To check the last average, load the standalone Dynamo environment by typing dynamo in the terminal and use the usual Dynamo commands, e.g.:

ddb myProject:a -v