Difference between revisions of "MPI Cluster"
Revision as of 09:51, 18 May 2016
Dynamo can be run as a standalone on a cluster of CPUs. This works for alignment and classification projects. Using the Dynamo standalone on a CPU cluster requires some additional steps compared to execution on a single server during an interactive session:
- Compile specifically for your cluster
- Create a cluster header file that will tell Dynamo about the syntax expected by your queuing system.
- Each time you create a project, tell it to use the cluster header to produce a project execution script (extension .sh)
- Submit the execution script representing the project to your cluster.
Compilation
The executables delivered in the Dynamo distribution should work on a Linux workstation for parallel jobs run on the same machine (through OMP threads). They are NOT enough to run parallel jobs on a cluster of different machines, which requires the MPI libraries. Thus, compiling Dynamo on your cluster requires a cc compiler that links the MPI libraries. In most systems, you can run the command:
module avail
in the shell of your login node to check the available modules. Modules for parallel computation will typically include an MPI-enabled compiler. You need to load one of them, for instance:
module load mpiCC
This should add some compilers to your path. They are typically called mpiCC, mpicc... It is a good idea to check the availability and syntax of the compiler provided by the module just loaded.
which mpicc
should give you the complete path to a compiler called mpicc on your $PATH. If this is not the case, try an alternative compiler name (for instance, mpiCC).
If you are lucky enough, your cluster environment should have some information system (like a webpage) that tells you the modules that you are expected to use for MPI-compatible compilation, and the attached compilers.
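The check described above can be scripted. The following is a minimal sketch, not part of Dynamo: the function name find_mpi_compiler is hypothetical, and the default candidate names are simply the two mentioned above (mpicc and mpiCC). It prints the full path of the first candidate found on $PATH.

```shell
# Print the full path of the first MPI compiler wrapper found on $PATH.
# Candidate names default to the two mentioned in the text; pass others as arguments.
# (find_mpi_compiler is a hypothetical helper, not a Dynamo command.)
find_mpi_compiler() {
    if [ "$#" -eq 0 ]; then
        set -- mpicc mpiCC
    fi
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            command -v "$candidate"
            return 0
        fi
    done
    echo "no MPI compiler found among: $*" >&2
    return 1
}

# Example: report what is available after loading a module.
if compiler_path=$(find_mpi_compiler); then
    echo "Using MPI compiler: $compiler_path"
else
    echo "Load an MPI module first (see 'module avail')."
fi
```

If your site uses a different wrapper name, pass it explicitly, for example find_mpi_compiler mpiicc.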
Once you know the name of the compiler that you are going to use (say, mpicc), you can proceed to compile the MPI executables:
cd <DYNAMO_ROOT>/mpi
source dynamo_compile_mpi.sh mpicc
If you get an error during compilation, try a different module, or a different compiler inside the same module.
Cluster header file
A cluster header file allows Dynamo to produce an execution script for a project that will be understood by the specific syntax of your cluster. You have several examples of cluster header files in the <DYNAMO_ROOT>/mpi folder of your Dynamo installation.
Preparing a project
You need first to tune several parameters in your project, through the GUI or the command line.
GUI
After opening a project in the dcp GUI, you need to set the following in the Computing Environment GUI:
- Make certain that the cluster MPI option is selected.
- Select the number of cores in the CPU cores field. Each one will be handled by a separate MPI task.
- Make certain that the Parallelized averaging step in the bottom panel is set to zero. This option only applies to Matlab based computations.
- Pass the path to the cluster header file.
Command line
All the steps above can be performed through the command line, using the names of the project parameters. You can follow the examples below, where a project called myProject gets its parameters tuned with the command dvput:
- dvput myProject destination mpi
- dvput myProject cores 128
- dvput myProject mwa 0
- dvput myProject cluster myClusterHeader.sh
Remember that the Dynamo command dvhelp will list the different project parameters that can be edited by the user through dvput.
Executing the project
Once the project has the right parameters, you can unfold it normally to produce an execution script. Then, you submit it from the command line. The concrete syntax may change depending on the queuing system controlling your cluster; typical examples are:
qsub myProject.sh (in PBS queues)
or
sbatch myProject.sh (in SLURM queues).
It is a sane policy to check the contents of the execution script myProject.sh before submitting it, in order to confirm that everything went smoothly and that Dynamo was able to use the cluster header file to convert your project into a text file with the right syntax for your queue.
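The inspect-then-submit routine can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function names are hypothetical, and the check assumes that a correctly generated execution script carries scheduler directive lines starting with #SBATCH (SLURM) or #PBS (PBS), which depends on what your cluster header file contains.

```shell
# Return 0 if the execution script contains scheduler directives
# (#SBATCH for SLURM, #PBS for PBS), non-zero otherwise.
# Assumption: the cluster header contributes such lines to the script.
has_scheduler_directives() {
    grep -Eq '^#(SBATCH|PBS)' "$1"
}

# Print the submission command that exists on this login node.
detect_submit_command() {
    if command -v sbatch >/dev/null 2>&1; then
        echo sbatch
    elif command -v qsub >/dev/null 2>&1; then
        echo qsub
    else
        return 1
    fi
}

# Inspect the script, then submit it only if it looks sane.
check_and_submit() {
    script=$1
    if [ ! -f "$script" ]; then
        echo "execution script not found: $script" >&2
        return 1
    fi
    if ! has_scheduler_directives "$script"; then
        echo "no #SBATCH or #PBS lines in $script; check the cluster header" >&2
        return 1
    fi
    if ! submit=$(detect_submit_command); then
        echo "neither sbatch nor qsub found on PATH" >&2
        return 1
    fi
    "$submit" "$script"
}
```

With these helpers defined, check_and_submit myProject.sh would refuse to submit a script that lacks scheduler directives. Reading the script by eye remains the more thorough check.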
Performance
In some clusters, following the above procedure without further tuning can make Dynamo run very slowly. This is normally related to the fact that the Dynamo standalone relies on the MCR libraries. These libraries might need some tuning for your system.
This can be done by making certain that the MCR_CACHE_ROOT variable is set to a fast file share on your system. Most parallel clusters offer users an area of the disk called scratch, with especially good I/O performance. If that's the case, it is a good idea to make certain that your project will have its MCR_CACHE_ROOT tuned to (a subfolder of) that location. You can do this by just editing the execution script with something like:
mkdir $SCRATCH/temporal
export MCR_CACHE_ROOT=$SCRATCH/temporal
You can also insert this in your cluster header file, or use the mcr parameter of the project to instruct a particular project to use a particular
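A slightly more defensive variant of the same idea is sketched below. It assumes that your cluster exports a SCRATCH variable; the fallback to /tmp and the folder name mcr_cache_<user> are illustrative choices, not Dynamo conventions.

```shell
# Choose a base directory: $SCRATCH if the cluster defines it, else /tmp
# (the /tmp fallback is an assumption for illustration).
cache_base=${SCRATCH:-/tmp}

# One cache folder per user avoids clashes on shared file systems.
MCR_CACHE_ROOT="$cache_base/mcr_cache_$(id -un)"

# -p creates missing parents and does not fail if the folder already exists.
mkdir -p "$MCR_CACHE_ROOT"
export MCR_CACHE_ROOT

echo "MCR cache will be extracted to: $MCR_CACHE_ROOT"
```

Using mkdir -p means the lines can stay in the script across repeated submissions without failing when the folder is already there.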
Several MCR extraction folders
Dynamo includes the experimental option of selecting a different MCR_CACHE_ROOT for each one of the spawned MPI tasks. You should try this option if performance is still too slow.
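To illustrate the idea only: the sketch below is not Dynamo's own mechanism, whose details are not described here. It assumes that the lines run inside each MPI task and that the launcher exports a rank variable, either SLURM_PROCID (set by SLURM's srun) or OMPI_COMM_WORLD_RANK (set by Open MPI's mpirun). The folder naming is hypothetical.

```shell
# Rank of this MPI task, taken from whichever launcher variable is set.
# Falls back to 0 when neither is defined (e.g. in an interactive test).
task_rank=${SLURM_PROCID:-${OMPI_COMM_WORLD_RANK:-0}}

# Give each task its own extraction folder under a common base
# ($SCRATCH if defined, else /tmp as an illustrative fallback).
cache_base=${SCRATCH:-/tmp}
MCR_CACHE_ROOT="$cache_base/mcr_cache_task_$task_rank"
mkdir -p "$MCR_CACHE_ROOT"
export MCR_CACHE_ROOT
```

Separate folders keep the tasks from contending for the same extracted files, at the cost of more disk space and one extraction per task.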
Using a cluster under Matlab
This is a totally different scenario. If your cluster supports running Matlab jobs through the Distributed Computing Engine... that's perfect. You don't need to use the MPI version of Dynamo: no need to compile the MPI executables, no need to design a cluster header file. You just use the destination parameter matlab_parfor and Matlab will take care of everything.
Note that the engine should be deployed and maintained on your cluster. Having a single licence on the login node (even if the Parallel Computing Toolbox is active there) will NOT be sufficient to run Dynamo-Matlab jobs on your cluster.