MPI Cluster

From Dynamo
Revision as of 10:23, 18 May 2016

Dynamo can be run as a standalone on a cluster of CPUs. This works for alignment and classification projects. Using the Dynamo standalone on a CPU cluster requires some additional steps compared to execution on a single server during an interactive session.

  1. Compile specifically for your cluster
  2. Create a cluster header file that will tell Dynamo about the syntax expected by your queuing system.
  3. Each time you create a project, tell it to use the cluster header to produce a project execution script (extension .sh)
  4. Submit the execution script representing the project to your cluster.


Compilation

Compiling Dynamo on your cluster requires a C compiler that links against the MPI libraries.

On most systems, you can run the command:

module avail

in the shell of your login node to check the available modules. Modules for parallel computation will typically include an MPI-enabled compiler. You need to load one of them, for instance:

module load mpiCC

This should add some compilers to your path. They are typically called mpiCC, mpicc... It is a good idea to check the availability and syntax of the compiler provided by the module just loaded.

which mpicc

should print the full path of a compiler called mpicc on your path. If this is not the case, try alternative names.
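Since the wrapper name varies between MPI distributions, a quick probe over the usual candidates can save guesswork. A minimal sketch (the names listed are common conventions; your module may install others):

```shell
# Look for a usable MPI compiler wrapper among the common names.
for cc in mpicc mpiCC mpicxx mpic++; do
    if command -v "$cc" >/dev/null 2>&1; then
        echo "found MPI compiler wrapper: $(command -v "$cc")"
    fi
done
```

If nothing is printed, the module you loaded probably did not put a wrapper on your path; try loading a different module.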

With luck, your cluster environment has an information system (such as a web page) that tells you which modules you are expected to use for compilation, and their associated compilers.

Once you know the name of the compiler that you are going to use (say, mpicc), you can proceed to compile the MPI executables:

cd <DYNAMO_ROOT>/mpi
source dynamo_compile_mpi.sh mpicc

If you get an error during compilation, try a different module, or a different compiler inside the same module.


Cluster Header file

A cluster header file allows Dynamo to produce an execution script for a project, written in the specific syntax understood by your cluster. You have several examples of cluster header files in the <DYNAMO_ROOT>/mpi folder of your Dynamo installation.
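The safest approach is to copy one of the shipped examples and adapt it. For orientation only, a header for a SLURM-style queue might look roughly like this; every directive below is an assumption about one particular scheduler, and the job name, partition, walltime and task count must match your site and project:

```shell
#!/bin/bash
#SBATCH --job-name=dynamoMPI    # job name shown by the queue (hypothetical)
#SBATCH --ntasks=128            # one MPI task per CPU core used by the project
#SBATCH --time=24:00:00         # walltime limit (site-dependent)
#SBATCH --partition=normal      # hypothetical partition name

module load mpiCC               # the module used at compilation time
```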

Preparing a project

You first need to tune several parameters in your project, either through the GUI or the command line.

GUI

After opening a project in the dcp GUI, open the Computing Environment GUI:
  • Make certain that the MPI option is selected as the destination.
  • Select the number of cores in the field CPU cores. Each one will be handled by a separate MPI task.
  • Make certain that the Parallelized averaging step in the bottom panel is set to zero. This option only applies to Matlab based computations.
  • Pass the path to the cluster header file.

Command line

All the steps above can be performed through the command line, using the names of the project parameters. You can follow the examples below, where a project called myProject gets its parameters tuned with the command dvput:

  • dvput myProject destination mpi
  • dvput myProject cores 128
  • dvput myProject mwa 0
  • dvput myProject cluster myClusterHeader.sh
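The four calls above can be collected into a small shell script, which keeps the settings reproducible between projects. A sketch only: myProject and myClusterHeader.sh are hypothetical names, and dvput must be on your path from the Dynamo installation:

```shell
#!/bin/sh
# Tune a Dynamo project for MPI execution on a cluster.
dvput myProject destination mpi            # run through the MPI executables
dvput myProject cores 128                  # one MPI task per core
dvput myProject mwa 0                      # disable Matlab-only parallel averaging
dvput myProject cluster myClusterHeader.sh # header defining the queue syntax
```

After these parameters are set, unfolding the project produces the execution script (extension .sh) that you submit to your queuing system.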

Remember that the Dynamo command dvhelp lists the project parameters that can be edited by the user through dvput.

Performance

On some clusters, following the above procedure without further tuning can lead to very slow performance. This is normally related to the fact that the Dynamo standalone runs on the MCR libraries, which might need some tuning for your system.
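One frequent cause, stated here as a general MCR observation rather than Dynamo-specific documentation, is the MCR component cache: by default it is unpacked under the user's home directory, which on clusters is often a slow shared filesystem, and many MPI tasks hitting it simultaneously makes things worse. Redirecting the cache to node-local storage through the MCR_CACHE_ROOT environment variable often helps; the path below is hypothetical, use whatever node-local scratch your site provides:

```shell
# Point the MATLAB Compiler Runtime cache at fast node-local storage
# before launching the Dynamo executables.
export MCR_CACHE_ROOT="/tmp/${USER}/mcr_cache"   # hypothetical node-local path
mkdir -p "$MCR_CACHE_ROOT"
```

This line typically goes into the cluster header file, so that every job sets it on the compute node itself.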

Using a cluster under Matlab

If your cluster supports running Matlab jobs through the Distributed Computing Engine... that's perfect. You don't need to use the MPI version of Dynamo: no need to compile the MPI executables, no need to design a cluster header file. You just use the destination parameter matlab_parfor and Matlab will take care of everything.