EMBO workshop 2020

Latest revision as of 15:19, 25 August 2020

Dynamo is a flexible toolbox to help you solve problems in subtomogram averaging.

The goal of this workshop is to teach the principles of subtomogram averaging and to show you some of the ways Dynamo can support it in your own research.

If at any time after the course you need help with Dynamo or subtomogram averaging in general, don't hesitate to ask on the forum in Google Groups.


Instructors

Daniel Castaño-Diez - University of Basel

Alister Burt - Institut de Biologie Structurale

Computer setup

The course is intended to be completed with the remote access setup provided by the EMBL IT services. You are, however, welcome to follow it by installing Dynamo on your own laptop.

EMBL remote access

The protocol for accessing EMBL machines should have been handed to you with the rest of the documentation for the course.

Start Dynamo inside a Matlab session

Once you are inside your remote desktop, open a terminal (right click on the workspace). Before starting Matlab, you first need to load the module:

module load MATLAB

and then start Matlab itself:

matlab &

Once your Matlab session is up, you can load Dynamo through:

run /g/cryocourse/data/dynamo/activation

Own installation

If you prefer to install Dynamo on your laptop, please follow the instructions on Download. Please be aware that this setup will only work on Linux and Mac laptops; we don't have an updated Dynamo version for Windows. Matlab is not a prerequisite, but we strongly recommend using it if you have a licence (or feel motivated enough to use a free trial licence as provided by Mathworks).

Start Dynamo as standalone

In a system terminal, type:

source <DYNAMO_ROOT>/dynamo_activate_mac_shippedMCR.sh
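A minimal sketch of a guarded activation, assuming the activation script sits directly under the Dynamo installation directory as in the command above. The helper name activate_dynamo is made up for this example, and the directory you pass to it stands in for <DYNAMO_ROOT>.

```shell
#!/bin/bash
# Hypothetical helper: source the Dynamo activation script only if it exists.
# Usage: activate_dynamo /path/to/your/dynamo/installation
activate_dynamo() {
    local root="$1"
    local script="$root/dynamo_activate_mac_shippedMCR.sh"
    if [ ! -f "$script" ]; then
        echo "activation script not found: $script" >&2
        return 1
    fi
    # Source (rather than execute) so the environment changes stay in this shell.
    . "$script"
}
```

Sourcing matters here: running the script as a separate process would set up the environment in a child shell that exits immediately.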

Materials

Day 1

Basic principles

Guided presentation:

Working on your own:

  • Basic walkthrough: creating a catalogue, picking particles, launching a project.

The data can be found in

/g/cryocourse/data/dynamo/crop.rec 

The Chimera path you need in the tutorial is

/g/easybuild/x86_64/CentOS/7/haswell/software/Chimera/1.13-foss-2017b-Python-2.7.14/bin/chimera
  • Further work:
    • tutorial on the use of command line operations for general purposes.
    • tutorial on the use of the command line to manage projects.

Day 2

Geometric Modelling

Short guided presentation:

The data is available at

/g/cryocourse/data/dynamo/v17.rec

PCA Based Classification

Walkthrough on PCA through the command line

Template matching

  • walkthrough for automated identification of proteasomes on a real tomogram through template matching. (~1 hour)

The data can be found in

/g/cryocourse/data/dynamo/t20s.mrc

Day 3

Interfacing with Other Tools

Dynamo is designed to allow maximum user flexibility and to encourage users to design their own solutions to the specific problems posed by their data.

Alister recently spent some time integrating Dynamo with Warp and M. This makes it possible to use Dynamo for tilt-series alignment and particle picking, and then plug the results into M's multi-particle refinement.

Final exercise

The main goal of this course is for you to be able to identify strategies for setting up an appropriate approach for your problem. To this end, on the last day we suggest a problem data set for you to work with. The data for this exercise can be found at:

 /g/cryocourse/data/dynamo

Feel free to use this part to articulate your final presentation on Friday.

Fiducial based alignment and reconstruction

These new features in Dynamo are at the testing stage.

Creation of 3D scenes

Working on your own:

Further support material.

  • Walkthrough on depiction and manipulation of triangulations (synthetic data).

Running Alignment Projects on the GPU during the workshop

During the workshop, you are provided with a remote virtual desktop and you each have access to one GPU.

To set up a project for GPU computing in the dcp GUI, set the following parameters under computing environment:

  • Environment - GPU (standalone)
  • cpu cores - 1
  • gpu identifiers - 0
  • parallelized averaging step - 8 (you may adjust this for your system depending on the number of CPUs available)

You can check the number of CPU cores available to you from within Matlab with the following command (the leading ! passes the command to the system shell):

!lscpu | grep 'Core(s)'
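The same check can be made from a system terminal. The sketch below is an illustration, not part of the workshop setup: it assumes nproc (GNU coreutils) or getconf is available, and it prints the lscpu socket and core lines because the Core(s) figure reported by lscpu is counted per socket.

```shell
#!/bin/bash
# Total number of processing units available (fallback for systems without nproc).
available_cpus=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)

# lscpu reports cores per socket, so show the socket count alongside it.
if command -v lscpu >/dev/null 2>&1; then
    lscpu | grep -E 'Socket\(s\)|Core\(s\) per socket|Thread\(s\) per core' || true
fi

echo "CPUs available: $available_cpus"
```

The printed total can then guide the value chosen for the parallelized averaging step.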

You can then press the check button to check that your project inputs make sense, followed by the unfold button.

Unfolding a project generates an executable for that project in the current directory. We run this executable from a terminal outside of Matlab, which allows us to continue working on other things in Matlab while the project is running.

To run the project, open a terminal and navigate to the directory containing your executable. You then run:

module load dynamo

followed by

./my_project.exe

to launch the alignment project.
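The two commands above can be wrapped in a small helper that also keeps a log. This is a sketch only: the function name run_project and the log file naming are choices made for this example, and the module system is assumed to be present only on the workshop machines, so the module call is skipped elsewhere.

```shell
#!/bin/bash
# Hypothetical helper: run an unfolded project from the current directory
# in the background and keep its output in a log file.
# Usage: run_project my_project.exe [logfile]
run_project() {
    local exe="$1"
    local log="${2:-${exe%.exe}.log}"
    if [ ! -x "./$exe" ]; then
        echo "not an executable file in this directory: $exe" >&2
        return 1
    fi
    # Load the dynamo module only where an environment-modules system exists.
    if command -v module >/dev/null 2>&1; then
        module load dynamo || echo "warning: could not load the dynamo module" >&2
    fi
    # nohup keeps the project running if the terminal is closed.
    nohup "./$exe" > "$log" 2>&1 &
    echo "started $exe (PID $!), logging to $log"
}
```

Calling run_project my_project.exe returns immediately; the progress of the project can be followed with tail -f my_project.log.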

Submitting jobs For GPU computing

Note: This should not be necessary during the workshop, but may give you an idea of how to use Dynamo on your own cluster later.

The following applies to subtomogram averaging projects set up to run in the "gpu_standalone" mode under computing environment.

Once your project is unfolded, you should have an executable file

wizardTestProject.exe

The EMBL-HD cluster uses slurm for job submission and resource allocation.

To run this project as a job on their computing resources, we should first set up a submission script. Here is an example:

#!/bin/bash
#SBATCH -N 1                        # number of nodes
#SBATCH -n 1                        # number of cores
#SBATCH -o slurm.%N.%j.out          # STDOUT
#SBATCH -e slurm.%N.%j.err          # STDERR
#SBATCH --mail-type=END,FAIL        # notifications for job done & fail
#SBATCH --mail-user=alisterburt@gmail.com # send-to address
#SBATCH -p gpu                      # select gpu usage
#SBATCH --gres=gpu:1                # number of gpus (if using gpus)


module load dynamo
./wizardTestProject.exe

Save this text file (here named dynamo_test_gpu.sl) in the same directory as the executable file, then submit the job using sbatch:

sbatch dynamo_test_gpu.sl

Slurm should then confirm that the job has been submitted:

Submitted batch job 65146729 
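The job ID in the confirmation line is what the other Slurm commands need. As a rough sketch, the ID can be extracted like this; the function name job_id_from_sbatch is invented for the example, and the confirmation text is the one shown above.

```shell
#!/bin/bash
# Hypothetical helper: pull the numeric job ID out of sbatch's confirmation line.
job_id_from_sbatch() {
    # The line reads "Submitted batch job <id>"; the ID is the last field.
    printf '%s\n' "$1" | awk '/^Submitted batch job/ { print $NF }'
}

job_id=$(job_id_from_sbatch "Submitted batch job 65146729")
echo "$job_id"    # prints 65146729

# With the ID in hand, standard Slurm commands can follow up on the job:
#   squeue -j "$job_id"    # is the job pending or running?
#   scancel "$job_id"      # cancel the job
```

On a real cluster the argument would be the captured output of sbatch itself, for example job_id_from_sbatch "$(sbatch dynamo_test_gpu.sl)". sbatch also has a --parsable option that prints the job ID without the surrounding text.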

Submitting jobs for CPU computing

The following applies to subtomogram averaging projects set up to run in the "standalone" mode under computing environment.

Once your project is unfolded, you should have an executable file

wizardTestProject.exe

The EMBL-HD cluster uses slurm for job submission and resource allocation.

To run this project as a job on their computing resources, we should first set up a submission script. Here is an example using 6 cores on 1 node:

#!/bin/bash
#SBATCH -N 1                        # number of nodes
#SBATCH -n 6                        # number of cores
#SBATCH -o slurm.%N.%j.out          # STDOUT
#SBATCH -e slurm.%N.%j.err          # STDERR
#SBATCH --mail-type=END,FAIL        # notifications for job done & fail
#SBATCH --mail-user=alisterburt@gmail.com # send-to address


module load dynamo
./wizardTestProject.exe

Save this text file (here named dynamo_test_cpu.sl) in the same directory as the executable file, then submit the job using sbatch:

sbatch dynamo_test_cpu.sl

Slurm should then confirm that the job has been submitted:

Submitted batch job 65146755
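If you submit several projects, the submission script can be generated rather than edited by hand. The following is a sketch under assumptions: the function name write_submit_script is made up, the mail options are left out, and the core count written to the script is assumed to match the one configured in the project before unfolding.

```shell
#!/bin/bash
# Hypothetical helper: write a CPU submission script for an unfolded project.
# Usage: write_submit_script EXECUTABLE NCORES OUTFILE
write_submit_script() {
    local exe="$1" ncores="$2" out="$3"
    # Unquoted heredoc delimiter: $ncores and $exe are expanded, %N and %j are left for Slurm.
    cat > "$out" <<EOF
#!/bin/bash
#SBATCH -N 1                        # number of nodes
#SBATCH -n $ncores                        # number of cores
#SBATCH -o slurm.%N.%j.out          # STDOUT
#SBATCH -e slurm.%N.%j.err          # STDERR

module load dynamo
./$exe
EOF
}

write_submit_script wizardTestProject.exe 6 dynamo_test_cpu.sl
```

The generated file is then submitted exactly as before, with sbatch dynamo_test_cpu.sl.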