Difference between revisions of "EMBO workshop 2020"

From Dynamo
 
(34 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 +
Dynamo is a [[https://www.sciencedirect.com/science/article/pii/S1047847711003650 flexible toolbox]] to help you solve problems in subtomogram averaging.
 +
 +
The goal of this workshop is to teach the principles of subtomogram averaging and show you some of the ways Dynamo can help you achieve that if you want subtomogram averaging to be a part of your research.
 +
 +
If at any time after the course you need help with Dynamo or subtomogram averaging generally, don't hesitate to ask on the [[https://groups.google.com/forum/#!forum/dynamo-for-cryo-electron-tomography/join forum in Google Groups]]
 +
 +
 +
= Instructors =
 +
[https://twitter.com/baseldarra Daniel Castaño-Diez] - University of Basel
 +
 +
[https://twitter.com/alisterburt Alister Burt] - Institut de Biologie Structurale
 +
 +
= Computer setup=
 +
 +
The course is intended to be completed with the remote access setup provided by the EMBL IT services.
 +
You are however welcome to follow it by installing ''Dynamo'' on your own laptop.
 +
 +
== EMBL remote access ==
 +
 +
The protocol for accessing EMBL machines should have been handed to you with the rest of the documentation for the course.
 +
 +
=== Start ''Dynamo'' inside a Matlab session ===
 +
Once you are inside your remote terminal, open a terminal (right click on the workspace) and start Matlab. You first need to load the module:
 +
 +
<tt>module load MATLAB</tt>
 +
 +
and then start Matlab itself:
 +
 +
<tt>matlab &</tt>
 +
 +
once your Matlab session is up, you can load dynamo through:
 +
 +
<tt>run /g/cryocourse/data/dynamo/activation</tt>
 +
 +
== Own installation==
 +
If you prefer to install ''Dynamo'' on your laptop, please follow the instructions on [[Download]]. Please be aware that this setting will only work in Linux and Mac laptops; we don't have an updated ''Dynamo'' version in Windows.
 +
Matlab is not a prerequisite but we strongly recommend to use it if you have a licence (or feel motivated enough to use a free trial licence as provided by Mathworks).
 +
 +
=== Start ''Dynamo'' as standalone===
 +
In a system terminal, type:
 +
 +
<tt>source <DYNAMO_ROOT>/dynamo_activate_mac_shippedMCR.sh</tt>
  
 
= Materials =
 
= Materials =
Line 11: Line 53:
 
* [http://{{SERVERNAME}}/w/doc/misc/introductionAlignmentProjects.pdf tutorial on] the basic concept in ''Dynamo'' alignment: the ''project''.
 
* [http://{{SERVERNAME}}/w/doc/misc/introductionAlignmentProjects.pdf tutorial on] the basic concept in ''Dynamo'' alignment: the ''project''.
  
Working on your own:  
+
Working on your own:
  
 
* Basic [[Starters guide | walkthrough]]: creating a catalogue, picking particles, launching a project.
 
* Basic [[Starters guide | walkthrough]]: creating a catalogue, picking particles, launching a project.
 +
 +
* Complete the  [[advanced starters guide]] (~2 hours)
 +
The data can be found in
 +
<tt>/g/cryocourse/data/dynamo/crop.rec</tt>
 +
the chimera path you need in the tutorial is
 +
<tt>/g/easybuild/x86_64/CentOS/7/haswell/software/Chimera/1.13-foss-2017b-Python-2.7.14/bin/chimera</tt>
 +
 
* Further work:
 
* Further work:
 
** {{pdftutorial|commandline|tutorial}} on the use command line operations for general purposes.
 
** {{pdftutorial|commandline|tutorial}} on the use command line operations for general purposes.
 
** {{pdftutorial|command_line_projects|tutorial}} on the use of the command line to manage projects.
 
** {{pdftutorial|command_line_projects|tutorial}} on the use of the command line to manage projects.
  
 +
== Day 2 ==
 +
 +
===Geometric Modelling===
 +
 +
Short guided presentation:
 +
 +
*  [http://{{SERVERNAME}}/w/doc/misc/modelMembrane.pdf tutorial on] membrane modeling with <tt> dmslice </tt>
 +
* Filament models with <tt>dtmslice</tt>
 +
** [http://{{SERVERNAME}}/w/doc/misc/modelFilament.pdf tutorial ]
 +
** [[Filament model | walkthrough]]
 +
* Reusing model workflows ([[Walkthrough model worfklow reuse| walkthrough]])
 +
* Further work: catalogue
 +
 +
* In the afternoon,  we will focus on the extraction of particles  [[Walkthrough for lattices on vesicles| from densely packed spherical geometry ]]  (~1 hour) <br />
 +
The data is available at
 +
<tt>/g/cryocourse/data/dynamo/v17.rec</tt>
 +
 +
===PCA Based Classification===
 +
[[Walkthrough on PCA through the command line]]
  
 
===Template matching===
 
===Template matching===
  
''Wednesday afternoon''.
+
* [[Walkthrough for template matching | walkthrough]] for automated identification of proteasomes on a real tomogram through template matching. (~1 hour) <br />
 +
The data can be found in
 +
<tt>/g/cryocourse/data/dynamo/t20s.mrc</tt>
  
Working on your own:
+
==Day 3 ==
  
* We will follow this [[Walkthrough for template matching | walkthrough]] for automated identification of proteosomes on a real tomogram through template matching. (~1 hour) <br /> To get the data, please write in the '''linux terminal''' <br/><tt>'''wget  https://wiki.dynamo.biozentrum.unibas.ch/w/doc/data/t20s/t20s.mrc'''</tt>
 
  
== Day 2 ==
 
  
===Geometric modeling===
+
=== Interfacing with Other Tools ===
 +
Dynamo is designed to allow maximum user flexibility and to encourage users to design their own solutions to the specific problems posed by their data.
  
Working on your own:
+
Alister recently spent some time [[Integration with Warp and M  | integrating Dynamo with Warp and M]] to be able to take advantage of Dynamo for tilt-series alignment and particle picking and plug the results into M's multi-particle refinement
  
* For the  Day 2 session: complete the  [[advanced starters guide]] (~2 hours) <br / > The data can be found in <br /> <tt>/g/cryocourse/data/dynamo/crop.rec</tt>
+
=== Final exercise ===
  
* In the afternoon, we will focus on the extraction of particles [[Walkthrough for lattices on vesicles| from densely packed spherical geometry ]]  (~1 hour) <br /> To get the data, please write in the '''linux terminal''' <br/><tt>'''wget  https://wiki.dynamo.biozentrum.unibas.ch/w/doc/data/hiv/v17.rec'''</tt>
+
The main goal of this course is for you to be able to identify strategies for setting up an appropriate approach for your problem. To this end, on the last day we suggest you [[Getting a Structure from Multiple Tomograms of HIV Capsids | a problem data set to work with]]. The data for this exercise
 +
can be found at:  
  
Short guided presentation:
+
<tt> /g/cryocourse/data/dynamo</tt>
  
*  [http://{{SERVERNAME}}/w/doc/misc/modelMembrane.pdf tutorial on] membrane modeling with <tt> dmslice </tt>
+
Feel free to use this exercise to build your final presentation on Friday.
* Filament models with <tt>dtmslice</tt>
 
** [http://{{SERVERNAME}}/w/doc/misc/modelFilament.pdf tutorial ]
 
** [[Filament model | walkthrough]]
 
* Reusing model workflows ([[Walkthrough model worfklow reuse | walkthrough]])
 
* Further work: catalogue
 
 
 
==Day 3 ==
 
  
 
=== Fiducial based alignment and reconstruction ===
 
=== Fiducial based alignment and reconstruction ===
Line 52: Line 115:
 
These new features in ''Dynamo'' are at the testing stage.
 
These new features in ''Dynamo'' are at the testing stage.
  
==== GUI based alignment of tilt series ====
+
* [[Walkthrough on GUI based tilt series alignment ]]
[[Walkthrough on GUI based tilt series alignment ]]
+
 
 +
* [[Walkthrough on command line based tilt series alignment ]]
  
==== Command line based alignment of tilt series ====
+
* [[Walkthrough on manual marker clicking  ]]
[[Walkthrough on command line based tilt series alignment ]]
 
  
==== Manual clicking on gold beads ====
 
[[Walkthrough on manual marker clicking  ]]
 
 
=== Creation of 3D scenes ===
 
=== Creation of 3D scenes ===
  
Line 70: Line 131:
 
* [[Walkthrough on placement of averages on table positions | Walkthrough ]]on depiction and manipulation of triangulations (synthetic data).
 
* [[Walkthrough on placement of averages on table positions | Walkthrough ]]on depiction and manipulation of triangulations (synthetic data).
  
 +
= Running Alignment Projects on the GPU during the workshop =
 +
During the workshop, you are provided with a remote virtual desktop and you each have access to one GPU.
 +
 +
To set up a project for GPU computing in the dcp GUI, set the following parameters under ''computing environment''
 +
 +
* Environment - GPU (standalone)
 +
* cpu cores - 1
 +
* gpu identifiers - 0
 +
* parallelized averaging step - 8 (you may adjust this for your system depending on the number of CPUs available)
 +
 +
You can check the number of CPUs available to you in matlab with the following command
 +
 +
<nowiki>!lscpu | grep 'Core(s)'</nowiki>
 +
 +
You can then press the ''check'' button to check that your project inputs make sense, followed by the ''unfold'' button.
 +
 +
Unfolding a project generates an executable for that project in the current directory, we run this executable from a terminal outside of matlab. This is useful because it allows us to continue working on other things in matlab whilst the project is running.
 +
 +
To run the project, open a terminal and navigate to the directory containing your executable. You then run
 +
 +
<nowiki>module load dynamo</nowiki>
 +
followed by
 +
<nowiki>./my_project.exe</nowiki>
 +
 +
to launch the alignment project.
  
 
= Submitting jobs For GPU computing=
 
= Submitting jobs For GPU computing=
This is applicable for subtomogram averaging projects setup to be run in the "gpu_standalone" mode under computing environment.
+
'''Note''': This should not be necessary during the workshop, but may give you an idea how to use dynamo in your own cluster later
 +
 
 +
The following is applicable for subtomogram averaging projects setup to be run in the "gpu_standalone" mode under computing environment.
  
 
Once your project is unfolded, you should have an executable file
 
Once your project is unfolded, you should have an executable file
Line 98: Line 186:
 
Save this text file in the same directory as the executable file, then submit the job using <tt> sbatch </tt>
 
Save this text file in the same directory as the executable file, then submit the job using <tt> sbatch </tt>
  
  <nowiki>sbatch dynamo_test.sl </nowiki>
+
  <nowiki>sbatch dynamo_test_gpu.sl </nowiki>
  
 
This should then confirm that the job has been submitted
 
This should then confirm that the job has been submitted
 
  <nowiki>Submitted batch job 65146729 </nowiki>
 
  <nowiki>Submitted batch job 65146729 </nowiki>
 +
 +
= Submitting jobs for CPU computing =
 +
The following is applicable for subtomogram averaging projects setup to be run in the "standalone" mode under computing environment.
 +
 +
Once your project is unfolded, you should have an executable file
 +
 +
<nowiki>wizardTestProject.exe</nowiki>
 +
 +
The EMBL-HD cluster uses slurm for job submission and resource allocation.
 +
 +
To run this project as a job on their computing resources, we should first set up a submission script, here is an <tt>example</tt> using 6 cores on 1 node
 +
 +
<nowiki>#!/bin/bash
 +
#SBATCH -N 1                        # number of nodes
 +
#SBATCH -n 6                        # number of cores
 +
#SBATCH -o slurm.%N.%j.out          # STDOUT
 +
#SBATCH -e slurm.%N.%j.err          # STDERR
 +
#SBATCH --mail-type=END,FAIL        # notifications for job done & fail
 +
#SBATCH --mail-user=alisterburt@gmail.com # send-to address
 +
 +
 +
module load dynamo
 +
./wizardTestProject.exe</nowiki>
 +
 +
Save this text file in the same directory as the executable file, then submit the job using <tt> sbatch </tt>
 +
 +
<nowiki>sbatch dynamo_test_cpu.sl </nowiki>
 +
 +
This should then confirm that the job has been submitted
 +
<nowiki>Submitted batch job 65146755 </nowiki>

Latest revision as of 15:19, 25 August 2020

Dynamo is a [flexible toolbox] to help you solve problems in subtomogram averaging.

The goal of this workshop is to teach the principles of subtomogram averaging and to show you some of the ways Dynamo can help if you want subtomogram averaging to be a part of your research.

If at any time after the course you need help with Dynamo or with subtomogram averaging in general, don't hesitate to ask on the [forum in Google Groups].


Instructors

Daniel Castaño-Diez - University of Basel

Alister Burt - Institut de Biologie Structurale

Computer setup

The course is intended to be completed with the remote access setup provided by the EMBL IT services. You are, however, welcome to follow it by installing Dynamo on your own laptop.

EMBL remote access

The protocol for accessing EMBL machines should have been handed to you with the rest of the documentation for the course.

Start Dynamo inside a Matlab session

Once you are logged in to your remote session, open a terminal (right click on the workspace) and start Matlab. You first need to load the module:

module load MATLAB

and then start Matlab itself:

matlab &

Once your Matlab session is up, you can load Dynamo with:

run /g/cryocourse/data/dynamo/activation

Own installation

If you prefer to install Dynamo on your laptop, please follow the instructions on Download. Please be aware that this setup only works on Linux and Mac laptops; we don't have an up-to-date Dynamo version for Windows. Matlab is not a prerequisite, but we strongly recommend using it if you have a licence (or feel motivated enough to use a free trial licence as provided by Mathworks).

Start Dynamo as standalone

In a system terminal, type:

source <DYNAMO_ROOT>/dynamo_activate_mac_shippedMCR.sh
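A small guard around this step can make a wrong installation path easier to spot. The following is only a sketch, assuming a POSIX shell: the function name activate_dynamo is hypothetical, and the script name is simply the one quoted above.

```shell
# Hypothetical helper: source the Dynamo activation script only if it exists.
# $1 is the Dynamo installation directory (the <DYNAMO_ROOT> placeholder above).
activate_dynamo() {
    script="$1/dynamo_activate_mac_shippedMCR.sh"
    if [ -f "$script" ]; then
        # "." is the portable spelling of "source"
        . "$script"
    else
        echo "activation script not found: $script" >&2
        return 1
    fi
}
```

Calling activate_dynamo with your installation directory as its only argument then either activates Dynamo in the current shell or reports the path it could not find.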

Materials

Day 1

Basic principles

Guided presentation:

Working on your own:

  • Basic walkthrough: creating a catalogue, picking particles, launching a project.

The data can be found in

/g/cryocourse/data/dynamo/crop.rec 

The Chimera path you need in the tutorial is

/g/easybuild/x86_64/CentOS/7/haswell/software/Chimera/1.13-foss-2017b-Python-2.7.14/bin/chimera
  • Further work:
    • tutorial on the use of command line operations for general purposes.
    • tutorial on the use of the command line to manage projects.

Day 2

Geometric Modelling

Short guided presentation:

The data is available at

/g/cryocourse/data/dynamo/v17.rec

PCA Based Classification

Walkthrough on PCA through the command line

Template matching

  • walkthrough for automated identification of proteasomes on a real tomogram through template matching. (~1 hour)

The data can be found in

/g/cryocourse/data/dynamo/t20s.mrc

Day 3

Interfacing with Other Tools

Dynamo is designed to allow maximum user flexibility and to encourage users to design their own solutions to the specific problems posed by their data.

Alister recently spent some time integrating Dynamo with Warp and M, so that Dynamo can be used for tilt-series alignment and particle picking and the results plugged into M's multi-particle refinement.

Final exercise

The main goal of this course is for you to be able to identify strategies for setting up an appropriate approach for your problem. To this end, on the last day we suggest a problem data set for you to work with. The data for this exercise can be found at:

 /g/cryocourse/data/dynamo

Feel free to use this exercise to build your final presentation on Friday.

Fiducial based alignment and reconstruction

These new features in Dynamo are at the testing stage.

Creation of 3D scenes

Working on your own:

Further support material.

  • Walkthrough on depiction and manipulation of triangulations (synthetic data).

Running Alignment Projects on the GPU during the workshop

During the workshop, you are provided with a remote virtual desktop and you each have access to one GPU.

To set up a project for GPU computing in the dcp GUI, set the following parameters under computing environment:

  • Environment - GPU (standalone)
  • cpu cores - 1
  • gpu identifiers - 0
  • parallelized averaging step - 8 (you may adjust this for your system depending on the number of CPUs available)

You can check the number of CPUs available to you from within Matlab with the following command:

!lscpu | grep 'Core(s)'
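The command above prints the number of cores per socket. On a machine with more than one socket, the total is that number multiplied by the socket count. The sketch below does this multiplication in a system terminal. It assumes the usual lscpu labels "Core(s) per socket" and "Socket(s)", and the function name physical_cores is hypothetical.

```shell
# Hypothetical helper: read `lscpu` output on stdin and print the number of
# physical cores (cores per socket multiplied by the number of sockets).
physical_cores() {
    awk -F: '
        /^Core\(s\) per socket/ { cores = $2 }
        /^Socket\(s\)/          { sockets = $2 }
        END                     { print cores * sockets }
    '
}
```

Piping lscpu into physical_cores prints a single number, which can serve as a starting point for the parallelized averaging step setting.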

You can then press the check button to check that your project inputs make sense, followed by the unfold button.

Unfolding a project generates an executable for that project in the current directory. We run this executable from a terminal outside of Matlab, which lets us keep working on other things in Matlab while the project is running.

To run the project, open a terminal and navigate to the directory containing your executable. You then run

module load dynamo

followed by

./my_project.exe

to launch the alignment project.
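If you would like the terminal to stay free and the output to be kept in a file, the executable can be started in the background. This is a minimal sketch rather than a recommended procedure: the function name run_project and the log file naming are hypothetical, and it assumes that module load dynamo has already been run in the same terminal.

```shell
# Hypothetical helper: start an unfolded project in the background and
# write its output to a log file next to the executable.
# $1 = name of the executable in the current directory, e.g. my_project.exe
run_project() {
    exe="$1"
    log="${exe%.exe}.log"            # my_project.exe -> my_project.log
    nohup "./$exe" > "$log" 2>&1 &   # keep running after the terminal closes
    project_pid=$!                   # process ID of the background job
    echo "started $exe (PID $project_pid), logging to $log"
}
```

After calling run_project my_project.exe, progress can be followed with tail -f my_project.log.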

Submitting jobs For GPU computing

Note: This should not be necessary during the workshop, but it may give you an idea of how to use Dynamo on your own cluster later.

The following applies to subtomogram averaging projects set up to run in the "gpu_standalone" mode under computing environment.

Once your project is unfolded, you should have an executable file

wizardTestProject.exe

The EMBL-HD cluster uses slurm for job submission and resource allocation.

To run this project as a job on the cluster's computing resources, we first set up a submission script. Here is an example:

#!/bin/bash
#SBATCH -N 1                        # number of nodes
#SBATCH -n 1                        # number of cores
#SBATCH -o slurm.%N.%j.out          # STDOUT
#SBATCH -e slurm.%N.%j.err          # STDERR
#SBATCH --mail-type=END,FAIL        # notifications for job done & fail
#SBATCH --mail-user=alisterburt@gmail.com # send-to address
#SBATCH -p gpu                      # select the gpu partition
#SBATCH --gres=gpu:1                # number of gpus (if using gpus)


module load dynamo
./wizardTestProject.exe

Save this text file (here dynamo_test_gpu.sl) in the same directory as the executable file, then submit the job using sbatch:

sbatch dynamo_test_gpu.sl 

sbatch should then confirm that the job has been submitted:

Submitted batch job 65146729 
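The job ID in this confirmation line is what you need to check on the job or cancel it later. As a small sketch, the ID can be captured at submission time. The function name job_id_from_sbatch is hypothetical, while squeue and scancel are the standard Slurm commands for listing and cancelling jobs.

```shell
# Hypothetical helper: extract the job ID from the sbatch confirmation line
# ("Submitted batch job <id>") read on stdin.
job_id_from_sbatch() {
    awk '/^Submitted batch job/ { print $4 }'
}

# Typical use on a Slurm cluster (not run here):
#   jobid=$(sbatch dynamo_test_gpu.sl | job_id_from_sbatch)
#   squeue -j "$jobid"     # show the state of the job in the queue
#   scancel "$jobid"       # cancel the job if something went wrong
```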

Submitting jobs for CPU computing

The following applies to subtomogram averaging projects set up to run in the "standalone" mode under computing environment.

Once your project is unfolded, you should have an executable file

wizardTestProject.exe

The EMBL-HD cluster uses slurm for job submission and resource allocation.

To run this project as a job on the cluster's computing resources, we first set up a submission script. Here is an example using 6 cores on 1 node:

#!/bin/bash
#SBATCH -N 1                        # number of nodes
#SBATCH -n 6                        # number of cores
#SBATCH -o slurm.%N.%j.out          # STDOUT
#SBATCH -e slurm.%N.%j.err          # STDERR
#SBATCH --mail-type=END,FAIL        # notifications for job done & fail
#SBATCH --mail-user=alisterburt@gmail.com # send-to address


module load dynamo
./wizardTestProject.exe

Save this text file (here dynamo_test_cpu.sl) in the same directory as the executable file, then submit the job using sbatch:

sbatch dynamo_test_cpu.sl 

sbatch should then confirm that the job has been submitted:

Submitted batch job 65146755
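If you submit several projects with different core counts, writing the script by hand each time gets repetitive. The sketch below generates a script like the one above from a project name and a core count. The function name write_cpu_script is hypothetical, and the mail notification lines are left out so that no address is hard-coded.

```shell
# Hypothetical helper: print a Slurm submission script for an unfolded project.
# $1 = project executable, $2 = number of cores (defaults to 6, as above).
write_cpu_script() {
    exe="$1"
    cores="${2:-6}"
    cat <<EOF
#!/bin/bash
#SBATCH -N 1                        # number of nodes
#SBATCH -n $cores                        # number of cores
#SBATCH -o slurm.%N.%j.out          # STDOUT
#SBATCH -e slurm.%N.%j.err          # STDERR

module load dynamo
./$exe
EOF
}
```

For example, write_cpu_script wizardTestProject.exe 6 > dynamo_test_cpu.sl produces a file that can be submitted with sbatch as described above.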