EMBO workshop 2020
Latest revision as of 15:19, 25 August 2020
Dynamo is a [flexible toolbox] to help you solve problems in subtomogram averaging.
The goal of this workshop is to teach the principles of subtomogram averaging and show you some of the ways Dynamo can help you achieve that if you want subtomogram averaging to be a part of your research.
If at any time after the course you need help with Dynamo or subtomogram averaging generally, don't hesitate to ask on the [forum in Google Groups].
Instructors
Daniel Castaño-Diez - University of Basel
Alister Burt - Institut de Biologie Structurale
Computer setup
The course is intended to be completed using the remote access setup provided by the EMBL IT services. You are, however, welcome to follow it by installing Dynamo on your own laptop.
EMBL remote access
The protocol for accessing EMBL machines should have been handed to you with the rest of the documentation for the course.
Start Dynamo inside a Matlab session
Once you are inside your remote desktop session, open a terminal (right click on the workspace) and start Matlab. You first need to load the module:
module load MATLAB
and then start Matlab itself:
matlab &
Once your Matlab session is up, you can load Dynamo through:
run /g/cryocourse/data/dynamo/activation
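The three steps above can also be combined so that Matlab activates Dynamo at startup. The following is a minimal sketch, assuming the standard Matlab `-r` startup option is available in the installed version; the helper names `startup_statement` and `start_dynamo_matlab` are hypothetical and not part of Dynamo or the EMBL setup.

```shell
#!/bin/sh
# Activation script path, taken from the text above.
ACTIVATION=/g/cryocourse/data/dynamo/activation

# Build the Matlab statement that runs the activation script.
# (Hypothetical helper, for illustration only.)
startup_statement() {
    printf "run('%s')" "$1"
}

# Start Matlab in the background and activate Dynamo on startup.
# Assumes 'module load MATLAB' has already been run in this terminal.
start_dynamo_matlab() {
    if ! command -v matlab >/dev/null 2>&1; then
        echo "matlab not found; did you run 'module load MATLAB'?" >&2
        return 1
    fi
    matlab -r "$(startup_statement "$ACTIVATION")" &
}

# Usage:
#   module load MATLAB
#   start_dynamo_matlab
```

If Matlab is not on the path, the function prints a reminder instead of failing silently.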
Own installation
If you prefer to install Dynamo on your laptop, please follow the instructions on Download. Please be aware that this setup will only work on Linux and Mac laptops; we don't have an updated Dynamo version for Windows. Matlab is not a prerequisite, but we strongly recommend using it if you have a licence (or feel motivated enough to use a free trial licence as provided by Mathworks).
Start Dynamo as standalone
In a system terminal, type:
source <DYNAMO_ROOT>/dynamo_activate_mac_shippedMCR.sh
Materials
Day 1
Basic principles
Guided presentation:
- Basic Dynamo jargon
- tutorial on basic elements: help, data and metadata formats.
- tutorial on the basic concept in Dynamo alignment: the project.
Working on your own:
- Basic walkthrough: creating a catalogue, picking particles, launching a project.
- Complete the advanced starters guide (~2 hours)
The data can be found in
/g/cryocourse/data/dynamo/crop.rec
The Chimera path you need in the tutorial is
/g/easybuild/x86_64/CentOS/7/haswell/software/Chimera/1.13-foss-2017b-Python-2.7.14/bin/chimera
- Further work:
Day 2
Geometric Modelling
Short guided presentation:
- tutorial on membrane modeling with dtmslice
- Filament models with dtmslice
- Reusing model workflows (walkthrough)
- Further work: catalogue
- In the afternoon, we will focus on the extraction of particles from densely packed spherical geometry (~1 hour)
The data is available at
/g/cryocourse/data/dynamo/v17.rec
PCA Based Classification
Walkthrough on PCA through the command line
Template matching
- Walkthrough for automated identification of proteasomes on a real tomogram through template matching. (~1 hour)
The data can be found in
/g/cryocourse/data/dynamo/t20s.mrc
Day 3
Interfacing with Other Tools
Dynamo is designed to allow maximum user flexibility and to encourage users to design their own solutions to the specific problems posed by their data.
Alister recently spent some time integrating Dynamo with Warp and M, so that you can use Dynamo for tilt-series alignment and particle picking and then plug the results into M's multi-particle refinement.
Final exercise
The main goal of this course is for you to be able to identify strategies for setting up an appropriate approach for your problem. To this end, on the last day we suggest a problem data set for you to work with. The data for this exercise can be found at:
/g/cryocourse/data/dynamo
Feel free to use this part as the basis for your final presentation on Friday.
Fiducial based alignment and reconstruction
These new features in Dynamo are at the testing stage.
Creation of 3D scenes
Working on your own:
- Walkthrough on the FHV data set (~1 hour)
Further support material.
- Walkthrough on depiction and manipulation of triangulations (synthetic data).
Running Alignment Projects on the GPU during the workshop
During the workshop, you are provided with a remote virtual desktop and you each have access to one GPU.
To set up a project for GPU computing in the dcp GUI, set the following parameters under computing environment:
- Environment - GPU (standalone)
- cpu cores - 1
- gpu identifiers - 0
- parallelized averaging step - 8 (you may adjust this for your system depending on the number of CPUs available)
You can check the number of CPU cores available to you from within Matlab with the following command (the leading ! passes the command to the system shell):
!lscpu | grep 'Core(s)'
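Note that the line matched by this command reports cores per socket, so on a machine with several sockets the total is the product of the two values. As a rough sketch (run from a system terminal rather than from Matlab), the total can be computed from the `lscpu` output as follows. The helper name `count_physical_cores` is hypothetical, and the sketch assumes the English field names "Core(s) per socket" and "Socket(s)".

```shell
#!/bin/sh
# Read lscpu-style text on stdin and print sockets x cores-per-socket.
# (Hypothetical helper, for illustration only.)
count_physical_cores() {
    awk -F: '
        /^Core\(s\) per socket/ { gsub(/[ \t]/, "", $2); cores = $2 }
        /^Socket\(s\)/          { gsub(/[ \t]/, "", $2); sockets = $2 }
        END {
            if (cores && sockets) print cores * sockets
            else exit 1   # fields not found (e.g. non-English locale)
        }
    '
}

# Only call lscpu if it is installed on this machine.
if command -v lscpu >/dev/null 2>&1; then
    LC_ALL=C lscpu | count_physical_cores \
        || echo "could not determine core count from lscpu" >&2
fi
```

The resulting number is a reasonable upper bound for the parallelized averaging step parameter.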
You can then press the check button to check that your project inputs make sense, followed by the unfold button.
Unfolding a project generates an executable for that project in the current directory. We run this executable from a terminal outside of Matlab, which allows us to continue working on other things in Matlab whilst the project is running.
To run the project, open a terminal and navigate to the directory containing your executable. You then run:
module load dynamo
followed by
./my_project.exe
to launch the alignment project.
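If you want to keep a record of what the project printed while running, the launch can be wrapped in a small function. This is only a sketch: `run_project` is a hypothetical helper, not a Dynamo command, and the log file name is an arbitrary choice.

```shell
#!/bin/sh
# Run an unfolded project executable and save its output to a log file.
# (Hypothetical helper, for illustration only.)
run_project() {
    exe=$1
    case $exe in
        */*) ;;            # already given as a path
        *) exe=./$exe ;;   # bare name: look in the current directory
    esac
    log=${exe%.exe}.log    # e.g. ./my_project.exe -> ./my_project.log

    if [ ! -x "$exe" ]; then
        echo "not found or not executable: $exe" >&2
        return 1
    fi
    # Capture both standard output and standard error in the log.
    "$exe" > "$log" 2>&1
}

# Usage (module name taken from the text above):
#   module load dynamo
#   run_project my_project.exe
#   tail my_project.log
```

Checking that the file exists and is executable first gives a clearer message than the shell's own error if you are in the wrong directory.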
Submitting jobs For GPU computing
Note: This should not be necessary during the workshop, but may give you an idea of how to use Dynamo on your own cluster later.
The following is applicable for subtomogram averaging projects set up to be run in the "gpu_standalone" mode under computing environment.
Once your project is unfolded, you should have an executable file
wizardTestProject.exe
The EMBL-HD cluster uses slurm for job submission and resource allocation.
To run this project as a job on their computing resources, we should first set up a submission script. Here is an example:
#!/bin/bash
#SBATCH -N 1                                # number of nodes
#SBATCH -n 1                                # number of cores
#SBATCH -o slurm.%N.%j.out                  # STDOUT
#SBATCH -e slurm.%N.%j.err                  # STDERR
#SBATCH --mail-type=END,FAIL                # notifications for job done & fail
#SBATCH --mail-user=alisterburt@gmail.com   # send-to address
#SBATCH -p gpu                              # select gpu usage
#SBATCH --gres=gpu:1                        # number of gpus (if using gpus)
module load dynamo
./wizardTestProject.exe
Save this text file in the same directory as the executable file (here it is named dynamo_test_gpu.sl), then submit the job using sbatch:
sbatch dynamo_test_gpu.sl
This should then confirm that the job has been submitted
Submitted batch job 65146729
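The job ID in the confirmation line is what you need to follow the job afterwards. As a small sketch, the ID can be extracted from the `sbatch` output and passed to `squeue -j`. The helper name `job_id_from_sbatch` is hypothetical; the confirmation format is the one shown above.

```shell
#!/bin/sh
# Read sbatch's confirmation line on stdin and print only the job ID.
# (Hypothetical helper, for illustration only.)
job_id_from_sbatch() {
    sed -n 's/^Submitted batch job \([0-9][0-9]*\).*/\1/p'
}

# Usage on a machine where Slurm is available:
#   jobid=$(sbatch dynamo_test_gpu.sl | job_id_from_sbatch)
#   squeue -j "$jobid"     # show the state of this job in the queue
```

Once the job has started, its output appears in the slurm.*.out and slurm.*.err files named in the submission script.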
Submitting jobs for CPU computing
The following is applicable for subtomogram averaging projects set up to be run in the "standalone" mode under computing environment.
Once your project is unfolded, you should have an executable file
wizardTestProject.exe
The EMBL-HD cluster uses slurm for job submission and resource allocation.
To run this project as a job on their computing resources, we should first set up a submission script. Here is an example using 6 cores on 1 node:
#!/bin/bash
#SBATCH -N 1                                # number of nodes
#SBATCH -n 6                                # number of cores
#SBATCH -o slurm.%N.%j.out                  # STDOUT
#SBATCH -e slurm.%N.%j.err                  # STDERR
#SBATCH --mail-type=END,FAIL                # notifications for job done & fail
#SBATCH --mail-user=alisterburt@gmail.com   # send-to address
module load dynamo
./wizardTestProject.exe
Save this text file in the same directory as the executable file (here it is named dynamo_test_cpu.sl), then submit the job using sbatch:
sbatch dynamo_test_cpu.sl
This should then confirm that the job has been submitted
Submitted batch job 65146755
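The GPU and CPU submission scripts differ only in the number of cores and in the two GPU-specific lines. As an illustrative sketch, both can be generated from one function. `write_submission_script` is a hypothetical helper, the mail notification lines are left out for brevity, and the partition name `gpu` is the one used in the example above and may differ on your own cluster.

```shell
#!/bin/sh
# Print a Slurm submission script for an unfolded project.
# Arguments: executable name, number of cores, mode ("gpu" or "cpu").
# (Hypothetical helper, for illustration only.)
write_submission_script() {
    exe=$1
    cores=$2
    mode=$3

    printf '%s\n' \
        '#!/bin/bash' \
        '#SBATCH -N 1' \
        "#SBATCH -n $cores" \
        '#SBATCH -o slurm.%N.%j.out' \
        '#SBATCH -e slurm.%N.%j.err'

    # Only GPU jobs need the partition and GPU resource request.
    if [ "$mode" = gpu ]; then
        printf '%s\n' '#SBATCH -p gpu' '#SBATCH --gres=gpu:1'
    fi

    printf '%s\n' 'module load dynamo' "./$exe"
}

# Usage:
#   write_submission_script wizardTestProject.exe 6 cpu > dynamo_test_cpu.sl
#   write_submission_script wizardTestProject.exe 1 gpu > dynamo_test_gpu.sl
```

Keeping the shared lines in one place makes it harder for the two scripts to drift apart when you adapt them.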