Difference between revisions of "Starters guide (workshop version)"
Revision as of 12:33, 17 August 2021
This tutorial shows how to manually pick particles in a tomogram, extract them, align them and classify them. It is a shortened version of the original starters guide, adapted for use in Dynamo workshops.
- 1 Manual particle picking and Dynamo catalogues
- 2 Extracting particles from tomograms
- 3 Subtomogram alignment and averaging
- 4 Subtomogram classification
Manual particle picking and Dynamo catalogues
The Dynamo catalogues are databases that manage tomograms and link the tomographic data to the extracted particles. Start the catalogue manager by typing the command:
After the catalogue manager opens, we create 3 synthetic tomograms that include our particles (thermosomes) in the following way:
You can see the list of your tomograms and their metadata in a table at the bottom of your catalogue manager. When working on your own projects, you can add tomograms to the catalogue with Catalogue -> Browse for new volume. The tomograms are located in the directory testCatalogue. They contain a small amount of noise and a missing wedge associated with rotation around the Y-axis. We added the information about the location of the particles in the tomograms to an additional catalogue called testCatalogue_withmodels. If needed, it can be opened from the current folder with Catalogue -> look for local catalogues in the catalogue manager window.
Select the first tomogram in the list with a secondary mouse click and go to View volume -> Full tomogram file in tomoslice.
The volume browser tomoslice loads the entire tomogram into memory and lets you annotate regions of interest. The tomograms in this tutorial are small, so you can load them directly into memory. For real-life tomograms, bear in mind that you will need to prebin the files before loading them into memory. Tomoslice has a simple set of controls and is suitable for visualization tasks that require oblique sections through the tomogram. It uses the same tool as other Dynamo browsers to keep track of your annotations: a pool of models. You might need to adjust the contrast (blue arrow). To move through the tomogram slices, you can use the mouse wheel, click and drag the tomogram slice up and down (orange arrows), or move the position control left and right (orange box).
Picking and extracting particles in tomograms
Coordinates of picked particles are represented by data types called models. In the tomoslice window go to Model Pool -> Create new model in pool (choose type) -> General. This is the simplest type of model, where each clicked model point corresponds to a single isolated particle. Now you can navigate up and down the tomogram, place the mouse on the center of a particle and press the [c] key on your keyboard to add a new model point. (Note the Help -> all hot keys option, which lists the actions of the different keystrokes.) The Backspace key deletes the last clicked point (you can also right-click to delete single points).
After you have clicked about 10 particles, click on Active model -> Update Crop points in the tomoslice window.
Save the model into the catalogue by Active model -> Save active model into catalogue (disk) and close the slicer window.
Pick particles for tomograms 2 and 3. In total, there are around 30-40 particles in the generated dataset. When you open a new tomogram, make sure that you delete the pool of models from memory when asked. This has no effect on the models stored on disk, and it is necessary to ensure that you do not mix models from different tomograms.
An alternate visualization: orthogonal projections
In the menu, click Projection -> project full shown fragment along z to get a screen showing the x-y projection of the tomogram. Use a secondary click on it to launch the orthogonal views of the x-z and y-z planes that traverse that point. These views can also be used to click particles in case the standard view is not sufficient.
Extracting particles from tomograms
In the catalogue manager window, select the rows of the tomograms from which you want to extract particles. You can either select them one by one with a mouse click while holding the [ctrl] key, or click the Select all button. Then go to Crop Particles -> Open Volume List Manager.
A new window opens with all the models in the catalogue listed at the bottom. Pick (by checking boxes) the models of type general that you have just clicked. Click Create list and then Crop particle.
In the new window, change the data name to something meaningful, such as thermosomeParticles, change the sidelength from 32 to 48 and click start cropping.
After cropping is done, explore your cropped particles by clicking ddbrowse (in the window above). A new window opens where you can simply click on show.
You see a 2D representation (projections) of all cropped 3D subtomograms. Make sure your particles are well centered and fit the box.
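As a rough illustration of what cropping with a given sidelength involves, the sketch below computes the index range of a cubic box around a picked point and rejects boxes that would extend beyond the tomogram. It is plain Python and independent of Dynamo. The function name `crop_bounds`, the zero-based indexing, the centring convention and the tomogram size are all assumptions made for this example, not Dynamo's actual conventions.

```python
def crop_bounds(center, sidelength, volume_shape):
    """Return per-axis (start, stop) index pairs of a cubic box around `center`.

    Returns None when the box would extend beyond the volume.
    Indices are zero-based and `stop` is exclusive (an assumption of this sketch).
    """
    half = sidelength // 2
    bounds = []
    for coordinate, size in zip(center, volume_shape):
        start = coordinate - half
        stop = start + sidelength
        if start < 0 or stop > size:
            return None  # box does not fit: the particle is too close to the edge
        bounds.append((start, stop))
    return bounds


# Hypothetical tomogram size and picked points, for illustration only.
tomogram_shape = (200, 200, 100)
picked_points = [(100, 100, 50), (10, 100, 50)]

for point in picked_points:
    print(point, crop_bounds(point, 48, tomogram_shape))
```

In this sketch the second point lies too close to the edge for a box of sidelength 48, so it is reported as `None` rather than cropped.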
Subtomogram alignment and averaging
Initial model generation
We want to align the extracted particles to a common reference. For that, we need an initial reference (which will be refined during the iterative alignment). Here, we create such an initial reference by manually aligning a few particles and averaging them.
Loading particles in dgallery
For initial model generation we will use dynamo_gallery to generate initial orientations for some of our particles. Type
>> dgallery('data', 'exampleData.Data');
assuming you named your data folder exampleData while cropping.
- Load particles into memory by clicking “load” in the Load from disk field on the left of the window. Note that at first, the scene will show only one particle.
- Display some or all of the particles with the slider in the Shown particles field on the top of the window.
- Toggle between X-, Y- and Z-views in the View: orientation field.
Manual alignment of particles in dgallery
Now, for each particle, you can specify the center with a mouse click and the “C” button. Thermosomes are barrel-like particles with a long axis, so you may also specify the “top” of the particle by clicking along its long axis (but still within the box) and then clicking “N” (which stands for North). Do this for all the particles, alternating between the X-, Y- and Z-views. Non-aligned particles are marked red; once a particle has been left-clicked, it is added to the selection in memory and turns blue. To remove a particle from the selection, right-click on its box.
Selecting particles in dgallery
You can save the selected tags and a corresponding Dynamo table by clicking the “quick save” button in the Particle selection field at the top of the window. This saves the files quickbuffer.tbl and quickbuffer.tags to disk; you will use them later.
Averaging selected particles
To generate the average you need to “apply” the table to the particles. To do this, click the “average” button in the Particle selection field. It opens dynamo_average_GUI, which has many controls; we only need to set the output filename in the Averaged volume field to my_average.em. Click “compute average” at the bottom of the window and wait until it is done. Right-click on the output filename, select the [view] simple 3d depiction of all slices option, and examine the average in the X-, Y- and Z-views. If you are not satisfied with the result, close the window and refine your manual alignment or add more particles to the average.
Iterative alignment of subtomograms to the average is performed by Dynamo projects, which you can run in various high-performance computing environments. A project requires particles, an initial reference and a table, all of which you have already generated. Start a WIZARD in the Dynamo command window or type
in Matlab (dcp stands for dynamo current project). Type the project name “drun1” and press Enter; Dynamo will generate the auxiliary files in a folder called drun1.
Provide the input in the popup windows: the folder with the particles (and press OK) and the initial table particles/crop.tbl. You can also provide quickbuffer.tbl, which holds the initial orientations from the manual alignment in dgallery, but remember that this table contains only the selected particles. If you don’t have a table, you can generate a blank or random table that is consistent with your particle folder but lacks metadata such as the real missing wedge values or the source tomograms. Go to “template” and input my_average.em. Go to “Masks”; you can provide several masks for different purposes, but for now we are only interested in the alignment mask. Use an ellipsoid; you can specify its semi-axes or go to “Mask editor” to make more sophisticated masks. Note that Dynamo by default uses Roseman correlation, which eliminates the artefacts associated with hard (non-soft) masks. As usual, right-click on the mask that you generate to open it with dview. For this type of simulated data, “use default masks” will work.
The Numerical parameters menu gives you flexibility in how to run your alignment. You can specify different parameters for different rounds of refinement. Select any value in the table with the mouse and click the “?” button at the top right of the parameter window to get more information. For more parameters, check the boxes on the left. The most important parameters to consider are:
- Number of iterations in each round. Make 2 rounds with 3-4 iterations in each. The first round will be a global search with a coarse angular step, and the second a constrained search with a finer step. If you decided to use a table with prealigned particles (quickbuffer.tbl), you may skip the first round.
- Angular search ranges: cone aperture is the scan range for the first two Euler angles, sampled with a step of cone sampling and starting from the angles currently stored in the table. 360 is the full scan range. Azimuth rotation range defines the rotation range around the new vertical axis of the particle.
- High- and lowpass values in Fourier voxels limit the used frequency range. It can be autotuned (below in this manual), here use some reasonable values like half-Nyquist.
- Particle dimensions: use the sidelength of your box. If you enter a lower value, the particles will be downsampled for that particular round, which speeds up the process.
- Refine: after each angular scan, the search step is reduced by a factor of refine factor, and this is repeated refine times. For example, if your cone sampling is 10 degrees and refine and refine factor are 3 and 2, then steps of 10, 5, 2.5 and 1.25 degrees will be sampled. This optimizes the angular search space. We typically use 2 and 2.
- Shift limits restrict the translation of particles from the center of the box (if shifts limiting way is 1) or from the previously estimated center (if shifts limiting way is 2). The previous estimates for the shifts and rotations are taken from the input tables and are updated at the end of each iteration.
- Symmetry – if you know the symmetry of your protein it will speed up the convergence and result in higher resolution. Here we don’t make any assumptions.
- Threshold parameters and modus specify which particles contribute to the average at the end of each iteration. Our typical values are [1,2] and [0.5,5].
There are predefined parameter sets for Global and Local searches (Predefined profiles -> Global search / Refinement). After you set up the parameters press OK or close the window.
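The arithmetic behind two of the parameters above can be sketched in a few lines of plain Python. The first function reproduces the sequence of angular steps described for refine and refine factor. The second converts “half-Nyquist” into Fourier pixels, assuming that Nyquist corresponds to half the box sidelength. Both function names are hypothetical helpers written for this illustration and are not part of Dynamo.

```python
def refinement_steps(cone_sampling, refine, refine_factor):
    """List the angular steps sampled: the initial step plus `refine` reductions."""
    steps = [float(cone_sampling)]
    for _ in range(refine):
        # Each refinement divides the previous step by the refine factor.
        steps.append(steps[-1] / refine_factor)
    return steps


def half_nyquist_fourier_pixels(sidelength):
    """Half of Nyquist in Fourier pixels, assuming Nyquist = sidelength / 2."""
    return sidelength // 4


print(refinement_steps(10, 3, 2))       # example from the text: 10, 5, 2.5, 1.25 degrees
print(refinement_steps(10, 2, 2))       # the "typical" setting of 2 and 2
print(half_nyquist_fourier_pixels(48))  # box sidelength used in this tutorial
```

Under these assumptions, a box of sidelength 48 gives a half-Nyquist value of 12 Fourier pixels.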
Go to Computing environment in the Wizard and select Matlab-CPU. Click “Info (nvidia)”; it should print a table in your Matlab window, with your GPU identifiers in the left column. Enter those identifiers in the “identifiers” field.
Running the project
Go back to the Wizard window and click “check” and “unfold”; this generates a runnable Matlab script, drun1.m. If you modify your project and want to run it again, you need to unfold the project again before each run.
- In a Matlab session you can just click on “run”.
- In a standalone project it is more convenient to open a new terminal, activate Dynamo in it and run the project by invoking the name of the project script that was generated by the unfolding step:
Open another Matlab window, activate Dynamo, go to the same folder and monitor the progress of the execution by typing >> dvstatus drun1. After several iterations you can look at intermediate results by typing:
>> ddb drun1:a -v
>> ddb drun1:a:ite=* -j c10
For more information on ddb, type >> help ddb. At the end of your iterations you can inspect the results in the show results section of the Wizard; you should get something like this:
First generate a dataset which has 2 populations of particles. For this, type:
in your Matlab window. It generates two sets of particles, 8 particles in each and stores their real orientations in the directory data_classification.
The idea behind multireference alignment (MRA) is to iteratively align the particles to several references and then assign each particle to the reference it fits best. Over the iterative refinement, similar particles will correlate more strongly with similar references and will eventually group together. MRA is implemented natively in Dynamo. For classification into N classes we need N initial references and N initially identical tables, which Dynamo generates automatically. The references should be slightly different in order to start differentiating the particles; in Dynamo we add a small amount of white noise to the overall average of all particles.
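The assignment step of this idea can be illustrated with a toy sketch in plain Python. It is not Dynamo code, and it leaves out the alignment itself: no rotations or shifts are searched. The “particles” are short 1D profiles from two populations, and the two seed references are the overall average plus small fixed perturbations that stand in for the white noise. All names and values are assumptions made for this example.

```python
import math


def correlation(a, b):
    """Normalized cross-correlation of two equal-length profiles."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denominator = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / denominator if denominator else 0.0


def average(profiles):
    """Voxel-wise average of a list of profiles."""
    return [sum(values) / len(profiles) for values in zip(*profiles)]


def classify(particles, seeds, iterations=3):
    """Assign each particle to its best-correlating reference, then re-average."""
    references = [list(seed) for seed in seeds]  # do not modify the caller's seeds
    labels = []
    for _ in range(iterations):
        labels = [
            max(range(len(references)), key=lambda k: correlation(p, references[k]))
            for p in particles
        ]
        for k in range(len(references)):
            members = [p for p, label in zip(particles, labels) if label == k]
            if members:  # keep the old reference if a class is empty
                references[k] = average(members)
    return labels, references


# Two toy populations: a narrow and a wide profile, each with one noisy variant.
small, small_variant = [0, 0, 1, 1, 0, 0], [0, 0, 1, 0.9, 0.1, 0]
large, large_variant = [0, 1, 1, 1, 1, 0], [0, 1, 0.9, 1, 1, 0.1]
particles = [small, small_variant, small, large, large_variant, large]

# Seeds: overall average plus small fixed perturbations (stand-in for white noise).
overall = average(particles)
perturbations = [
    [0.02, -0.05, 0.03, 0.01, -0.04, 0.0],
    [-0.01, 0.04, -0.02, 0.02, 0.05, -0.03],
]
seeds = [[m + n for m, n in zip(overall, noise)] for noise in perturbations]

labels, class_averages = classify(particles, seeds)
print(labels)
```

With these particular seeds the narrow profiles end up in one class and the wide profiles in the other. With different perturbations the class numbering may swap, or one class may stay empty, which is one reason the seed references need to differ from each other.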
Creating MRA projects
Create a new project by typing “drun2” in the project name field of the Wizard. Set the number of references to 2, switch Swap particles on, and enter the particle data.
Seeds for MRA projects
- Go to “table” and type data_classification/real.tbl into the “clone this” field and press “copy”, press “OK”.
- Go to “template” and clone data_classification/original_template.em, adding some extra noise with amplitude 1.
- Initialize the masks with the default parameters (large button at the bottom).
Go to “numerical parameters” and use Predefined profiles -> refinement. Increase the angular search steps to 5, set a thresholding policy and press “OK”. Set the computing environment to GPU (under Matlab) and specify the GPU identifiers.
Execution of projects
Check, unfold and run the project. It will take about twice as long to run, because each particle is aligned to two references. Monitor the progress of your project with
>> dvstatus drun2
and visualize the results as
>> ddb drun2:a:ref=* -j c5
You should get something like this: two averages, with the particle in one class larger than in the other.