Walkthrough for template matching
Dynamo includes a set of tools for locating particles inside tomograms. The simplest one is template matching.
- 1 Template matching
- 2 Data set
- 3 Creating a template
- 4 Creating a cross correlation process
- 5 Locating cross correlation peaks
- 6 Visual evaluation of results
- 7 Getting the tables back into the catalogue
Template matching
In this technique, a template representing a molecule of interest is systematically cross-correlated against a tomogram, producing a cross-correlation map of the tomogram. Each pixel in this map holds a score assigned to the corresponding pixel in the tomogram. This score measures how similar the neighbourhood of that tomogram pixel is to the template. The similarity is measured exclusively inside a mask.
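As a minimal sketch of what such a score can look like, the snippet below computes a normalized cross-correlation restricted to a mask for a single neighbourhood, using only base MATLAB. The toy volumes, the spherical mask and the helper name maskedCC are illustrative assumptions; this is not Dynamo's implementation, which also scans over orientations and accounts for the missing wedge.

```matlab
% Toy example: score one tomogram neighbourhood against a template,
% using only the voxels inside a mask (all data here is synthetic).
[x, y, z] = ndgrid(1:8, 1:8, 1:8);
template  = sin(x) .* cos(y) + z / 8;      % hypothetical template density
patchSame = 2 * template + 1;              % same shape, different gain and offset
patchOther = cos(3 * x) + sin(2 * z);      % unrelated density

% Spherical mask centred in the box (logical array)
mask = ((x - 4.5).^2 + (y - 4.5).^2 + (z - 4.5).^2) <= 3.5^2;

% Normalized cross-correlation of a and b over the masked voxels only
maskedCC = @(a, b, m) ...
    sum((a(m) - mean(a(m))) .* (b(m) - mean(b(m)))) / ...
    sqrt(sum((a(m) - mean(a(m))).^2) * sum((b(m) - mean(b(m))).^2));

ccSame  = maskedCC(patchSame,  template, mask);   % close to 1
ccOther = maskedCC(patchOther, template, mask);   % lower score
```

Because the score is normalized inside the mask, a neighbourhood that differs from the template only by contrast and offset still scores close to 1.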
Data set
The tomogram shows T20S proteasome in buffer on a holey carbon grid, collected on a Krios + K2. The original pixel size was 1.76 angstroms. The tomogram provided here has been binned twice (yielding 7.04 angstroms per pixel); the defocus is 4.4 microns, with no CTF correction and no energy filter.
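The pixel size quoted above follows from the binning, assuming each binning step halves the sampling in every direction. The variable names are illustrative only.

```matlab
% Pixel size after binning, assuming each binning step doubles the pixel size
apixOriginal = 1.76;                           % angstroms per pixel at acquisition
timesBinned  = 2;                              % the provided tomogram is binned twice
apixProvided = apixOriginal * 2^timesBinned;   % 7.04 angstroms per pixel
```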
The tomogram has been kindly provided by Alex Noble, from the New York Structural Biology Center. Data collection was performed using Leginon and Appion-Protomo at the Simons Electron Microscopy Center and National Resource for Automated Molecular Microscopy located at the New York Structural Biology Center, supported by grants from the Simons Foundation (349247), NYSTAR, and the NIH National Institute of General Medical Sciences (GM103310) with additional support from Agouron Institute [Grant Number: F00316].
Getting the tomogram
Visualizing the tomogram
We can get a first impression of what the tomogram looks like:
dtmshow -otf t20s.mrc
to use the on-the-fly access to the slice shown at each given moment, or
v = dread('t20s.mrc');dtmshow(v);
to preload the full tomogram into a memory variable (arbitrarily called v). With either option, you will see that the proteasomes are densely packed in a layer. The layer is slightly oblique, which can be seen by browsing through z or y.
When navigating on-the-fly, you'll see that transitions in y are slower than transitions in z, because all the pixels of the same z slice are stored sequentially on disk.
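The difference in access speed can be illustrated with linear indices. The sketch below assumes that x varies fastest and z slowest in the file, as in a column-major MATLAB array, and takes the volume size from the tile ranges in the template matching log of this walkthrough. It only counts positions; it does not read the file.

```matlab
% Assumed storage order: x fastest, then y, then z (column-major)
volSize = [958 926 214];                       % [nx ny nz]

% A z slice occupies one contiguous run of nx*ny voxels
firstZ = sub2ind(volSize, 1, 1, 100);
lastZ  = sub2ind(volSize, volSize(1), volSize(2), 100);
zSpan  = lastZ - firstZ + 1;                   % equals nx*ny

% A y slice is scattered: nz separate runs of nx voxels each
firstY = sub2ind(volSize, 1, 100, 1);
lastY  = sub2ind(volSize, volSize(1), 100, volSize(3));
ySpan  = lastY - firstY + 1;                   % distance covered in the file
nRunsY = volSize(3);                           % number of separate reads
```

Under this assumption, showing a z slice needs a single sequential read, while showing a y slice needs one short read per z position spread over almost the whole file.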
Estimation of the missing wedge
Creating a template
There are different strategies to create the first template. When the general shape of each protein is roughly recognisable by eye, it is not difficult to manually crop and align some of the particles. When this is not possible, you have the option of using a density map that mimics the general topology of your protein.
Through manual alignment
Manual selection of some particles
For this we use our tool dtmslice. As the tomogram provided is fairly small, you can probably just open it without any further binning.
dtmslice t20s.mrc -cc ct20
Manual alignment of some particles
Averaging manually aligned particles
Creating a tight mask
Cropping the borders of a template
Through geometrical shapes
Alternatively, you can use dynamo_mask or dynamo_tube to create a synthetic model.
Creating a cross correlation process
ptm = dynamo_match(fileFull,'average48.em','mask','maskData48.em',...
    'outputFolder','cs30',...
    'ytilt',[-39,36],'sc',[1000,1000,300],'cr',360,'cs',30,'bin',1);
------------------------------------------------------------
Template matching process.
computing in CPU
Output folder: cs30.TM
A total of 1 tiles have been created
 - Mb per tile (reading) : 724.19
 - Mb per tile (operation) : 663.84
... initializing output elements
Preparing to run on 1 blocks
Running on single processor
Computing tile 1
 -Range (original block): x[1:958] y[1:926] z[1:214]
 -Range (binned block) : x[1:479] y[1:463] z[1:107]
... tile 1 finished in 387 seconds (8 for setup; 7.73 per triplet) [ok]
... template matching process completed
upto creation of cross correlation
you can proceed to peak location and particle extraction.
------------------------------------------------------------
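The two block ranges in the log are related by the 'bin' parameter. The short check below assumes that one level of binning halves each dimension and doubles the pixel size. The variable names are illustrative.

```matlab
% Relating the original and binned block ranges reported in the log
origSize   = [958 926 214];                    % x, y, z extent of the original block
binLevel   = 1;                                % value passed as 'bin'
binnedSize = floor(origSize / 2^binLevel);     % [479 463 107], as in the log

% Pixel size of the binned volume used for the cross correlation
apixTomogram = 7.04;                           % angstroms per pixel of t20s.mrc
apixBinned   = apixTomogram * 2^binLevel;      % 14.08 angstroms per pixel
```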
The Process object
The output of the function, ptm, is an object of class XXX. In short, this is an object created
Considerations when creating a process
Locating cross correlation peaks
Looking at the cross correlation map
Looking at the cross correlation profile
Extracting a table
A table can be extracted through:
myTable = pff.peaks.computeTable('mcc',0.378);
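The idea behind the threshold can be sketched on a toy one-dimensional cross-correlation profile. This assumes that 'mcc' acts as a minimum cross-correlation value below which peaks are discarded. The numbers are invented and the loop is not Dynamo's peak search.

```matlab
% Toy 1D cross-correlation profile (synthetic values)
ccProfile = [0.10 0.25 0.41 0.30 0.12 0.39 0.20 0.36 0.15];
threshold = 0.378;                             % same value as passed to 'mcc' above

% Mark interior local maxima
isPeak = false(size(ccProfile));
for k = 2:numel(ccProfile) - 1
    isPeak(k) = ccProfile(k) > ccProfile(k - 1) && ccProfile(k) >= ccProfile(k + 1);
end

% Keep only the local maxima that reach the threshold
keep = find(isPeak & ccProfile >= threshold);  % indices 3 and 6
```

Raising the threshold removes weaker peaks (here the local maximum of 0.36 at index 8), trading false positives against missed particles.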
Visual evaluation of results
Looking at table positions
Looking at cropped particles
We can check what the individual particles look like in gallery mode:
This command just opens ddbrowse. Here we use the support object peaks, but this command is equivalent to just invoking ddbrowse
Looking at averages
o = pff.peaks.average();
Getting the tables back into the catalogue
You don't need to operate with the catalogue: you could just use dtcrop on the newly computed table to extract particles from the original tomogram into a data folder. However, cataloguing has two advantages:
- it makes it easier to keep track of all the steps you've performed, especially if you are going to use several tomograms, and
- it allows you to visualize the peaks on the tomogram interactively, so that you can delete false positives or add peaks that were not located by the template matching.
Rescaling the table
Remember that the table that we are working with has the scale of the auxiliary binned volume that has been created along with the CC matrix. In the catalogue, we want to work with the particles in their original scale.
tableOriginalScale = dynamo_table_rescale(myTable,'factor',2);
In the syntax of dynamo_table_rescale, the factor expresses how many times larger the apix of the original table is than the apix of the target table to be computed. In our case, the original table was computed with an apix of about 14 Angstrom (binned once relative to the tomogram t20s.mrc, which has an apix of 7.04 Angstrom). The factor is thus 2. In case of doubt, it is convenient to just run
to check the extent of the entries in the columns 24, 25 and 26, which are positions (in pixels) of the particles indexed by the table. Now, if we write the upscaled table into a file
we can just put this table back into the catalogue through:
dmimport -t peaks.tbl -c ct20s -i 1 -mn ccpeaks
which will create a model called ccpeaks and assign it to the first volume (-i 1) in the catalogue ct20s. You can check it by writing:
dcm -c ct20s -i 1 -l m
which asks for a listing of models (-l m) in the first volume of the catalogue.
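Returning to the rescaling step, the sketch below shows the idea on a synthetic table, using base MATLAB only. It assumes a table width of 35 columns and treats rescaling as a plain multiplication of the position columns 24 to 26. The real dynamo_table_rescale may handle other columns and sub-pixel conventions differently. All particle positions are invented.

```matlab
% Synthetic table: 3 particles, 35 columns (assumed width)
tableBinned = zeros(3, 35);
tableBinned(:, 24:26) = [100 200  50;
                         250 400  80;
                         479 463 107];         % positions in binned pixels

factor = 2;                                    % apix of binned table / apix of target

% Naive rescaling of the position columns only (illustration, not Dynamo's code)
tableFull = tableBinned;
tableFull(:, 24:26) = factor * tableBinned(:, 24:26);

% Extent of the positions before and after: rows are [min; max] per axis
extentBinned = [min(tableBinned(:, 24:26), [], 1); max(tableBinned(:, 24:26), [], 1)];
extentFull   = [min(tableFull(:, 24:26),   [], 1); max(tableFull(:, 24:26),   [], 1)];
```

If the factor is right, the maxima of the rescaled positions should stay within the dimensions of the original tomogram (here 958, 926 and 214 pixels).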