Prebinned tomograms

From Dynamo

Latest revision as of 15:19, 28 April 2017

Prebinned tomograms are files that contain binned versions of a tomogram. The filename of a prebinned tomogram must be derived from that of the original tomogram following a specific convention.

Prebinning the tomograms in a catalogue before starting to annotate them in dtmslice is strongly advised.

Motivation

Quick access in viewers

Big tomograms are difficult to fit in memory. Even when they fit, visualizing them can be difficult in browsers such as dtmslice, which load volumetric data into memory. For this reason, it is frequently necessary to keep one or several binned versions of the same tomogram on disk. Some programs in Dynamo can use a prebinned tomogram as a proxy for the full-resolution tomogram, keeping track of all the coordinate conventions. The most used one is dtmslice. A tomogram with original side lengths of 4000 x 4000 x 800 pixels will probably make the interaction with dtmslice sluggish and inconvenient. It is more convenient to send dtmslice a 2 times binned tomogram, with side lengths of 1000 x 1000 x 200 pixels, which can be comfortably navigated. Annotations are automatically kept in the scale of the original tomogram.

Better depiction

Full-resolution tomograms are quite noisy. Binning them makes navigating through a tomogram much easier and makes structures easier to distinguish with the naked eye. Skipping the prebinning and viewing your tomograms at full resolution will not only give you a rather painful experience with dtmslice; it will most probably bring no advantage either, and you will have to bin the tomograms anyway.


Storage considerations

Storing prebinned tomograms is normally not a problem in terms of disk space: a 2x binned tomogram has 64 times fewer voxels than the original one.

Convention

Naming

The file that represents a prebinned version of another one has to be named:

originalFileName_CatBinnedX.originalFileExtension

where X is the number of times that a binning with bin size 2x2x2 voxels has been applied.

For instance, the 2 times binned version of

tomogram.mrc
needs to be called 
tomogram_CatBinned2.mrc 

Both files must reside in the same directory so that Dynamo can find the prebinned version of a given tomogram file.
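The naming rule can be sketched as a small helper. This is an illustrative Python snippet, not part of Dynamo; the function name prebinned_name is hypothetical, and it only encodes the convention stated above (same directory, suffix _CatBinnedX inserted before the original extension).

```python
from pathlib import Path


def prebinned_name(original, factor):
    """Return the path a prebinned copy of `original` is expected to have."""
    if factor < 1:
        raise ValueError("binning factor must be a positive integer")
    path = Path(original)
    # Insert the _CatBinnedX suffix between the base name and the extension,
    # keeping the directory of the original file.
    return path.with_name(f"{path.stem}_CatBinned{factor}{path.suffix}")


print(prebinned_name("tomogram.mrc", 2))  # tomogram_CatBinned2.mrc
```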

Binning factor

The binning factor expresses the number of single binnings applied consecutively to a tomogram. Each binning collapses a cube of 2x2x2 voxels of the input file into a single voxel of the output file. If the original tomogram has a side length of L voxels along one dimension, the 1x binned tomogram will have L/2 voxels along that dimension, the 2x binned one will have L/4, etc.
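As a worked example of this arithmetic, the following Python sketch computes the side lengths and the voxel reduction for a given binning factor. The function names are hypothetical, and the rounding down of side lengths that are not multiples of the block size is an assumption, not documented Dynamo behaviour.

```python
def binned_shape(shape, factor):
    """Side lengths after `factor` consecutive 2x2x2 binnings."""
    scale = 2 ** factor
    # Assumption: voxels that do not fill a complete block are dropped.
    return tuple(side // scale for side in shape)


def voxel_reduction(factor):
    """How many original voxels collapse into one binned voxel."""
    return 8 ** factor


print(binned_shape((4000, 4000, 800), 2))  # (1000, 1000, 200)
print(voxel_reduction(2))                  # 64
```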


Creation

Binning tomograms can be a time-intensive procedure. Attempting to read a full-resolution tomogram into memory in one step can easily block the computer. Binning programs in Dynamo therefore load into memory only small slabs (i.e., sets of consecutive z-slices) of the original volume, bin each slab and then write the binned slab into the file that holds the binned tomogram. If the slab size is too small, the process can be rather long. If the slab size is too large, you risk exhausting the memory. If the binning is performed with several processors, each one reads a different slab in parallel, so you need to take into account the total memory in use at any given time.

The default slab size is 50, which usually works reasonably well. The different methods to prebin tomograms provide different levels of control over the way the binning is computed.
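The slab-wise strategy can be illustrated with a minimal pure-Python sketch. It is not Dynamo's implementation: the reader callback read_slices is hypothetical, a single 2x2x2 binning is performed, voxels are combined by averaging, and an unpaired last slice is dropped, all of which are assumptions made for the example.

```python
def bin_slab(slab):
    """Average each 2x2x2 cube of a slab given as nested lists slab[z][y][x]."""
    nz, ny, nx = len(slab) // 2, len(slab[0]) // 2, len(slab[0][0]) // 2
    out = []
    for z in range(nz):
        plane = []
        for y in range(ny):
            row = []
            for x in range(nx):
                total = sum(
                    slab[2 * z + dz][2 * y + dy][2 * x + dx]
                    for dz in (0, 1) for dy in (0, 1) for dx in (0, 1)
                )
                row.append(total / 8)
            plane.append(row)
        out.append(plane)
    return out


def bin_in_slabs(read_slices, n_slices, slab_size):
    """Bin a volume once, holding at most `slab_size` z-slices in memory.

    `read_slices(start, stop)` is a hypothetical reader returning slices
    start..stop-1 as nested lists; binned slabs are yielded one at a time.
    """
    if slab_size < 2 or slab_size % 2:
        raise ValueError("slab_size must be a positive even number")
    usable = n_slices - n_slices % 2  # assumption: drop an unpaired last slice
    for start in range(0, usable, slab_size):
        stop = min(start + slab_size, usable)
        yield bin_slab(read_slices(start, stop))


# Tiny demonstration: a 4x4x4 volume whose voxel value equals its z index.
volume = [[[float(z)] * 4 for _ in range(4)] for z in range(4)]
binned = []
for slab in bin_in_slabs(lambda a, b: volume[a:b], n_slices=4, slab_size=2):
    binned.extend(slab)  # a real tool would append to the output file instead
print(len(binned), len(binned[0]), len(binned[0][0]))  # 2 2 2
print(binned[0][0][0], binned[1][0][0])                # 0.5 2.5
```

Requiring an even slab size keeps every 2x2x2 cube inside a single slab, so slabs can be processed independently (and, in principle, by different workers).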

Through the catalogue

The catalogue GUI lets you select a binning factor that will be applied to all the tomograms in the catalogue. If you add new tomograms and use this option again later, the tomograms that were already prebinned are by default not processed again.

Binning options in the dcm GUI

The GUI lets you change the parameters used for the binning:

Opening the binning options

The most important options are the slab size and the number of processors acting in parallel (called MATLAB workers). If the slab size is set to zero, the size is determined at runtime so as not to exceed the maximum number of megabytes in memory. This limit is specified per processor.

Editing the binning options
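To get a feeling for how slab size, memory limit and number of workers interact, the following Python sketch estimates both quantities. It is only a back-of-the-envelope model: the function names are hypothetical, 4 bytes per voxel is an assumed data type, and the way Dynamo actually derives the slab size at runtime is not described here.

```python
def slab_size_for_memory(nx, ny, max_mb_per_worker, bytes_per_voxel=4):
    """Largest number of z-slices whose raw data fits in the per-worker limit."""
    bytes_per_slice = nx * ny * bytes_per_voxel
    slices = (max_mb_per_worker * 1024 ** 2) // bytes_per_slice
    return max(int(slices), 1)  # always read at least one slice


def total_memory_mb(nx, ny, slab_size, workers, bytes_per_voxel=4):
    """Approximate memory held by all workers reading slabs at the same time."""
    return nx * ny * slab_size * bytes_per_voxel * workers / 1024 ** 2


print(slab_size_for_memory(4000, 4000, 2000))      # 32
print(round(total_memory_mb(4000, 4000, 50, 10)))  # 30518
```

Under these assumptions, ten workers each holding a slab of 50 slices of a 4000 x 4000 tomogram would need roughly 30 GB in total, which shows why the limit is best reasoned about per processor.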

Through dtmslice

When you open a tomogram with dtmslice, you can ask to open the prebinned version:

dtmslice myTomogram.mrc -pb 2 -c myCatalogue

will open a prebinned version of the tomogram myTomogram.mrc. If the file myTomogram_CatBinned2.mrc does not exist, it will be created on the fly. The other flag, -c myCatalogue, merely declares the catalogue in which we are opening the tomogram, i.e., where the models created during this session will be stored.

When prebinned tomograms are created this way, a slab size of 50 is used by default.

Command line

Dedicated tools

The easiest way is dpktomo.prebin.create, to which you just pass the filename of the full-resolution file and the binning factor:

dpktomo.prebin.create(filename,2);

This uses the default slab size of 50 and stores the output file following the naming convention.

Generic tools

The generic binning tool dpktomo.tools.bin allows for more flexibility in the input of options:

dpktomo.tools.bin('tomogram.mrc','tomogram_CatBinned2.mrc',2,'ss',100)
dpktomo.tools.bin('tomogram.mrc','tomogram_CatBinned2.mrc',2,'ss',16,'mw',10)

In this case, you need to pass the correct name of the target file yourself. In the example, for a file called 'tomogram.mrc', we need to follow the convention for prebinned tomograms and specify the output as 'tomogram_CatBinned2.mrc'.

Use

In dtmslice

Prebinned tomograms are mostly used to feed dtmslice. In fact, full-resolution (non-binned) tomograms normally overwhelm this browser, forcing you to use dpreview for on-the-fly binning or to feed only blocks into dtmslice. We recommend using prebinned tomograms instead. As they are computed only once (and incur no noticeable disk storage cost), they keep the workflow swift. Slices of 1000x1000 pixels are comfortably handled by dtmslice on a MacBook Pro or a Linux workstation.

When you open a file through one of its prebinned versions, the models are annotated at the right scale (i.e., in voxel coordinates of the original tomogram).
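The rescaling of annotations can be pictured with the following Python sketch. It assumes a pure scaling by 2 to the power of the binning factor and ignores any sub-voxel offset Dynamo may apply; the function name is hypothetical and the snippet is not Dynamo code.

```python
def to_original_coordinates(point, factor):
    """Map a point clicked in a prebinned tomogram to original voxel units."""
    scale = 2 ** factor
    # Assumption: plain scaling, with no half-voxel shift between grids.
    return tuple(coordinate * scale for coordinate in point)


print(to_original_coordinates((250, 100.5, 40), 2))  # (1000, 402.0, 160)
```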

Through the catalogue

Right-click on the index field of the tomogram inside the dcm GUI. If prebinned versions are available, they will be listed. You can then click on the open with tmslice option for the binning factor that you want.

Opening the binning options

From the command line

You can open a tomogram using the flag 'pb' or 'prebinned' to specify that you want the prebinned version. If it is not available, it will be computed on the fly:

dtmslice myTomogram.mrc -pb 2