Binning tomograms

From Dynamo
Revision as of 09:49, 20 May 2016 by Daniel Castaño (talk | contribs) (→‎Creating chunks)

Tomograms can be binned in different ways from the command line.

Loading the full tomogram into memory

This is the simplest approach: read the full tomogram file into memory, bin it, and write the result to a file. This is useful for small tomograms, or when working on machines with a large amount of memory, but it will probably not work when the tomogram is too big to fit into memory.

For instance, the command:

 dread myVolume.mrc a 

reads the file myVolume.mrc into the variable a. For a volume of 4k x 4k x 400, this takes 6 minutes on a MacBook Pro (i5), during which the computer is close to unresponsive.
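Whether a volume of this size fits into memory can be estimated before attempting to read it. The following is a minimal sketch in plain Matlab, not a Dynamo function: it assumes that "4k" means 4096 voxels and that the volume is held in single precision (4 bytes per voxel). Both values are assumptions and should be adapted to the actual tomogram.

```matlab
% Rough memory estimate for a full tomogram held in memory.
% Dimensions and data type are assumptions, not values reported by Dynamo.
volumeSize    = [4096 4096 400];   % voxels along x, y, z (assumed)
bytesPerVoxel = 4;                 % 4 for single precision, 8 for double

numVoxels = prod(volumeSize);
sizeInGB  = numVoxels * bytesPerVoxel / 2^30;

fprintf('Estimated size in memory: %.1f GB\n', sizeInGB);
```

If the estimate approaches or exceeds the available RAM, reading the full tomogram is unlikely to be practical.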

After reading the full tomogram, you still need to bin it. You can do this with the command dbin, which takes a further 12 minutes to complete.

You can then write the binned result b to a file, which will probably take less than one second, as b is 64 times smaller than a.
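The exact calls are not shown above. The commented lines in the sketch below are an assumption based on Dynamo's short command names (dbin taking a volume and a binning level, dwrite taking a volume and a file name), and binned.mrc is a placeholder name. The executable part only works through the arithmetic behind the factor of 64, assuming that each binning level halves every axis.

```matlab
% Assumed form of the Dynamo calls (not confirmed by this page):
%   b = dbin(a, 2);            % binning level 2, i.e. two halvings per axis
%   dwrite(b, 'binned.mrc');   % 'binned.mrc' is a placeholder file name

binningLevel  = 2;                    % assumed level used in the example
factorPerAxis = 2^binningLevel;       % each level halves every axis
reduction     = factorPerAxis^3;      % reduction in the number of voxels

fprintf('Each axis shrinks by %d, the volume by %d\n', factorPerAxis, reduction);
```

Under these assumptions a 4k x 4k x 400 volume becomes roughly 1k x 1k x 100, which is why writing it is so much faster than reading the original.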

Creating chunks

An alternative way is to create several chunks that are read, binned and stored independently. The final (binned) volume is integrated after the different chunks have been processed separately. This happens transparently for the users.

An example of use is:

 o = dpktomo.partition.bin('myFile.mrc',2,'of','binned.mrc','slabSize',100);

which needed 3 minutes for the example above.
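The idea behind the chunked approach can be illustrated without Dynamo. The sketch below is plain Matlab and is not how dpktomo.partition.bin is implemented: it assumes that binning means averaging over non-overlapping blocks, that slabs are cut along the first axis, and that the slab thickness is a multiple of the binning factor. It bins a small synthetic volume slab by slab and checks that the result matches binning the whole volume at once.

```matlab
f        = 4;                               % binning factor per axis (assumed)
vol      = reshape(1:16*8*8, 16, 8, 8);     % small synthetic stand-in volume
slabSize = 8;                               % slab thickness along x, multiple of f

% Average over non-overlapping f x f x f blocks of a volume v.
binBlock = @(v) reshape( ...
    mean(mean(mean( ...
        reshape(v, f, size(v,1)/f, f, size(v,2)/f, f, size(v,3)/f), ...
        1), 3), 5), ...
    [size(v,1), size(v,2), size(v,3)] / f);

nx     = size(vol, 1);
binned = zeros([nx, size(vol,2), size(vol,3)] / f);
for xStart = 1:slabSize:nx
    xEnd = min(xStart + slabSize - 1, nx);
    slab = vol(xStart:xEnd, :, :);          % a real run would read this from disk
    xOut = (xStart - 1)/f + 1 : xEnd/f;     % where the binned slab belongs
    binned(xOut, :, :) = binBlock(slab);
end

% Compare against binning the full volume in one go.
reference = binBlock(vol);
maxDiff   = max(abs(binned(:) - reference(:)));
```

Only one slab needs to be in memory at a time, which is what makes the approach usable for tomograms that do not fit into RAM.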

Here, slabSize specifies that myFile.mrc will be read in vertical slabs, each 100 pixels thick, and of specifies the output file. The procedure can be accelerated by using the flag mw to engage several Matlab workers in parallel. However, this only makes sense if all the slabs held in memory at any given time fit together in the RAM of the machine.
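A quick way to judge whether parallel workers are worthwhile is to multiply the size of one slab by the number of workers. The sketch below is plain Matlab with hypothetical values: the volume dimensions, the data type, the number of workers and the available RAM are all assumptions, and slabs are assumed to be cut along the first axis. It counts only the input slabs, so it is a lower bound on the memory actually needed.

```matlab
volumeSize    = [4096 4096 400];   % voxels along x, y, z (assumed)
slabSize      = 100;               % slab thickness along x, as in the example
bytesPerVoxel = 4;                 % single precision (assumed)
numWorkers    = 4;                 % hypothetical value passed to 'mw'
availableGB   = 16;                % hypothetical RAM of the machine

slabGB  = slabSize * volumeSize(2) * volumeSize(3) * bytesPerVoxel / 2^30;
totalGB = slabGB * numWorkers;     % all workers holding one slab each
fits    = totalGB <= availableGB;

fprintf('One slab: %.2f GB, %d workers: %.2f GB, fits: %d\n', ...
        slabGB, numWorkers, totalGB, fits);
```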

The extension of the output file needs to be mrc. The target precision can be chosen with the flag type (i.e., it can be set to float or double).

Controlling the size of the chunk

You can use the flag mb to set a maximum size in megabytes for each slab. This is a convenient way to create a chunk partition, as it does not require knowing the size of the original tomogram in advance.
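To give a feeling for what such a size limit implies, the sketch below converts a megabyte budget into a slab thickness. It is plain Matlab and only an illustration: how Dynamo derives the partition from mb is not documented here, and the dimensions, data type, slab axis (first axis), binning factor and the reading of one megabyte as 2^20 bytes are all assumptions.

```matlab
volumeSize    = [4096 4096 400];   % voxels along x, y, z (assumed)
bytesPerVoxel = 4;                 % single precision (assumed)
maxMB         = 1000;              % budget per slab, as passed to 'mb'
binFactor     = 4;                 % binning factor per axis (assumed)

% Bytes taken by a slab that is one voxel thick along x.
bytesPerSection = volumeSize(2) * volumeSize(3) * bytesPerVoxel;

% Thickest slab within the budget, rounded down to a multiple of the
% binning factor so that each slab can be binned independently.
maxThickness  = floor(maxMB * 2^20 / bytesPerSection);
slabThickness = max(binFactor, binFactor * floor(maxThickness / binFactor));
numSlabs      = ceil(volumeSize(1) / slabThickness);

fprintf('Slab thickness: %d voxels, number of slabs: %d\n', ...
        slabThickness, numSlabs);
```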

Binning tomograms for catalogue

The Catalogue uses a specific convention that allows storing one or several binned versions of each catalogued tomogram, which makes visualization quicker. You can create them directly with the dpktomo.partition.bin command by passing cat as the value of the output file flag of:

 o = dpktomo.partition.bin('myFile.mrc',2,'of','cat','mb',1000);