Binning tomograms

Latest revision as of 09:49, 20 May 2016

Tomograms can be binned in different ways from the command line.

Loading the full tomogram into memory

This is the simplest approach: read the full tomogram file into memory, bin it, and write the result to a file. This is useful for small tomograms, or when working on machines with large memory, but it will probably not work when the tomogram is too big to fit into memory.

For instance, the command:

 dread myVolume.mrc a 

reads the file myVolume.mrc into the variable a. For a volume of 4k x 4k x 400 voxels, this takes 6 minutes on a MacBookPro (i5), during which the computer is close to frozen.

After reading the full tomogram, you still need to bin it. You can do this with the command dbin:
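The following back-of-the-envelope estimate suggests why the machine struggles with such a volume. It is only a sketch: it assumes that "4k" means 4096 and that each voxel is held as a 4-byte single-precision value, neither of which is stated above. It uses only core MATLAB arithmetic, no Dynamo functions.

```matlab
% Rough memory footprint of a 4k x 4k x 400 tomogram held fully in RAM.
% Assumptions (not from the text above): 4k = 4096, 4 bytes per voxel.
nx = 4096;
ny = 4096;
nz = 400;
bytesPerVoxel = 4;                    % single precision; 8 for double

totalBytes = nx * ny * nz * bytesPerVoxel;
totalGiB   = totalBytes / 2^30;       % convert bytes to GiB

fprintf('Volume needs about %.1f GiB of RAM\n', totalGiB);
```

Under these assumptions the volume alone occupies 25 GiB, before any working copies are made, which is more than a typical laptop can hold in memory.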

 b=dbin(a,2); 

... which will take a further 12 minutes to complete. You can then write the result with:

 dwrite(b,'binned.mrc'); 

which will probably take less than one second, as b is 64 times smaller than a.
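The factor of 64 is consistent with a binning level of 2 shrinking each of the three dimensions by 2^2 = 4, since 4 x 4 x 4 = 64. The sketch below illustrates that arithmetic with plain block averaging on a small random volume. It is an illustration in core MATLAB only: the averaging scheme and the interpretation of the level as a power of two are assumptions, and dbin may work differently internally.

```matlab
% Illustrative block-average binning (not Dynamo's implementation).
% Assumption: binning level n reduces each dimension by a factor 2^n.
a = rand(16, 16, 8, 'single');        % small stand-in for a tomogram
level = 2;
f = 2^level;                          % shrink factor per dimension (4)

sz = size(a);                         % each dimension must be divisible by f
% Split every axis into (position inside block, block index).
r = reshape(a, f, sz(1)/f, f, sz(2)/f, f, sz(3)/f);
% Average over the three "inside block" dimensions.
b = squeeze(mean(mean(mean(r, 1), 3), 5));

reduction = numel(a) / numel(b);      % voxel-count ratio
fprintf('b is %d x %d x %d, %d times smaller than a\n', ...
        size(b, 1), size(b, 2), size(b, 3), reduction);
```

Here a 16 x 16 x 8 volume becomes 4 x 4 x 2, that is, 64 times fewer voxels.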

Creating chunks

An alternative is to split the tomogram into several chunks that are read, binned and stored independently. The final (binned) volume is assembled after the different chunks have been processed separately. This happens transparently for the user.

An example of use is:

 o = dpktomo.partition.bin('myFile.mrc',2,'of','binned.mrc','slabSize',100);

which needed 3 minutes for the example volume above.

Here, slabSize determines that myFile.mrc will be read in vertical slabs, each 100 pixels thick, and of determines the output file. This procedure can be accelerated using the flag mw to engage several Matlab workers in parallel. However, this only makes sense if all the slabs that are held in memory at the same time fit together in the RAM of the machine.

The extension of the output file needs to be mrc. The target precision can be chosen with the flag type (it can be set to float or double).
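As a rough sketch of the bookkeeping involved, the snippet below computes the slab boundaries for a slab size of 100 and estimates the peak memory when several slabs are loaded at once. It rests on assumptions not stated above: that slabs are cut along the 400-pixel axis, that each slab spans the full 4096 x 4096 extent of the other two axes, that voxels take 4 bytes, and that each worker holds exactly one slab. The variable names are hypothetical and no Dynamo function is called.

```matlab
% Slab boundaries and peak memory for parallel slab processing.
% All sizes are assumptions chosen to match the 4k x 4k x 400 example.
axisLength    = 400;                  % length of the axis being cut
slabSize      = 100;                  % thickness of each slab in pixels
slabPlane     = [4096 4096];          % assumed extent of the other two axes
bytesPerVoxel = 4;                    % single precision
nWorkers      = 2;                    % hypothetical number of parallel workers

starts = 1:slabSize:axisLength;                     % first slice of each slab
stops  = min(starts + slabSize - 1, axisLength);    % last slice of each slab

bytesPerSlab = prod(slabPlane) * slabSize * bytesPerVoxel;
peakGiB      = nWorkers * bytesPerSlab / 2^30;      % one slab per worker

for k = 1:numel(starts)
    fprintf('slab %d: slices %d to %d\n', k, starts(k), stops(k));
end
fprintf('Peak memory with %d workers: about %.2f GiB\n', nWorkers, peakGiB);
```

With these numbers there are four slabs of 6.25 GiB each, so two workers would need about 12.5 GiB simultaneously. If that exceeds the available RAM, adding workers is unlikely to help.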

Controlling the size of the chunk

You can use the flag mb to set a maximum size in megabytes for each slab. This is a convenient way to create a chunk partition, as it does not require knowing the size of the original tomogram in advance.
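To give a feeling for what a size limit means in practice, the sketch below converts a limit in megabytes into a slab thickness. It is an illustration only: it assumes a 4096 x 4096 slice, 4 bytes per voxel, and that one megabyte means 2^20 bytes. How the mb flag actually derives the slab size is not documented here.

```matlab
% How many slices fit into a slab of at most mbLimit megabytes?
% Assumptions: 4096 x 4096 slices, 4-byte voxels, 1 MB = 2^20 bytes.
mbLimit       = 1000;                 % hypothetical limit passed as 'mb'
nx            = 4096;
ny            = 4096;
nz            = 400;                  % number of slices in the tomogram
bytesPerVoxel = 4;

bytesPerSlice = nx * ny * bytesPerVoxel;
slicesPerSlab = floor(mbLimit * 2^20 / bytesPerSlice);  % thickness that fits
nSlabs        = ceil(nz / slicesPerSlab);               % slabs needed overall

fprintf('%d slices per slab, %d slabs in total\n', slicesPerSlab, nSlabs);
```

Under these assumptions a 1000 MB limit allows 15 slices per slab, so the 400-slice tomogram would be processed in 27 slabs.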

Binning tomograms for catalogue

The Catalogue uses a specific convention that allows one or several binned versions to be stored for each catalogued tomogram. This allows quicker visualization. You can create them directly with the dpktomo.partition.bin command, by passing the option cat to the output file flag of:

 o = dpktomo.partition.bin('myFile.mrc',2,'of','cat','mb',1000);