Difference between revisions of "Tips for management of tomographic data sets"

Revision as of 13:42, 28 April 2017

The guidelines sketched here are a walkthrough on how we organize our tomographic data sets in the BioEM lab of the University of Basel.

We use a series of conventions for the storage of the raw data (i.e., the tilt series) that make it easy to keep track of the processing steps already performed (alignment, reconstruction), the location of the intermediate results, and their binding into a Dynamo catalogue. The idea is to keep a uniform naming convention that lets you comfortably locate the files and folders you need with simple Linux commands.

In this example, we assume that you are using imod for tomographic alignment and reconstruction of each tilt series. The raw data is stored in a tree below a repository folder, which will contain all the data (tilt series and tomograms) related to the tomography projects. The repository folder is created with the structure:


<repository>/data
<repository>/ctf

Here, <repository> is some location in your file system, probably a file share intended for mass storage (several terabytes at least). We often literally call this location 'repository'.
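The two top-level folders can be created in one step. This is only a minimal sketch: the variable REPO and its default value (a folder named repository in the current directory) are assumptions made for illustration, so point it at your own storage location.

```shell
# Assumed location of the repository; export REPO beforehand to override it.
REPO="${REPO:-$PWD/repository}"

# Create the two top-level folders (does nothing if they already exist).
mkdir -p "$REPO/data" "$REPO/ctf"
```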

Organizing the tilt series stacks

Batches

We organize our tilt series in batches. A batch customarily groups tilt series acquired under the same conditions or in the same session, but there is no strict rule. Each batch has its own folder under the data folder of the repository. The convention for the batch folder name is a 'b' character followed by an integer (zero-padded to three digits).

<repository>/data/b001
<repository>/data/b002
<repository>/data/b015
...

Note that the batch numbers don't need to be consecutive.

Tilt series folder

Inside each batch folder we create a separate folder for each tilt series. The convention for the tilt series folder name is the characters 'ts' followed by an integer (zero-padded to three digits), for instance

<repository>/data/b001/ts012
<repository>/data/b001/ts016
...

Inside each tilt series folder we typically define the folders raw and imod:

<repository>/data/bXXX/tsYYY/raw
<repository>/data/bXXX/tsYYY/imod
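The folder convention above can be scripted so that the zero padding is always applied consistently. The following is a sketch under stated assumptions: the helper name make_ts_dir and the REPO variable are hypothetical and not part of Dynamo or imod.

```shell
REPO="${REPO:-$PWD/repository}"   # assumed repository location

# make_ts_dir BATCH TS: create the raw and imod folders for one tilt series.
# Pass plain integers without leading zeros (printf would read 013 as octal).
make_ts_dir() {
    batch=$(printf 'b%03d' "$1")
    ts=$(printf 'ts%03d' "$2")
    mkdir -p "$REPO/data/$batch/$ts/raw" "$REPO/data/$batch/$ts/imod"
    # Report the tilt series folder that was created.
    printf '%s\n' "$REPO/data/$batch/$ts"
}

make_ts_dir 2 13   # creates and prints <repository>/data/b002/ts013
```

Since batch and tilt series numbers do not need to be consecutive, the helper simply takes whatever numbers you assign.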

The contents of the raw folder may vary from project to project. They might be unaligned movies, gain reference files, or any intermediate files. Our acquisition system Focus delivers for each tilt series:

  • a tilt series file, i.e., a stack file in mrc format (with the .mrcs extension).
  • a metadata file in .star format.
  • a .tlt file with the values of the tilt angles of each micrograph contained in the stack.

When we use imod for reconstruction, we immediately transfer the stack to the imod folder. By convention, the file name is built from the batch and tilt series numbers, and the extension is changed to .st:

<repository>/data/b002/ts013/imod/b002ts013.st
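The renaming can be derived from the folder names alone, which avoids typing the batch and tilt series numbers twice. This is a hedged sketch: the helper stage_stack is hypothetical, and copying (rather than moving) the stack is an assumption, chosen so that the raw folder stays untouched.

```shell
# stage_stack STACK TS_DIR: copy a raw stack into TS_DIR/imod as bXXXtsYYY.st.
# TS_DIR is a tilt series folder such as <repository>/data/b002/ts013.
stage_stack() {
    stack=$1
    ts_dir=${2%/}                              # drop a trailing slash, if any
    ts=$(basename "$ts_dir")                   # e.g. ts013
    batch=$(basename "$(dirname "$ts_dir")")   # e.g. b002
    target="$ts_dir/imod/${batch}${ts}.st"

    # Refuse to overwrite a stack that has already been staged.
    if [ -e "$target" ]; then
        echo "already exists: $target" >&2
        return 1
    fi

    mkdir -p "$ts_dir/imod"
    cp "$stack" "$target" && printf '%s\n' "$target"
}
```

A call would take the form stage_stack followed by the path of the stack delivered by the acquisition system and the tilt series folder; on success it prints the path of the new .st file.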

Creating the reconstructions

After using etomo in the folder that contains the tilt series, you should end up with a full-sized tomogram in the corresponding imod folder. We name this tomogram bXXXtsYYYFull.mrc, for instance

<repository>/data/b002/ts013/imod/b002ts013Full.mrc

Reconstruction Post processing

Importantly, we usually want tomograms in which z represents the direction of the electron beam and y represents the direction of the tilt axis. This usually requires flipping the tomogram. You can do this directly in the etomo GUI, with the imod command line, with Dynamo commands, or later in the Dynamo GUI.

In this walkthrough, let's assume that the files

<repository>/data/bXXX/tsYYY/imod/bXXXtsYYYFull.mrc

have already been flipped before entering them into the Dynamo catalogue that we will generate later.

Creating a catalogue

With the convention that we have used so far, you can use the Linux command

ls -d <repository>/data/b*/ts*/imod/*Full.mrc

to list your available tomograms on screen, each with its absolute path (provided that <repository> is written as an absolute path).

Then, to create a catalogue, you just need to write this list into a text file (note that >> appends to an existing list.vll):

ls -d <repository>/data/b*/ts*/imod/*Full.mrc >> list.vll

The list.vll we just created uses the most basic syntax of a volume list file, but we can already use it to generate a catalogue of tomograms. Inside Matlab (or the standalone), just type

dcm -create -fromvll list.vll

and a catalogue will be generated in the current location.
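Because the list is built by appending to list.vll, running the listing command twice leaves duplicate entries, and paths can go stale if tomograms are moved. Before creating a catalogue it may be worth checking the file. The helper below is a hypothetical sketch, not a Dynamo tool, and it assumes the basic one-path-per-line syntax described above.

```shell
# check_vll FILE: verify that every non-empty line of a basic volume list
# names an existing file and that no path is listed twice.
check_vll() {
    vll=$1
    status=0

    # Report every listed path that does not exist as a regular file.
    while IFS= read -r path; do
        [ -z "$path" ] && continue
        [ -f "$path" ] || { echo "missing: $path" >&2; status=1; }
    done < "$vll"

    # Report paths that appear more than once (blank lines are ignored).
    dups=$(grep -v '^$' "$vll" | sort | uniq -d)
    if [ -n "$dups" ]; then
        echo "duplicate entries:" >&2
        echo "$dups" >&2
        status=1
    fi

    return "$status"
}
```

Running check_vll list.vll returns success only when the list is clean; otherwise the offending paths are printed to standard error.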


Prebinning of catalogued tomograms

Visualizing the tomograms

Gallery of tomograms

Individual tomograms

You need to access each tomogram individually.