Difference between revisions of "Particle File List"

 
Latest revision as of 12:01, 30 November 2020

The Particle File List is a generic data container that can be used as an alternative to classic Dynamo data folders.

This object contains a list of tags and a list of particle files, both of equal length. The tag of a file is therefore defined by the corresponding value of the tag property inside the object, not by the name of the particle file (unlike in a classic Dynamo data folder).

The ParticleListFile object is just a wrapper around a list of files: the particle files still need to be stored somewhere. They can be stored in any Dynamo storage folder (a classical Dynamo data folder or a dBox folder), but this is not required. Any set of filenames can be used, as long as the files exist and contain particles of the same size.

When to use it

This data container is especially useful when your total data set comprises sets of particles that have been cropped and processed separately. For instance, you may have a system to align all particles of a single model (filament, vesicle, etc.) in a tomogram, so you extract the particles of each model into its own data folder. This is a sensible approach, as it allows you to "play" with particles of separate models comfortably. In this setting, you end up with many particle folders (and tables) that cannot be used for common processing without further formatting. A similar situation arises when you start analysing your data set and include new tomograms and models at a later stage.

The classical ways to deal with this situation would be:

* recrop all particles from all tomograms into a new, single data folder, or
* use dynamo_data_merge to fuse the existing data folders into a single one.

Both approaches work, but they are prone to errors, especially when the metadata (i.e. tables) of the individual data folders must be reformatted as well.

The particle list star file

An object is something that lives in memory and can thus be operated upon. In most Dynamo commands you can use either the object "alive" in memory or its representation on disk. An object of type ParticleListFile can be saved to disk as a star file. For instance, if you have an object plf of this class in memory, the command:

plf.writeFile('test.star')

will create a star file with this structure:

data_
loop_
_tag
_particleFile
1  <absolute path>/particle_00008.em
2  <absolute path>/particle_00009.em
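
As an illustration of this layout, the following sketch writes and parses such a file using only base MATLAB functions, without any Dynamo call. The particle paths and the temporary file name are made up for the example, and the parser assumes one tag and one path without spaces per row. It is not the parser that Dynamo itself uses.

```matlab
% Sketch only: base MATLAB, no Dynamo calls. Paths below are hypothetical.
tags  = [1; 2];
files = {'/data/set1/particle_00008.em'; '/data/set1/particle_00009.em'};
starName = [tempname(), '.star'];

% Write the header followed by one "tag  file" row per particle.
fid = fopen(starName, 'w');
fprintf(fid, 'data_\nloop_\n_tag\n_particleFile\n');
for k = 1:numel(tags)
    fprintf(fid, '%d  %s\n', tags(k), files{k});
end
fclose(fid);

% Read the file back line by line.
readTags  = [];
readFiles = {};
fid = fopen(starName, 'r');
txt = fgetl(fid);
while ischar(txt)
    txt = strtrim(txt);
    % Skip blank lines and header lines ("data_", "loop_", "_label").
    isHeader = isempty(txt) || txt(1) == '_' || ...
               strncmp(txt, 'data_', 5) || strncmp(txt, 'loop_', 5);
    if ~isHeader
        % Each data row: an integer tag, whitespace, a path without spaces.
        tok = regexp(txt, '^(\d+)\s+(\S+)$', 'tokens', 'once');
        readTags(end+1, 1)  = str2double(tok{1});
        readFiles{end+1, 1} = tok{2};
    end
    txt = fgetl(fid);
end
fclose(fid);
delete(starName);
```

After running it, readTags and readFiles hold the same values as tags and files, which shows that the tag of each particle comes from the first column and not from the file name.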

Conversely, if you have a star file on disk, you can create a ParticleListFile object from it by first initializing the object:

plfNew = dpkdata.containers.ParticleListFile();

where the name plfNew is arbitrary. You can check what is inside by typing the name of the variable you chose (in this case plfNew) without a semicolon at the end, i.e.:

>> plfNew

plfNew = 

  ParticleListFile with properties:

                   tag: []
          particleFile: []
                 facts: []
                source: []
                  type: []
              metadata: [1×1 dpkdata.aux.metadata.StarTable]
    why_data_not_valid: 'not initialized'

Then fill it from the star file:

 plfNew.fillFromStarFile('test.star');

Both steps (creating the object and reading its contents) can be done in a single call:

 plfNew = dpkdata.containers.ParticleListFile.read('test.star');

Now, the tag and particleFile properties should be filled:

 >> plfNew

plfNew = 

  ParticleListFile with properties:

                   tag: [24×1 double]
          particleFile: [24×1 string]
                 facts: [1×1 struct]
                source: []
                  type: []
              metadata: [1×1 dpkdata.aux.metadata.StarTable]
    why_data_not_valid: []

Converting old data folders

Converting one single data folder

You can get a file representing a ParticleListFile object simply with the command:

 dpkdata.containers.ParticleListFile.data2starFile(inputDataFolder, outputStarFile);

This file can be used as input for average or an alignment project. You can use a table flag in order to pass a table. With this format, the tags of the particles are not changed.

Merging several data folders

The command for this functionality is:

dpkdata.containers.ParticleListFile.mergeDataFolders

which takes a cell array of data folder names. You can also pass a cell array of tables (one per data folder) using the table flag (written 'tables' in the example script below).
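
Since separate data folders often reuse the same tags, a merge has to give every particle a new, unique tag. The sketch below shows one possible scheme, sequential renumbering, on toy tables in base MATLAB. It is a hypothetical illustration and not the code of mergeDataFolders. It assumes that column 1 of each table holds the particle tag, and the three-column toy tables and the variable names are invented for the example.

```matlab
% Hypothetical sketch: NOT the implementation of mergeDataFolders.
% Assumption: column 1 of each table holds the particle tag.
nFolders   = 3;
nPerFolder = 8;
tables = cell(1, nFolders);
for i = 1:nFolders
    t = zeros(nPerFolder, 3);        % toy table with only 3 columns
    t(:, 1) = (1:nPerFolder)';       % every folder reuses tags 1..8
    tables{i} = t;
end

merged = zeros(0, 3);                % merged table with renumbered tags
origin = zeros(0, 2);                % [folder index, original tag] per new tag
for i = 1:nFolders
    t = tables{i};
    n = size(t, 1);
    newTags = size(merged, 1) + (1:n)';          % continue after the last tag
    origin  = [origin; repmat(i, n, 1), t(:, 1)]; % remember where each row came from
    t(:, 1) = newTags;
    merged  = [merged; t];
end
```

Here merged(:,1) runs from 1 to 24 without repetition, and each row of origin records which folder and which original tag a new tag came from.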

Example of use

The script below is included in your Dynamo distribution as

dpkdata.examples.setOfDataFoldersIntoParticleListFile.

It creates three classical data folders, each with its own 8 particles and a table that refers to them. This table contains the actual alignment parameters of the particles in the respective data folder. All particles of all folders are indexed into a single new object that keeps track of data folders and tables, renaming the tags coherently. Note that the initial particle files are not moved or copied; the new object just references them. In this example, the object is fed into average, but it can also be used as input for a project (after saving the object into a star file).


% create three different data folders
N = 3;
testNameRoot = 'testFolder';
for i=1:N
   testName{i} = sprintf('%s%03d',testNameRoot,i);
   
   % each folder has the same particle tags (1 to 8), but the same 
   % tag on each folder is a different particle
   dynamo_tutorial(testName{i},'real_random',1,'linear_tags',1); 
end

%% 
% keeps the positions of data and tables
for i=1:N   
   dataname{i}  = mbparse.files.absolutePath.getAbsolutePath(fullfile(testName{i},'data'));
   tableName{i} = mbparse.files.absolutePath.getAbsolutePath(fullfile(testName{i},'real.tbl'));
end

%
% merges all data folders into one single ParticleListFile object
%

% we can pass a table attached to each data folder
plf = dpkdata.containers.ParticleListFile.mergeDataFolders(dataname,'tables',tableName);

% The storage object keeps track of the accumulated metadata

% gets a table of the old Dynamo style (just a matrix of numbers)
tMerged = plf.metadata.table.getClassicalTable();

% computes the average in the classical way
ws = daverage(plf,'t',tMerged); 

% creates a star file that can be used as a data container in any Dynamo program (alignment projects, classification projects, etc.)
plf.writeFile('particles.star');

Metadata

For advanced users, this object includes a metadata property, allowing you to keep data and metadata in one single object. This convenience should be used with caution, as by default Dynamo always gives precedence to the metadata of an external table. For instance:

average(plf,'table',externalTable);

will average the particles referenced by plf using the alignment parameters in the table externalTable, and not those in the table contained in the internal metadata of plf.

Classical Dynamo tables

Classical tables can be set into the ParticleListFile with:

plf.setTable(<classical table>);

where the table <classical table> must refer to the same tags as the object plf. Conversely, you can extract a classical Dynamo table with:

tNumeric = plf.metadata.table.getClassicalTable();
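
Before setting a classical table, it can help to check that it refers to the same tags as the object. The sketch below does this in base MATLAB on stand-in data: plfTags plays the role of the tag property of plf, and the toy table assumes that column 1 holds the tags. Whether the rows must also be in the same order is not stated here, so the check compares only the sets of tags.

```matlab
% Sketch with stand-in data, no Dynamo calls.
% Assumptions: plfTags mimics plf.tag; column 1 of a table holds the tags.
plfTags = (1:5)';

tbl = zeros(5, 3);                    % toy table with only 3 columns
tbl(:, 1) = [3; 1; 2; 5; 4];          % same tags as plfTags, different order

% True when both refer to exactly the same set of tags.
sameTags = isequal(sort(plfTags(:)), sort(tbl(:, 1)));

% A table with one row removed no longer covers every tag.
badTbl  = tbl(1:4, :);
missing = setdiff(plfTags, badTbl(:, 1));   % tags in plf absent from the table
```

In this toy case sameTags is true for tbl, while missing lists the single tag that badTbl fails to cover.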