DBox folder

Latest revision as of 14:45, 12 October 2017

The dBox folder is an alternative way to store particle files. When you have, say, 100,000 files, it is a bad idea to put all of them in the same folder.

Instead of leaving all particles in a single data folder, the dBox folder contains a subdirectory hierarchy that distributes the particles across different subfolders. This happens in a way that is fully transparent to the user.

Dynamo uses a class called dBoxes to manage this kind of generic data container. Check the different command line options with:

help dBoxes

Convert a normal data folder into a dBox folder

d = dBoxes.convertSimpleData(<foldername>,<dBoxes folder name>);

Internal structure of the dBoxes folder

The contents of a dBoxes folder are:

  • a file called settings.card
  • a file called tags.em
  • several folders called batch_N

Settings file

A plain text file that can be edited by hand. It defines several properties, mainly the padding convention used for the particle tags and the batch size.
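To illustrate what a padding convention means, here is a minimal sketch in plain MATLAB. It assumes that particle files are named particle_<tag>.em, with the tag zero-padded to the number of digits given by padding, as in ordinary Dynamo data folders. The helper name particleFileName is hypothetical and is not part of the dBoxes API.

```matlab
padding = 7;   % example value for the padding property

% Build a format string such as 'particle_%07d.em' (assumed naming pattern).
fmt = sprintf('particle_%%0%dd.em', padding);

% Hypothetical helper: file name for a given particle tag.
particleFileName = @(tag) sprintf(fmt, tag);

disp(particleFileName(42));   % prints particle_0000042.em
```

A larger padding value leaves room for more particles before the tag no longer fits in the fixed-width field.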

Tags file

A binary file storing all tags currently included in the object.

Batch folders

Each batch folder contains at most the number of particles defined by the batch property of the dBoxes object, as recorded in the settings card. The batch folders are named batch_N, where N is a multiple of the batch property. As an exception, batch folder 0 stores the particles whose tags range between 1 and batch-1.
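The rule above can be sketched in plain MATLAB. This is only an illustration under stated assumptions: it supposes that a particle with a given tag lives in the folder whose N is the largest multiple of batch not exceeding the tag, and that folder names are written as batch_ followed by N without extra padding. The helper names batchStart and batchFolder are hypothetical; the real dBoxes class may format names differently.

```matlab
batchSize = 200;   % example value for the batch property

% Largest multiple of batchSize that does not exceed the tag.
batchStart = @(tag) floor(tag / batchSize) * batchSize;

% Assumed folder naming: 'batch_' followed by that multiple.
batchFolder = @(tag) sprintf('batch_%d', batchStart(tag));

disp(batchFolder(1));     % prints batch_0
disp(batchFolder(199));   % prints batch_0
disp(batchFolder(200));   % prints batch_200
disp(batchFolder(450));   % prints batch_400
```

Under this reading, batch folder 0 holds one particle fewer than the others, because tags start at 1 rather than 0.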

Merge several data folders into a single dBox folder

If you cannot first merge all the data folders into a single ordinary data folder, you will have to enter the particles one by one using dBoxes.enterParticle; an example is shown below.

Suppose that you want to merge the data folders listed in a cell array, each one associated with a table that contains the same number of particles as its folder. You can generate an example with the tutorial tool:

% create several tutorial folders
for i = 1:10
    testFolder    = ['testfolder', num2str(i)];
    dtutorial(testFolder, 'M', 100);
    dataFolder{i} = [testFolder, '/data'];
    tbl{i}        = [testFolder, '/real.tbl'];
end

or use your own cell arrays dataFolder and tbl.

% create a dBoxes folder
d = dBoxes('new', 'data.Boxes');
d.padding = 7;
d.batch = 200;  % each subfolder will have only 200 particles

% update the representation of the object on disk
d.updateSettingsField('padding',7);
d.updateSettingsField('batch',200);

% create an empty table that will index all particles in the final dBoxes folder;
% tbl{end} is a file name, so the table is read to get its number of columns
globalTable = zeros(0,size(dread(tbl{end}),2));

disp('Merging folders.');
timeStart = clock();
for i=1:length(dataFolder)
    
    f = ddinfo(dataFolder{i},'v',0);
    tableForFolder = dread(tbl{i}); 
    
    % ensures the individual tables are sorted in ascending order of the
    % tags
    tableForFolder = dynamo_table_sort(tableForFolder);
    
    for itag = 1:f.N
        
        % file of particle in original folder
        file = dynamo_tag2file(f.tags(itag),f,0);
        
        %disp(file);
        
        % the file is transferred as copy, without passing the 
        % particle map into memory
        d.enterParticle(file,'directcopy',1);
        
    end
    
    % rearrange the tags in the table corresponding to a single folder
    tableForFolder(:,1) = [(size(globalTable,1)+1):(size(globalTable,1)+f.N)]';
    
    globalTable = [globalTable;tableForFolder];
    
    disp(sprintf('finished merging folder %d: %s',i,dataFolder{i}));
end
timeFinish = etime(clock(),timeStart);
disp('done merging. Averaging step.');

%%
o = daverage('data.Boxes','t',globalTable);

disp(sprintf('Seconds for creation of dBoxes %f',timeFinish));
disp('Show average');
dview(o);
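The tag renumbering inside the loop can be tried in isolation, without Dynamo. The sketch below uses made-up two-column tables in place of real Dynamo tables (real tables have more columns) and assumes that enterParticle assigns consecutive tags in order of entry, which the script above appears to rely on.

```matlab
% Two toy tables: column 1 holds the original tags, column 2 is dummy data.
tables = {[5 0; 9 0; 12 0], [2 1; 3 1; 7 1]};

globalTable = zeros(0, 2);
for i = 1:numel(tables)
    t      = tables{i};
    n      = size(t, 1);            % particles in this folder
    offset = size(globalTable, 1);  % particles already merged

    % Replace the original tags by consecutive tags in the merged set.
    t(:, 1) = (offset + 1 : offset + n)';

    globalTable = [globalTable; t];
end

disp(globalTable(:, 1)');   % merged tags: 1 2 3 4 5 6
```

The original tags (5, 9, 12 and 2, 3, 7) are discarded, so the rows of each table must be in the same order in which the particles were entered. This is why the script sorts each table by tag before the inner loop.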