DBox folder

Latest revision as of 14:45, 12 October 2017

The dBox folder is an alternative way to store particle files. When you have, say, 100K files, it is a bad idea to put all of them in the same folder.

Instead of leaving all particles in the same data folder, the dBox folder contains a subdirectory hierarchy that spreads the particles over different subfolders. This is fully transparent to the user.

Dynamo uses a class called dBoxes to manage this kind of generic data container. Check the different command line options with:

help dBoxes

Convert a normal data folder into a dBox folder

d = dBoxes.convertSimpleData(<foldername>,<dBoxes folder name>);

Internal structure of the dBoxes folder

The contents of a dBoxes folder are:

  • a file called settings.card
  • a file called tags.em
  • Several folders called batch_NN

Settings file

A plain text file that can be edited by hand. It defines several properties, chiefly the padding convention used for the tags of the particles and the size of the batch.

Tags file

A binary file storing all tags currently included in the object.

Batch folders

Each batch folder contains at most as many particles as the batch property of the dBoxes object, as registered in the settings card. The batch folders are named batch_N, where N is a multiple of the batch property. As an exception, batch folder 0 stores the particles whose tags range from 1 to batch-1.
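The naming rule above can be tried out with a few lines of plain MATLAB. This is only a sketch: it assumes that a tag is stored in the folder whose number is the largest multiple of the batch size not exceeding the tag, which is inferred from the stated range of batch folder 0 and is not confirmed by the dBoxes code. The printed folder name format is also illustrative, as the real names may be padded differently.

```matlab
% Sketch: which batch folder would hold a given particle tag.
% Assumption: tag t lives in the folder numbered with the largest
% multiple of the batch size that does not exceed t.
batch = 200;                       % batch size, as set in the settings card
tags  = [1 199 200 399 400 1000];  % example tags

folderNumber = floor(tags / batch) * batch;

for k = 1:numel(tags)
    % the 'batch_%d' format is illustrative; actual padding may differ
    fprintf('tag %4d -> batch_%d\n', tags(k), folderNumber(k));
end
```

Under this assumption, batch folder 0 ends up with tags 1 to 199 and every other folder with exactly 200 consecutive tags, which matches the exception described above.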

Merge several data folders into a single dBox folder

If you cannot merge all the data folders independently into a single data folder, you'll have to enter all the particles one by one using dBoxes.enterParticle; an example is shown below.

Imagine that you want to merge a cell array of folders (each one associated with a table with the same number of particles). You can generate an example with the tutorial tool:

% create several tutorial folders

for i=1:10
    testFolder = ['testfolder',num2str(i)];
    dtutorial(testFolder,'M',100);
    dataFolder{i} = [testFolder,'/data'];
    tbl{i}        = [testFolder,'/real.tbl'];
end

or use your own cell arrays dataFolder and tbl.

% create a dBoxes folder
d = dBoxes('new', 'data.Boxes');
d.padding = 7;
d.batch = 200;  % each subfolder will have only 200 particles

% update the representation of the object on disk
d.updateSettingsField('padding',7);
d.updateSettingsField('batch',200);

% create a table that will index all particles in the final dBoxes folder;
% it starts empty because tbl{i} holds file names, not table contents
globalTable = [];

disp('Merging folders.');
timeStart = clock();
for i=1:length(dataFolder)
    
    f = ddinfo(dataFolder{i},'v',0);
    tableForFolder = dread(tbl{i}); 
    
    % ensures the individual tables are sorted in ascending order of the
    % tags
    tableForFolder = dynamo_table_sort(tableForFolder);
    
    for itag = 1:f.N
        
        % file of particle in original folder
        file = dynamo_tag2file(f.tags(itag),f,0);
        
        %disp(file);
        
        % the file is transferred as copy, without passing the 
        % particle map into memory
        d.enterParticle(file,'directcopy',1);
        
    end
    
    % rearrange the tags in the table corresponding to a single folder
    tableForFolder(:,1) = [(size(globalTable,1)+1):(size(globalTable,1)+f.N)]';
    
    globalTable = [globalTable;tableForFolder];
    
    disp(sprintf('finished merging folder %d: %s',i,dataFolder{i}));
end
timeFinish = etime(clock(),timeStart);
disp('done merging. Averaging step.');

%%
o = daverage('data.Boxes','t',globalTable);

disp(sprintf('Seconds for creation of dBoxes %f',timeFinish));
disp('Show average');
dview(o);
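
The tag renumbering inside the merge loop can be checked in isolation, without Dynamo. The sketch below uses two small made-up matrices in place of real tables (the variable tables and its values are purely illustrative); only the first column, which holds the tags, matters here.

```matlab
% Toy version of the tag renumbering used in the merge loop.
% Each "table" is a small matrix whose first column holds the tags.
tables = { [1 10; 2 20; 3 30], [1 40; 2 50] };  % two folders: 3 and 2 particles

globalTable = [];
for i = 1:numel(tables)
    t = tables{i};
    n = size(t,1);                      % number of particles in this folder
    offset = size(globalTable,1);       % particles merged so far
    t(:,1) = (offset+1 : offset+n)';    % shift tags past those already merged
    globalTable = [globalTable; t];
end

disp(globalTable);
```

The first column of the merged table runs consecutively over all folders, while the remaining columns are carried over unchanged. This only matches the particles on disk if they are entered into the dBoxes folder in the same order as the rows of each sorted table, as the merge script does.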