DBox folder

From Dynamo
Revision as of 14:44, 12 October 2017 by Daniel Castaño (talk | contribs)

The dBox folder is an alternative way to store particle files. When you have, say, 100,000 files, it is a bad idea to put them all in the same folder.

Instead of leaving all particles in a single data folder, a dBox folder contains a subdirectory hierarchy that spreads the particles over different subfolders. This happens in a way that is transparent to the user.

Dynamo uses a class called dBoxes to manage this kind of generic data container. Check the different command line options with:

help dBoxes

Convert a normal data folder into a dBox folder

d = dBoxes.convertSimpleData(<foldername>,<dBoxes folder name>);

Internal structure of the dBoxes folder

The contents of a dBoxes folder are:

  • a file called settings.card
  • a file called tags.em
  • several folders called batch_N

Settings file

It is a plain text file that can be edited by hand. It defines several properties, principally the padding convention used for the particle tags and the batch size.
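
As an illustration of what a padding convention means, the following minimal sketch zero-pads a tag to a fixed number of digits. It uses only base MATLAB; the values of padding and tag are chosen for illustration, and how Dynamo builds the actual particle file name from the padded tag is not described on this page, so that part is left out.

```matlab
% Illustration only: zero-pad a particle tag to a fixed width.
padding = 7;     % assumed value of the padding property
tag     = 42;    % hypothetical particle tag

% build a format such as '%07d' from the padding value
fmt       = ['%0', num2str(padding), 'd'];
paddedTag = sprintf(fmt, tag);

disp(paddedTag);
```

With a padding of 7, every tag up to 9999999 is written with the same number of characters, so the files sort in tag order.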

Batch folders

Each batch folder contains at most as many particles as the batch property of the dBoxes object, as registered in the settings card. The batch folders are named batch_N, where N is a multiple of the batch property. As an exception, batch folder 0 stores the particles whose tags range between 1 and batch-1.
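
The naming rule above can be sketched in a few lines of base MATLAB. This is an inference from the description, not code taken from dBoxes: the helper batchIndexOf is hypothetical, the batch size of 200 is an assumed value, and the printed folder names do not attempt to reproduce any padding that Dynamo may apply to N.

```matlab
% Hypothetical helper (not part of Dynamo): index N of the batch_N folder
% that would hold a given tag, assuming N is the largest multiple of
% 'batch' that does not exceed the tag.
batch        = 200;                               % assumed batch size
batchIndexOf = @(tag) floor(tag ./ batch) .* batch;

tags = [1 199 200 201 399 400];                   % sample tags
for t = tags
    fprintf('tag %4d -> batch_%d\n', t, batchIndexOf(t));
end
```

Under this reading, batch folder 0 holds tags 1 to 199 (one particle fewer than the others, because there is no tag 0), and every later folder holds exactly 200 consecutive tags.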

Merge several data folders into a single dBox folder

If you cannot merge all the data folders into a single data folder beforehand, you will have to enter the particles one by one using dBoxes.enterParticle, as in the example below.

Imagine that you want to merge a cell array of folders, each one associated with a table that has one row per particle in that folder. You can generate an example with the tutorial tool:

% create several tutorial folders
dataFolder = cell(1,10);
tbl        = cell(1,10);
for i = 1:10
    testFolder    = ['testfolder',num2str(i)];
    dtutorial(testFolder,'M',100);
    dataFolder{i} = [testFolder,'/data'];
    tbl{i}        = [testFolder,'/real.tbl'];
end

or use your own cell arrays dataFolder and tbl.

% create a dBoxes folder
d = dBoxes('new', 'data.Boxes');
d.padding = 7;
d.batch = 200;  % each subfolder will have only 200 particles

% updates the representation of the object on disk
d.updateSettingsField('padding',7);
d.updateSettingsField('batch',200);

% creates an empty table that will refer to all particles in the final data folder;
% tbl{end} is only a file name, so the table is read to get its column count
globalTable = zeros(0,size(dread(tbl{end}),2));

disp('Merging folders.');
timeStart = clock();
for i=1:length(dataFolder)
    
    f = ddinfo(dataFolder{i},'v',0);
    tableForFolder = dread(tbl{i}); 
    
    % ensures the individual tables are sorted in ascending order of the
    % tags
    tableForFolder = dynamo_table_sort(tableForFolder);
    
    for itag = 1:f.N
        
        % file of particle in original folder
        file = dynamo_tag2file(f.tags(itag),f,0);
        
        %disp(file);
        
        % the file is copied directly, without loading the
        % particle map into memory
        d.enterParticle(file,'directcopy',1);
        
    end
    
    % rearrange the tags in the table corresponding to a single folder
    tableForFolder(:,1) = [(size(globalTable,1)+1):(size(globalTable,1)+f.N)]';
    
    globalTable = [globalTable;tableForFolder];
    
    disp(sprintf('finished merging folder %d: %s',i,dataFolder{i}));
end
timeFinish = etime(clock(),timeStart);
disp('done merging. Averaging step.');

%%
o = daverage('data.Boxes','t',globalTable);

disp(sprintf('Seconds for creation of dBoxes %f',timeFinish));
disp('Show average');
dview(o);
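
The tag renumbering in the merge loop can be checked in isolation. The sketch below is a toy stand-in that makes no Dynamo calls: the folder count, particle count, and column count are arbitrary illustrative values, and toyTable plays the role of globalTable. It reproduces only the bookkeeping of the first column and then verifies that the merged tags are consecutive.

```matlab
% Toy stand-in for the merge loop: only the tag bookkeeping, no Dynamo calls.
nFolders   = 3;   % hypothetical number of data folders
nPerFolder = 4;   % hypothetical number of particles per folder
nCols      = 5;   % arbitrary column count, for illustration only

toyTable = zeros(0, nCols);
for i = 1:nFolders
    % each per-folder table starts with its own local tags 1..nPerFolder
    t      = zeros(nPerFolder, nCols);
    t(:,1) = (1:nPerFolder)';

    % same renumbering idea as in the merge loop: continue after existing rows
    firstTag = size(toyTable,1) + 1;
    t(:,1)   = (firstTag:(firstTag + nPerFolder - 1))';

    toyTable = [toyTable; t]; %#ok<AGROW>
end

% the merged tags should be exactly 1, 2, ..., total number of particles
tagsOk = isequal(toyTable(:,1), (1:size(toyTable,1))');
fprintf('%d rows, consecutive tags: %d\n', size(toyTable,1), tagsOk);
```

The same isequal test applied to globalTable(:,1) before calling daverage would catch a table whose row count does not match the number of particles entered.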