Memory/speed balance during ccmatrix computation

A ccmatrix is computed in batches. If the data set contains N particles and the user sets a batch size of M particles, Dynamo computes the N x N matrix in blocks, each of size M x M. For each block, Dynamo reads one set of M particles and then a second, different set of M particles (unless the block lies on the diagonal, in which case both sets are the same), aligns them once using the table, aligns the missing wedges, and then computes the M x M correlations.
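The block layout described above can be sketched in a few lines of Python. This is only an illustration, not Dynamo's actual code: the function name `block_ranges` is hypothetical, and the sketch assumes that the matrix is symmetric so that only blocks on or above the diagonal need to be visited.

```python
def block_ranges(n_particles, batch):
    """Yield (rows, cols) index ranges, one pair per block of the matrix.

    Hypothetical helper for illustration only. Assumption: the matrix is
    symmetric, so blocks below the diagonal are skipped.
    """
    starts = range(0, n_particles, batch)
    for r in starts:
        rows = range(r, min(r + batch, n_particles))
        for c in starts:
            if c < r:
                continue  # below the diagonal: skipped under the symmetry assumption
            cols = range(c, min(c + batch, n_particles))
            yield rows, cols


# Example: N = 10 particles with a batch of M = 4.
blocks = list(block_ranges(10, 4))
for rows, cols in blocks:
    kind = "diagonal" if rows == cols else "off-diagonal"
    print(f"rows {rows.start}-{rows.stop - 1}, cols {cols.start}-{cols.stop - 1} ({kind})")
```

Note that when N is not a multiple of M, the last block in each row and column is smaller than M x M.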

This creates a trade-off. If M is very small (in the extreme case, M = 1), many more blocks are needed than necessary: a particle i appears in many independent blocks and is read from disk and aligned each time (in the extreme case M = 1, each particle is read N times). We don't want this. On the other hand, if M is too large (in the extreme case M = N, where each particle is read from disk and aligned just once), you might need a huge amount of RAM to hold all the aligned particles and missing wedges.
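To get a feeling for this trade-off, the following Python sketch counts how often each particle is read for a given batch size. It is a rough model under an assumption, not a description of Dynamo's internals: it assumes that only blocks on or above the diagonal are computed, which reproduces the two extreme cases quoted above (N reads for M = 1, one read for M = N). The function names are hypothetical.

```python
import math


def reads_per_particle(n_particles, batch):
    """Number of blocks in which a given particle is read and aligned.

    Assumption: only blocks on or above the diagonal are computed, so a
    particle takes part in one block per block-row of the matrix.
    """
    return math.ceil(n_particles / batch)


def particles_in_memory(batch):
    """Upper bound on the particles held in RAM for one off-diagonal block."""
    return 2 * batch


n = 1000
for m in (1, 10, 100, 1000):
    print(f"M = {m:4d}: {reads_per_particle(n, m):4d} reads per particle, "
          f"up to {particles_in_memory(m)} particles in memory per block")
```

Increasing M lowers the number of reads per particle but raises the number of particles (and missing wedges) that must fit in memory at once.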

The user can use the Memory menu tab in the dynamo_ccmatrix_project_manager to get some hints about how to tune the batch parameter, which corresponds to M. The tab reports the expected size in memory of one M x M block.

Note that if a parallel computing destination has been chosen, each block will be assigned to a different core. This has to be taken into account when deciding the size of the batch. For instance, if you are going to use 32 cores in a workstation, each core will be processing a different block at the same time. If your batch size leads to blocks of 4 GB, you are assuming that your workstation has 128 GB of RAM (32 x 4 GB).
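The arithmetic of this example can be written as a small Python sketch. The function names are hypothetical, and the per-block size is taken as given (for instance, the value reported in the Memory tab); the sketch ignores any memory used by the operating system or other processes.

```python
def total_ram_needed_gb(n_cores, block_gb):
    """RAM needed when every core holds one block in memory at the same time."""
    return n_cores * block_gb


def max_block_gb(ram_gb, n_cores):
    """Largest per-block size that fits if all cores are busy simultaneously."""
    return ram_gb / n_cores


# The example from the text: 32 cores, 4 GB per block.
print(total_ram_needed_gb(32, 4))

# The reverse question: with 64 GB of RAM and 32 cores, how large may a block be?
print(max_block_gb(64, 32))
```

If the result of the second calculation is smaller than the block size given by your batch parameter, either reduce the batch or use fewer cores.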