Hi Ben,
Yes, this is exactly the idea behind the hybrid parallel code of 
RELION. Note, however, that some steps are not threaded, e.g. the 
initial noise estimation, the estimation of angular accuracies and the 
reconstruction steps. The expectation (i.e. alignment) step is 
threaded, though, and should therefore use more than 100% CPU. As this 
is usually the slowest step, multi-threading should still gain you speed.
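
For example, to run 8 MPI processes with 8 threads each on a 64-core 
machine (a sketch only; substitute your own job type and options):

mpirun -n 8 relion_refine_mpi --j 8 [your usual refinement options]

If the threads kick in, top should show each relion_refine_mpi process 
at up to 800% CPU during the expectation step; processes stuck at 100% 
mean only one thread per MPI rank is running.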
HTH,
S

On 05/06/2015 11:05 PM, Benoît Zuber wrote:
> Hi Sjors,
>
> Related to this question: is it possible to combine parallel MPI jobs with multithreading? We have a 64-core workstation with 128 GB RAM, and for some memory-intensive steps we thought we would run 8 MPI processes with 8 threads each, so we could combine good computing power with limited RAM needs. But as someone reported the other day, it seems that each MPI process was using a single thread: top showed each MPI process at 100%, not 800%, CPU. I am not sure whether this has any influence, but we do not use a queuing system.
>
> Thanks
> Ben
>
>
> On 6 May 2015, at 18:36, Sjors Scheres <[log in to unmask]> wrote:
>
> Chris,
> The upcoming 1.4 release should be more conservative in its memory use when reading in large movie STAR files.
> If you are the only person using each node, and only one job is running on it, then yes: you should be able to use the full 32 GB. For large refinements, you might consider finding a cluster with 64 GB nodes. That is what we typically use for our ribosome structures, so it should be enough. I think we've also seen problems using our six-year-old nodes with 32 GB.
> HTH,
> S
>
>
>> On 05/05/2015 04:13 PM, Christopher Akey wrote:
>> Users and Sjors:
>>
>> After looking at a post on this issue from March 2015, this seems to be a memory problem.
>>
>> For the realign_movie_frames step, even though I specified --j 12 threads and 6 nodes
>> for a total of 72 threads, the job ran on 6 nodes using only one core per node, based on top output for each node,
>> so the memory available per node was 32 GB (2.7 GB/core).
>>
>> However, although no intermediate files are produced, the job still handles a large amount of data, since the input
>> movie STAR file has a huge number of entries/lines:
>>
>> grep @Particle cl_1-3_movie.star | wc -l
>> 7578396
>>
>> This seems very inefficient.
>>
>> I have 12 frames per movie at about 2.2 e/Å²/frame, so I am averaging over 3 frames, as the particles are fairly dense and big.
>>
>> I don't see how to get any more memory for this job.
>>
>> Under these conditions, is the job able to use all of the 32 GB per node, given that only one core on each node is active at 100%?
>>
>> C Akey
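
On that last question: CPU and memory limits are independent, so an MPI 
rank running on a single core can still allocate all of a node's RAM. A 
quick way to verify this on each node (a general Linux sketch, not 
specific to RELION):

top       # press Shift+M to sort by memory; the RES column shows each process's resident memory
free -g   # node-wide memory use, in GB

If the RELION process's RES value approaches the node's total, the job 
is indeed using the full 32 GB.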

-- 
Sjors Scheres
MRC Laboratory of Molecular Biology
Francis Crick Avenue, Cambridge Biomedical Campus
Cambridge CB2 0QH, U.K.
tel: +44 (0)1223 267061
http://www2.mrc-lmb.cam.ac.uk/groups/scheres