Sjors and users:
With regard to memory issues and movie ptcl processing:
We recently ran a movie alignment job (realign_movie_frames) with the
command below, on about 35,000 ptcls at a box size of 320, and it ran on our
cluster in ~15 hrs on all the cores without a memory problem for the full
job!
mpiexec -bynode -n 12 `which relion_refine_mpi` --o Refine3D/run1_ct26
--continue Refine3D/lsu+70S/run1_ct12_it026_optimiser.star --oversampling
1 --healpix_order 2 --auto_local_healpix_order 4 --offset_range 5
--offset_step 2 --realign_movie_frames e2multi1+3_movie.star
--movie_frames_running_avg 5 --sigma_off 2 --skip_rotate --skip_maximize
--j 8
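(As a quick sanity check on how -bynode spreads the 12 ranks over the nodes,
something along these lines can be run first; this is a generic Open MPI
check, nothing RELION-specific:)

# Illustration only: show the host each of the 12 ranks lands on, count ranks per host
mpiexec -bynode -n 12 hostname | sort | uniq -c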
In the .out file:
=== RELION MPI setup ===
+ Number of MPI processes = 12
+ Number of threads per MPI process = 8
+ Total number of threads therefore = 96
.
.
.
Expanding current model for movie frames...
Auto-refine: Iteration= 27
Auto-refine: Resolution= 9.67442 (no gain for 0 iter)
Auto-refine: Changes in angles= 999 degrees; and in offsets= 999 pixels
(no gain for 0 iter)
Auto-refine: Refinement has converged, entering last iteration where two
halves will be combined...
Auto-refine: Angular step= 15 degrees; local searches= false
Auto-refine: Offset search range= 6 pixels; offset step= 1.5 pixels
CurrentResolution= 9.67442 Angstroms, which requires orientationSampling
of at least 3.07692 degrees for a particle of diameter 360 Angstroms
Oversampling= 0 NrHiddenVariableSamplingPoints= 49
OrientationalSampling= 15 NrOrientations= 1
TranslationalSampling= 1.5 NrTranslations= 49
=============================
Estimated memory for expectation step > 0.472081 Gb, available memory =
12 Gb.
Estimated memory for maximization step > 0.669746 Gb, available memory =
12 Gb.
Expectation iteration 27
14.39/14.39 hrs
............................................................~~(,_,">
Auto-refine: Skipping maximization step, so stopping now...
However, I am now running a job with ~42K ptcls at a box size of 348 on the
same cluster, and I had to divide the job into two halves (I made two
independent movie .star files from the particle .star and movie .mrcs files;
a sketch of one way to do such a split is just below) to get it to run at
all. The half job has now run 48 hrs with no sign of progressing past the
expansion step; the .out excerpt follows the sketch.
So something must be wrong, but I am at a loss what to try. The job is using
only one core on each of the 6 nodes, and that single thread/core is using
30 Gb of the 32 Gb total on each node. Any suggestions as to why there is
such a big difference? At this rate it will take a very long time, and there
is of course no indication whatsoever of how far along the job is in the
expansion phase (I am not sure either what it is doing at this step).
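For reference, here is a minimal sketch of one way such a split can be made
directly on the movie .star file, keeping all frames of a given particle in
the same half. This is only an illustration, not what RELION itself does;
the input/output file names are placeholders.

# Illustration only, not what RELION itself does. Placeholder file names;
# assumes a standard single-block .star header with "#N" column labels.
star=all_particles_movie.star        # input movie .star (placeholder)

# column number of rlnMicrographName and the last header (label) line
col=$(awk '/^_rlnMicrographName/ {sub(/.*#/, ""); print; exit}' "$star")
hdr=$(grep -n '^_rln' "$star" | tail -1 | cut -d: -f1)

# copy the header into both halves
head -n "$hdr" "$star" | tee half1_movie.star > half2_movie.star

# list the unique movies; the first half of them defines half 1
awk -v c="$col" -v h="$hdr" 'FNR > h && NF > 2 {print $c}' "$star" | sort -u > movies.txt
head -n $(( $(wc -l < movies.txt) / 2 )) movies.txt > movies1.txt

# particle rows whose movie is in movies1.txt go to half 1, the rest to half 2
awk -v c="$col" -v h="$hdr" '
    FNR == NR { keep[$1] = 1; next }
    FNR > h && NF > 2 { print >> (($c in keep) ? "half1_movie.star" : "half2_movie.star") }
' movies1.txt "$star"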
=== RELION MPI setup ===
+ Number of MPI processes = 6
+ Number of threads per MPI process = 12
+ Total number of threads therefore = 72
+ Master (0) runs on host = compute-0-11.local
+ Slave 1 runs on host = compute-0-6.local
+ Slave 4 runs on host = compute-0-8.local
=================
+ Slave 5 runs on host = compute-0-10.local
+ Slave 2 runs on host = compute-0-7.local
+ Slave 3 runs on host = compute-0-9.local
Expanding current model for movie frames...
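For completeness, the per-node usage quoted above (one busy core and ~30 of
32 Gb) was read off top on each node; something along these lines, using the
node names from the MPI setup above (illustration only):

# Illustration only: quick look at memory and the relion processes per node
for node in compute-0-6.local compute-0-7.local compute-0-8.local \
            compute-0-9.local compute-0-10.local compute-0-11.local; do
    echo "===== $node ====="
    ssh "$node" 'free -g; top -b -n 1 | grep relion | head -2'
done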
C Akey
>>
>>
>> On 6 May 2015, at 18:36, Sjors Scheres <[log in to unmask]> wrote:
>>
>> Chris,
>> The upcoming 1.4 release should be more conservative in memory use when
>> reading in large STAR files of the movies.
>> If you are the only person using each node, and only 1 job is running on
>> it, then yes: you should be able to take the full 32Gb. For doing large
>> refinements, you might consider finding a cluster with 64Gb nodes.
>> That's what we typically use for our ribosome structures, so that should
>> be enough. I think we've also seen problems using our 6-yr-old nodes
>> with 32 Gb.
>> HTH,
>> S
>>
>>
>>> On 05/05/2015 04:13 PM, Christopher Akey wrote:
>>> Users and Sjors:
>>>
>>> After looking at a post on this from March 2015, it seemed to be
>>> a memory issue.
>>>
>>> For the realign_movie_frames step, even though I specified --j 12 threads
>>> on 6 nodes for a total of 72 threads, the job ran on 6 nodes using only
>>> 1 core per node (based on top for each node), so the amount of memory
>>> available per node was 32 Gb (2.7 Gb/core).
>>>
>>> However, since no intermediate files are produced, it is dealing with a
>>> large amount of data: the input movie star file has a huge number of
>>> entries/lines:
>>>
>>> grep @Particle cl_1-3_movie.star | wc -l
>>> 7578396
>>>
>>> This seems very inefficient.
>>>
>>> I have 12 frames per movie at about 2.2 e/A2/frame, so I am averaging
>>> over 3 frames, as the ptcls are fairly dense and big.
>>>
>>> I don't see how to get any more memory for this job.
>>>
>>> Under these conditions, is the job able to use all of the 32 Gb per node,
>>> since only one core on each node is active at 100%?
>>>
>>> C Akey
>
> --
> Sjors Scheres
> MRC Laboratory of Molecular Biology
> Francis Crick Avenue, Cambridge Biomedical Campus
> Cambridge CB2 0QH, U.K.
> tel: +44 (0)1223 267061
> http://www2.mrc-lmb.cam.ac.uk/groups/scheres
>