Hi,
Do you use BeeGFS?
Certain versions of the BeeGFS kernel module seem to have problems
triggered by heavy I/O loads during motion correction and Bayesian polishing.
Please see this thread:
https://www.jiscmail.ac.uk/cgi-bin/wa-jisc.exe?A2=ind2009&L=CCPEM&P=R46939.
This issue has been plaguing LMB for many years...
Best regards,
Takanori Nakane
> Has anyone else seen an issue with RELION's own motion correction
> implementation, wherein after processing some of the images, the I/O
> activity goes dead and the run slips into a zombie mode with no further
> processing?
>
> We've found that after cancelling the job and restarting, the remainder of
> the images can then get processed.
>
> I don't know exactly what factors can increase the chance of this
> happening, but it seems increasing the number of threads per MPI rank
> makes the job run slower and makes the zombie mode more likely.
>
> David
> --
> David Hoover, Ph.D.
> Computational Biologist
> High Performance Computing Services,
> Center for Information Technology,
> National Institutes of Health
> 12 South Dr., Rm 2N207
> Bethesda, MD 20892, USA
> TEL: (+1) 301-435-2986
>
> ########################################################################
>
> To unsubscribe from the CCPEM list, click the following link:
> https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCPEM&A=1
>
> This message was issued to members of www.jiscmail.ac.uk/CCPEM, a mailing
> list hosted by www.jiscmail.ac.uk, terms & conditions are available at
> https://www.jiscmail.ac.uk/policyandsecurity/
>