Dear all,
We have a problem with the Bayesian polishing step, and we don’t understand why our job never finishes. The training step works fine, and I got results back. When I start the polishing job itself, it starts, but when it reaches the step "Performing loop over all micrographs" it aborts with either signal 6 (aborted) or signal 9 (killed). I can restart it, and it then gets a bit further, but after a while it fails again in the same step.
If I restart it often enough, the output eventually says: "motion has already been estimated for all micrographs. will recombine grams for all micrographs - none are finished"
Then comes the step fitting B/k-factors between 15 and 54 pixels, or 20 and 5.7 Angstrom. Nothing happens in this step, and after a while we can see that the job has been killed on the CPUs.
We have asked our cluster manager how to set it up with the MPI processes and the threads, and we have tried many different options. I can’t use any threads. What we see is that the job takes up all 500 GB of RAM and then starts using swap as well. As soon as it reaches swap, it is killed.
How can we prevent it from using all the RAM, and how can we get our job to finish?
Best regards
Laura