To avoid running out of RAM, I would suggest launching only a single MPI
process on each computing node and then using as many threads as you have
cores. You should also monitor the computing nodes (e.g. using top) to
confirm that what you think should happen is actually the case.
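As a sketch of that layout (the node and core counts here are hypothetical; adjust them to your cluster, and note that `--map-by ppr:1:node` is an Open MPI option, while `--j` sets the thread count in RELION commands such as `relion_refine_mpi`):

```shell
# Hypothetical cluster: 4 nodes with 16 physical cores each.
NODES=4
CORES_PER_NODE=16

# One MPI rank per node; each rank uses all cores on its node as threads.
THREADS=$CORES_PER_NODE

# Example launch line (fill in your own input/output arguments):
echo "mpirun -np $NODES --map-by ppr:1:node relion_refine_mpi --j $THREADS ..."
```

With this layout, `top` on each node should show one RELION process using roughly all cores, rather than several competing MPI ranks each trying to allocate the full data set in RAM.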
S
On 10/21/2014 09:46 AM, SUBSCRIBE CCPEM SIddhanta wrote:
> That error 'target and source had different sizes' occurred once, and for the first time. It never occurred again. So maybe that was just a glitch?
>
> I tried many configurations as suggested by the others, with the command --dont_combine_weights_via_disc, but to no avail. Segmentation fault still occurs. I also experimented with a significantly smaller data set, but I get the same results.
>
> Now, along with the segmentation fault, I am also getting this message.
>
> [cryo-c7][[6502,1],29][btl_tcp_frag.c:215:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
> --------------------------------------------------------------------------
> mpiexec noticed that process rank 13 with PID 56392 on node cryo-c5 exited on signal 11 (Segmentation fault).
>
> The segmentation fault is always occurring at the Writing out polished particles stage.
>
> Qiu-Xing Jiang, I'm not sure if it's still an addressing issue or not. But the same segmentation fault is still occurring.
>
--
Sjors Scheres
MRC Laboratory of Molecular Biology
Francis Crick Avenue, Cambridge Biomedical Campus
Cambridge CB2 0QH, U.K.
tel: +44 (0)1223 267061
http://www2.mrc-lmb.cam.ac.uk/groups/scheres