Errors of this type have to do with connection problems between the
different nodes. Is your hardware stable?
S
On 06/05/2013 01:11 PM, Blaas Dieter wrote:
> Hi Sjors and all,
>
> I have been trying very hard to get a 3DR of a small dataset.
> Usually, 2D classification works fine (if grouping at least 25
> images), but when running the 3DR using just one of the classes, the
> program always crashes, regardless of changes to the parameters
> (larger groups, particle diameter, symmetry). I usually get the
> following error messages (the same occurs on the cluster (CentOS) and
> on the PC (Ubuntu)):
>
> 1/ 1 sec
> ............................................................~~(,_,">
> [oo]
> 12/ 12 sec
> ............................................................~~(,_,">
> 14.70/14.70 min
> ............................................................~~(,_,">_,">
> 000/??? sec ~~(,_,"> [oo] 3: MPI_ERR_TRUNCATE: message truncated
> 3: MPI_ERR_TRUNCATE: message truncated
> terminate called after throwing an instance of 'RelionError'
> [ubuntu:07291] *** Process received signal ***
> [ubuntu:07291] Signal: Aborted (6)
> [ubuntu:07291] Signal code: (-6)
> [ubuntu:07291] [ 0] /lib/libpthread.so.0(+0xf8f0) [0x7fdc4cab88f0]
> [ubuntu:07291] [ 1] /lib/libc.so.6(gsignal+0x35) [0x7fdc4c756b25]
> [ubuntu:07291] [ 2] /lib/libc.so.6(abort+0x180) [0x7fdc4c75a670]
> [ubuntu:07291] [ 3] /usr/lib/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x115) [0x7fdc4cfa98c5]
> [ubuntu:07291] [ 4] /usr/lib/libstdc++.so.6(+0xcacf6) [0x7fdc4cfa7cf6]
> [ubuntu:07291] [ 5] /usr/lib/libstdc++.so.6(+0xcad23) [0x7fdc4cfa7d23]
> [ubuntu:07291] [ 6] /usr/lib/libstdc++.so.6(+0xcae1e) [0x7fdc4cfa7e1e]
> [ubuntu:07291] [ 7] /usr/local/relion-1.2-beta12/lib/librelion-1.2.so.1(_ZN7MpiNode16report_MPI_ERROREi+0x141) [0x7fdc4ea34d11]
> [ubuntu:07291] *** End of error message ***
> --------------------------------------------------------------------------
>
> mpirun noticed that process rank 3 with PID 7291 on node ubuntu exited
> on signal 6 (Aborted).
>
>
> Any idea?
> Thanks, Dieter
>
>
--
Sjors Scheres
MRC Laboratory of Molecular Biology
Francis Crick Avenue, Cambridge Biomedical Campus
Cambridge CB2 0QH, U.K.
tel: +44 (0)1223 267061
http://www2.mrc-lmb.cam.ac.uk/groups/scheres