Hi all,
I recently ran into an MPI error when running 3D auto-refine in RELION 3.0-beta. Here is the full error message:
1: Message truncated
1: Message truncated, error stack:
PMPI_Wait(203).....: MPI_Wait(request=0x7ffd62293d80, status=0x7ffd62293df0) failed
MPIR_Wait_impl(100):
do_cts(553)........: Message truncated; 9891936 bytes received but buffer size is 4304016
in: /opt/relion/relion-3.0_beta/src/mpi.cpp, line 307
=== Backtrace ===
/opt/relion/x64/relion3.0-beta/bin/relion_refine_mpi(_ZN11RelionErrorC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES7_l+0x6d) [0x50999d]
/opt/relion/x64/relion3.0-beta/bin/relion_refine_mpi(_ZN7MpiNode16report_MPI_ERROREi+0xdb) [0x55643b]
/opt/relion/x64/relion3.0-beta/bin/relion_refine_mpi(_ZN7MpiNode15relion_MPI_RecvEPvliiiiR10MPI_Status+0x193) [0x556793]
/opt/relion/x64/relion3.0-beta/bin/relion_refine_mpi(_ZN14MlOptimiserMpi16compareTwoHalvesEv+0x3f3) [0x4403f3]
/opt/relion/x64/relion3.0-beta/bin/relion_refine_mpi(_ZN14MlOptimiserMpi7iterateEv+0x27e) [0x44d82e]
/opt/relion/x64/relion3.0-beta/bin/relion_refine_mpi(main+0x206b) [0x42f95b]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7fe95b048830]
/opt/relion/x64/relion3.0-beta/bin/relion_refine_mpi(_start+0x29) [0x431eb9]
==================
ERROR:
Encountered an MPI-related error, see above. Now exiting...
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 1
For context: I first ran 3D auto-refine on a set of particles, and it completed without problems. I then ran 3D classification into 5 classes and selected one of those classes for further refinement; that refinement produced the error above. 3D auto-refine on each of the other 4 classes ran fine. The error persists when I change the number of MPI processes used for the refinement. Has anyone seen a similar issue before?
Thanks a lot!
Yan