Hi all,
We are getting the following error during a relion_auto_refine run. It
happens only after the 8th iteration.
The command used:
$ mpirun -np 188 -machinefile /user/mbotte/.openmpi-farm-hostfile
`which relion_refine_mpi` --o Refine3D/140312_2/run1 --i
relion/140303_input/all_images_ori.star --particle_diameter 180 --angpix
1.36 --ref relion/Refine3D/140305/run1_class001.mrc --flatten_solvent
--sym C1 --oversampling 1 --auto_refine --split_random_halves
--low_resol_join_halves 40 --healpix_order 3 --offset_range 10
--offset_step 2 --auto_local_healpix_order 5 --norm --scale --j 1
-------------------
Expectation iteration 8
28.61/28.61 hrs
............................................................~~(,_,">
MultidimArray::write: File Refine3D/140312_2/run1_rank000187.tmp cannot
be opened for output
File: ./src/multidim_array.h line: 3945
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 187 in communicator MPI_COMM_WORLD
with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
[sky50][[24417,1],123][../../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c:216:mca_btl_tcp_frag_recv]
mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[sky57][[24417,1],171][../../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c:216:mca_btl_tcp_frag_recv]
mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[sky56][[24417,1],155][../../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c:216:mca_btl_tcp_frag_recv]
mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
--------------------------------------------------------------------------
mpirun has exited due to process rank 187 with PID 11357 on
node sky58 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
---------------------
Please help us to fix this.
With kind regards,
Mani.