Dear All,
I have been trying to run a local symmetry search in RELION.
In our environment I tried both RELION 3.0 and 3.1, using the following submission script:
#!/bin/zsh
#SBATCH --partition all
#SBATCH --time 7-00:00
#SBATCH --constraint E5-2640&256G
#SBATCH --ntasks 80
#SBATCH --cpus-per-task 1
#SBATCH --error run.err
#SBATCH --output run.out
#SBATCH --job-name relion3i
#SBATCH --open-mode append
#SBATCH --no-requeue
#error SBATCH --export=NONE
#error SBATCH --export=LD_PRELOAD=""
export LD_PRELOAD=""
# run the job
echo "### start : $(date) ###"
echo "### master: $(hostname) ###"
echo "### jobid : ${SLURM_JOB_ID} ###"
source /etc/profile.d/modules.sh
module use /beegfs/cssb/software/modulefiles/em
module load relion/3.0-icc2020-altcpu
# check myself if the relion gpu flag is off
FOUND=$(grep '\-\-gpu' ${0})
if [ $? -ne 1 ]; then
  echo "Warning: This is a CPU submission script template!"
  echo "Warning: It seems that the 'Use GPU acceleration' flag in the Relion 'Compute' tab is enabled."
  echo "ERROR: The Relion gpu flag is turned on - exiting!!!"
  exit 1
fi
mpirun --map-by node --mca opal_warn_on_missing_libcuda 0 \
  -n 80 relion_localsym_mpi --search --i_map run_ct13_class001.mrc \
  --i_op_mask_info D1_2_masks.star --o_mask_info D1D2_iter000.star \
  --angpix 0.87 --ang_step 5
echo "### end: $(date) ###"
With this script I got the following error:
ERROR: ld.so: object 'libdlfaker.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libvglfaker.so' from LD_PRELOAD cannot be preloaded: ignored.
CUDA 10.1 loaded
pocl warning: encountered incomplete implementation in clGetDeviceInfo.c:169
[max-ferrari023:25272:0:25272] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
/beegfs/cssb/software/em/relion/3.0-icc2020-altcpu/src/matrix1d.h: [ Matrix1D<double>::resize() ]
...
458 {
459 T val;
460 if (j >= vdim)
==> 461 val = 0;
462 else
463 val = vdata[j];
464 new_vdata[j] = val;
==== backtrace (tid: 25272) ====
0 0x0000000000478adb Matrix1D<double>::resize() /beegfs/cssb/software/em/relion/3.0-icc2020-altcpu/src/matrix1d.h:461
1 0x0000000000478adb Matrix1D<double>::initZeros() /beegfs/cssb/software/em/relion/3.0-icc2020-altcpu/src/matrix1d.h:604
2 0x0000000000478adb Localsym_composeOperator() /beegfs/cssb/software/em/relion/3.0-icc2020-altcpu/src/local_symmetry.cpp:131
3 0x0000000000429c39 local_symmetry_parameters_mpi::run() /beegfs/cssb/software/em/relion/3.0-icc2020-altcpu/src/local_symmetry_mpi.cpp:93
4 0x000000000041fe15 main() /beegfs/cssb/software/em/relion/3.0-icc2020-altcpu/src/apps/localsym_mpi.cpp:10
5 0x0000000000022505 __libc_start_main() ???:0
6 0x000000000041fc29 _start() ???:0
=================================
[max-ferrari023:25272] *** Process received signal ***
[max-ferrari023:25272] Signal: Segmentation fault (11)
[max-ferrari023:25272] Signal code: (-6)
[max-ferrari023:25272] Failing at address: 0x89c8000062b8
[max-ferrari023:25272] [ 0] /lib64/libpthread.so.0(+0xf5f0)[0x2b42952455f0]
[max-ferrari023:25272] [ 1] relion_localsym_mpi(_Z24Localsym_composeOperatorR8Matrix1DIdEddddddd+0x9b)[0x478adb]
[max-ferrari023:25272] [ 2] relion_localsym_mpi(_ZN29local_symmetry_parameters_mpi3runEv+0x9e9)[0x429c39]
[max-ferrari023:25272] [ 3] relion_localsym_mpi(main+0x125)[0x41fe15]
[max-ferrari023:25272] [ 4] /lib64/libc.so.6(__libc_start_main+0xf5)[0x2b4295474505]
[max-ferrari023:25272] [ 5] relion_localsym_mpi[0x41fc29]
[max-ferrari023:25272] *** End of error message ***
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node max-ferrari023
exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
When I tried the non-parallel MPI run, I got this one instead:
ERROR: ld.so: object 'libdlfaker.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libvglfaker.so' from LD_PRELOAD cannot be preloaded: ignored.
CUDA 10.1 loaded
pocl warning: encountered incomplete implementation in clGetDeviceInfo.c:169
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 7 with PID 38907 on node max-wn013
exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
(base)
[koubatom@max-cssb-display002]/beegfs/cssb/user/koubatom/LASV_C_R3p1/dimer%
Do you have any advice on what could have gone wrong?
Thank you
Tomas