From the error message, "MlOptimiserMpi::initialiseWorkLoad: at least 2 MPI processes are required," it appears that Relion is seeing only a single MPI process, i.e. your cluster is not actually allocating it the processes you asked for. Possibly this means that whichever MPI flavour you have is not respecting the -np # flag. If that's the case, it probably expects the value to be set in an environment variable instead.
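
A quick way to test the -np hypothesis is to launch something trivial and count the copies. This is only a sketch: count_launched is a helper name I made up, and it only tells you whether the launcher starts the requested number of processes, not whether those processes can see each other through MPI.

```shell
# Hypothetical helper: ask mpirun for N copies of `hostname` and count the lines that come back.
count_launched() {
    requested=$1
    if ! command -v mpirun >/dev/null 2>&1; then
        echo "mpirun not found on PATH"
        return 0
    fi
    # One line of output per launched process; a failed launch counts as 0.
    launched=$(mpirun -np "$requested" hostname 2>/dev/null | wc -l) || true
    echo "requested $requested, launched $launched"
}

count_launched 4
```

If the two numbers differ, the launcher or the scheduler is overriding -np. If they agree and Relion still complains, the MPI mismatch question further down becomes the more likely culprit.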

Do you have a queue? If so, you should look at the various templates for submission scripts on the Relion wiki and see whether the magic #! decorators need to be tailored to your cluster.

Did you have the same flavour of MPI in the PATH and LD_LIBRARY_PATH when you compiled Relion as you are using now?
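
Something like the following would show whether the launcher and the binary agree. Treat it as a sketch: report_mpi is a made-up helper name, it assumes relion_refine_mpi is on your PATH and dynamically linked (ldd lists nothing useful for a static build), and the exact output of mpirun --version differs between MPI flavours.

```shell
# Hypothetical helper: show which mpirun is first on PATH and which MPI library Relion links against.
report_mpi() {
    if launcher=$(command -v mpirun); then
        echo "launcher: $launcher"
        mpirun --version 2>&1 | head -n 2
    else
        echo "launcher: mpirun not found on PATH"
    fi
    if relion_bin=$(command -v relion_refine_mpi); then
        echo "binary: $relion_bin"
        # Shared libraries resolved at run time; look for the MPI one.
        ldd "$relion_bin" | grep -i mpi || echo "no MPI library listed by ldd"
    else
        echo "binary: relion_refine_mpi not found on PATH"
    fi
}

report_mpi
```

If the library path printed for the binary belongs to a different MPI installation than the mpirun path, the two are mismatched.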

Relion needs a special parallel environment that is usually close to Open MPI Hybrid. It doesn't request threads from the MPI environment, though, so if you have an environment that binds processes to specific cores, you might have a problem where all X threads of a process try to run on a single core. This is usually more of a problem with very recent grid engines. Certainly your thread request (--j) is wrong: --j sets the number of threads per MPI process, so --j 30 asks for 30 threads per process on machines that have only 16 cores. I assume your 16-core machines are actually dual 8-core CPUs, in which case you should put two 8-thread processes on each, unless you run out of RAM.
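
To spell out the arithmetic for that layout, here is a small sketch. The node and core counts come from your message; the two sockets per node is my assumption, and the variable names are just for illustration.

```shell
nodes=30             # from the original post
cores_per_node=16    # from the original post
sockets_per_node=2   # assumption: dual 8-core CPUs

threads=$(( cores_per_node / sockets_per_node ))   # value for --j (threads per MPI process)
procs=$(( nodes * sockets_per_node ))              # value for -np (one process per socket)

echo "mpirun -np $procs relion_refine_mpi <program_particulars> --j $threads"
```

That works out to 60 processes of 8 threads each, i.e. 480 threads in total, matching the 480 cores you have.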

Intel has good forums that the developers watch, so you might have more success finding the one for your cluster management software and asking there. Every cluster environment is unique, and it often requires specific tweaks to get any program running properly.

Robert

-- 
Robert McLeod, Ph.D.
Center for Cellular Imaging and Nano Analytics (C-CINA)
Biozentrum der Universität Basel
Mattenstrasse 26, 4058 Basel
Office: +41.061.387.3225

From: Collaborative Computational Project in Electron cryo-Microscopy on behalf of Abhisek Mondal
Sent: Saturday, August 20, 2016 7:31 AM
Subject: [ccpem] mpi error in relion-1.3

Hi,
    I'm trying to run a 2D classification program in relion-1.3, which is currently installed in a cluster environment.

   I'm providing "mpirun -np 1000 relion_refine_mpi <program_particulars> --j 30" to run the program on 30 nodes (with 30*16 = 480 threads).

   But every time the program crashes with the following error message:
"

MlOptimiserMpi::initialiseWorkLoad: at least 2 MPI processes are required, otherwise use the sequential program
File: src/ml_optimiser_mpi.cpp line: 173
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
"

    Can you please help me figure out what I might be doing wrong?

Thank you



--
Abhisek Mondal
Research Fellow
Structural Biology and Bioinformatics Division
CSIR-Indian Institute of Chemical Biology
Kolkata 700032
INDIA