Hi,
> That seems to be specific to our slurm version and we are working on
> it but I am wondering if there is an option in relion compilation that
> can bypass dependency to libpmi on slurm (coming from openmpi).
Indeed this is specific to your cluster setup.
You can control which MPI library RELION is built against by setting
PATH so that the intended mpicc (or mpicxx) is found first when you run cmake.
Of course, you must then launch RELION with the same MPI runtime,
by setting PATH and LD_LIBRARY_PATH in your job script.
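As a rough sketch of what I mean (the MPI_HOME variable and the /path/to/openmpi prefix are placeholders I made up, so substitute the install prefix of the MPI build you actually want; the library directory may be lib64 on some systems):

```shell
# Hypothetical install prefix of the MPI you want RELION to use; adjust for your site.
MPI_HOME="${MPI_HOME:-/path/to/openmpi}"

# Put this MPI's compiler wrappers and launcher first in PATH,
# and its shared libraries first in LD_LIBRARY_PATH.
export PATH="$MPI_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$MPI_HOME/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Report which wrappers and launcher will be picked up from PATH.
for tool in mpicc mpicxx mpirun; do
    command -v "$tool" || echo "$tool: not found in PATH"
done

# Build step (run from an empty build directory inside the RELION source tree):
#   cmake .. && make
# Job script: repeat the two export lines above, then launch with the same MPI,
# for example through its own mpirun:
#   mpirun -n 4 relion_refine_mpi ...
```

The same two export lines go both before cmake and into the job script, so that the build and the run see the same MPI installation.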
Best regards,
Takanori Nakane
On 2019/09/11 21:33, Ali Siavosh-Haghighi wrote:
> Hi Takanori,
> I am wondering if there is a way that one can bypass the libpmi* linkage when relion is compiled with openmpi and slurm. We keep getting the following error:
>
> /cm/shared/apps/cuda91/toolkit/9.1.85/targets/x86_64-linux/include /gpfs/share/apps/tiff/3.9.7/include /gpfs/share/apps/openmpi/slurm18_cuda90_3.1.0/include /cm/shared/apps/cuda91/sdk/9.1.85/common/inc
> [gn-0003:97890] mca_base_component_repository_open: unable to open mca_pmix_s1: libslurm.so.32: cannot open shared object file: No such file or directory (ignored)
> [gn-0003:97891] mca_base_component_repository_open: unable to open mca_pmix_s1: libslurm.so.32: cannot open shared object file: No such file or directory (ignored)
> [gn-0003:97890] OPAL ERROR: Not initialized in file pmix2x_client.c at line 109
> [gn-0003:97891] OPAL ERROR: Not initialized in file pmix2x_client.c at line 109
> --------------------------------------------------------------------------
> The application appears to have been direct launched using "srun",
> but OMPI was not built with SLURM's PMI support and therefore cannot
> execute. There are several options for building PMI support under
> SLURM, depending upon the SLURM version you are using:
>
> version 16.05 or later: you can use SLURM's PMIx support. This
> requires that you configure and build SLURM --with-pmix.
>
> Versions earlier than 16.05: you must use either SLURM's PMI-1 or
> PMI-2 support. SLURM builds PMI-1 by default, or you can manually
> install PMI-2. You must then build Open MPI using --with-pmi pointing
> to the SLURM PMI library location.
>
> Please configure as appropriate and try again.
> --------------------------------------------------------------------------
> *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> *** and potentially your MPI job)
> [gn-0003:97890] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
> --------------------------------------------------------------------------
> The application appears to have been direct launched using "srun",
> but OMPI was not built with SLURM's PMI support and therefore cannot
> execute. There are several options for building PMI support under
> SLURM, depending upon the SLURM version you are using:
>
> version 16.05 or later: you can use SLURM's PMIx support. This
> requires that you configure and build SLURM --with-pmix.
>
> Versions earlier than 16.05: you must use either SLURM's PMI-1 or
> PMI-2 support. SLURM builds PMI-1 by default, or you can manually
> install PMI-2. You must then build Open MPI using --with-pmi pointing
> to the SLURM PMI library location.
>
> Please configure as appropriate and try again.
> --------------------------------------------------------------------------
> *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> *** and potentially your MPI job)
>
> That seems to be specific to our slurm version and we are working on it but I am wondering if there is an option in relion compilation that can bypass dependency to libpmi on slurm (coming from openmpi).
> Thanks
>
> ########################################################################
>
> To unsubscribe from the CCPEM list, click the following link:
> https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCPEM&A=1
>