hi,
the allocation can depend on the mpi implementation you use (openmpi,
mvapich2, ...) and on the parameters you pass to mpirun/mpiexec, e.g.
--map-by node.
for openmpi, here are some hints to find out whether the problem is on
the gridengine side or in your relion job submission template:
openmpi must be configured with the --with-sge flag when it is built.
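a quick way to check the --with-sge point: ompi_info lists a gridengine
component when sge support was compiled in. below is a minimal sketch.
the helper name has_sge_support is made up for illustration, and it
reads the ompi_info output from stdin so you can try it anywhere:

```shell
#!/bin/sh
# has_sge_support: hypothetical helper (not part of openmpi or sge).
# reads `ompi_info` output on stdin and succeeds (exit 0) if a
# gridengine component is listed, fails (exit 1) otherwise.
has_sge_support() {
    grep -qi 'gridengine'
}

# typical use on a node where openmpi is installed:
#   ompi_info | has_sge_support && echo "sge support: yes" || echo "sge support: no"
```

if the answer is "no", openmpi most likely has to be rebuilt with
--with-sge before the tight integration with gridengine can work.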
to verify the gridengine settings, look at the parallel environment,
e.g. if yours is called orte (open run-time environment):
qconf -sp orte
:
allocation_rule $round_robin
control_slaves TRUE
job_is_first_task FALSE
:
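if the qconf output is long, the three settings above can be pulled out
with a small filter. this is only a sketch: pe_settings is a
hypothetical helper name, and it reads the `qconf -sp` output from stdin:

```shell
#!/bin/sh
# pe_settings: hypothetical helper (not an sge command).
# reads `qconf -sp <pe_name>` output on stdin and prints only the
# settings relevant here, as name=value pairs.
pe_settings() {
    awk '$1 == "allocation_rule" ||
         $1 == "control_slaves"  ||
         $1 == "job_is_first_task" { print $1 "=" $2 }'
}

# typical use on the cluster (pe name orte is the example from above):
#   qconf -sp orte | pe_settings
```

the allocation_rule is the interesting one for your question, since it
decides how the slots of a job are spread over the hosts.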
then you can add the following lines to your relion script template (or
a similar mpi test script), assuming relion threads = 1:
#$ -pe orte 150-220
or
#$ -pe orte XXXmpinodesXXX
:
cat ${PE_HOSTFILE}
echo ${NSLOTS}
mpiexec -n ${NSLOTS} XXXcommandXXX
NSLOTS should be 208 (26 nodes x 8 cores) if your cluster is empty, i.e.
all nodes can be assigned completely.
the PE_HOSTFILE lists the assigned hosts and the number of cores granted
on each host.
in your case it should contain 26 lines with 8 cores each, like:
node1 8 openmpi.q@node1 UNDEFINED
node2 8 openmpi.q@node2 UNDEFINED
:
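instead of counting the lines by eye, the hostfile can be summarized
inside the job script. a minimal sketch, assuming the four-column
format shown above; summarize_hostfile is a made-up helper name:

```shell
#!/bin/sh
# summarize_hostfile: hypothetical helper (not an sge command).
# takes the path of a file in PE_HOSTFILE format
# (host, slots, queue instance, processor range) and prints
# the number of hosts and the sum of the slots column.
summarize_hostfile() {
    awk '{ hosts++; slots += $2 }
         END { printf "hosts=%d slots=%d\n", hosts, slots }' "$1"
}

# typical use inside the job script:
#   summarize_hostfile "${PE_HOSTFILE}"
```

for a full allocation on your cluster this should report hosts=26
slots=208, and the slots value should match ${NSLOTS}.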
hope this helps.
cheers,
wolfgang
On 07/13/2015 01:48 PM, David Bhella wrote:
> Hi,
>
> I wanted to ask a simple question about the parallelisation of tasks in Relion. Briefly, it appears that Relion requires one node to be completely idle for every MPI task. This would seem to be incompatible with our queuing system. SGE is currently set to distribute tasks evenly across all nodes. We have 26 nodes, each with 8 cores. A colleague has just submitted a very long job that is divided into 80 tasks. These have been assigned to all 26 nodes, so fewer than half of each node’s cores are occupied; however, my Relion 2D classification run won’t start, as the queue is showing that there are insufficient resources.
>
> I presume that I have something misconfigured, but I am not sure if I need to be looking at the SGE configuration or Relion.
>
> Any advice gratefully received.
>
> Thanks,
> D.
>
> Dr David Bhella
> MRC-University of Glasgow Centre for Virus Research
> Garscube Campus
> 464 Bearsden Road
> Glasgow G61 1QH
> Scotland (UK)
>
> Telephone: 0141-330-3685
> Skype: d.bhella
>
> Virus structure group on Facebook: https://www.facebook.com/CVRstructure
> Molecular Machines - Images from Virus Research: http://www.molecularmachines.org.uk
>
> CVR website: http://www.cvr.ac.uk
> CVR on Facebook: https://www.facebook.com/centreforvirusresearch
--
Universitätsklinikum Hamburg-Eppendorf (UKE)
@ Centre for Structural Systems Biology (CSSB)
@ Institute of Molecular Biotechnology (IMBA)
Dr. Bohr-Gasse 3-7 (Room 6.14)
1030 Vienna, Austria
Tel.: +43 (1) 790 44-4649
http://www.cssb-hamburg.de/