Hi Kostas, if can help, in the Italian Grid we found that MPI didn't work for torque if the CE GRIS published GlueCEInfoLRMSType=torque as is in your case for the alexander.it.uom.gr CE. After putting GlueCEInfoLRMSType=pbs our MPI implementation (http://grid-it.cnaf.infn.it/index.php?mpihowto&type=1) worked. best regards, Marco Kostas Georgakopoulos wrote: > Hi all, > > i configured our site (GR-02-UoM) for mpi support following the > instructions in > http://goc.grid.sinica.edu.tw/gocwiki/MPI_Support_with_Torque > (torque is the job manager for us) and it seems that everything is ok. > However i tried executing the test job from > http://quattor.web.lal.in2p3.fr/packages/mpi > and the job get stuck in one of the workers till the proxy certificate > expires. The command used to submit the job was: > > edg-job-submit --vo dteam --lrms pbs -r > alexander.it.uom.gr:2119/jobmanager-lcgpbs-dteam MPItest.jdl > > has anyone have any idea what the problem might be? (i include the > files below). > > Best regards > Kostas Georgakopoulos > University of Macedonia > > MPItest.jdl: > > Type = "Job"; > JobType = "MPICH"; > NodeNumber = 8; > Executable = "MPItest.sh"; > Arguments = "MPItest"; > StdOutput = "test.out"; > StdError = "test.err"; > InputSandbox = {"MPItest.sh","MPItest.c"}; > OutputSandbox = {"test.err","test.out","mpiexec.out"}; > > MPItest.sh: > > #!/bin/sh -x > > # the binary to execute > EXE=$1 > > echo > "***********************************************************************" > echo "Running on: $HOSTNAME" > echo "As: " `whoami` > echo > "***********************************************************************" > > echo > "***********************************************************************" > echo "Compiling binary: $EXE" > echo mpicc -o ${EXE} ${EXE}.c > mpicc -o ${EXE} ${EXE}.c > echo "*************************************" > > if [ "x$PBS_NODEFILE" != "x" ] ; then > echo "PBS Nodefile: $PBS_NODEFILE" > HOST_NODEFILE=$PBS_NODEFILE > fi > > if [ "x$LSB_HOSTS" != "x" ] ; then > echo "LSF Hosts: $LSB_HOSTS" > HOST_NODEFILE=`pwd`/lsf_nodefile.$$ > for host in ${LSB_HOSTS} > do > echo $host >> ${HOST_NODEFILE} > done > fi > > if [ "x$HOST_NODEFILE" = "x" ]; then > echo "No hosts file defined. Exiting..." > exit > fi > > echo > "***********************************************************************" > CPU_NEEDED=`cat $HOST_NODEFILE | wc -l` > echo "Node count: $CPU_NEEDED" > echo "Nodes in $HOST_NODEFILE: " > cat $HOST_NODEFILE > echo > "***********************************************************************" > > echo > "***********************************************************************" > CPU_NEEDED=`cat $HOST_NODEFILE | wc -l` > echo "Checking ssh for each node:" > NODES=`cat $HOST_NODEFILE` > for host in ${NODES} > do > echo "Checking $host..." > ssh $host hostname > done > echo > "***********************************************************************" > > echo > "***********************************************************************" > echo "Executing $EXE with mpiexec" > chmod 755 $EXE > mpiexec `pwd`/$EXE > mpiexec.out 2>&1 > echo > "***********************************************************************" > > echo > "***********************************************************************" > echo "Executing $EXE with mpirun" > chmod 755 $EXE > mpirun -np $CPU_NEEDED -machinefile $HOST_NODEFILE `pwd`/$EXE > echo > "***********************************************************************" > > MPItest.c: > > /* hello.c > * > * Simple "Hello World" program in MPI. > * > */ > > #include "mpi.h" > #include <stdio.h> > int main(int argc, char *argv[]) > { > int numprocs; /* Number of processors */ > int procnum; /* Processor number */ > /* Initialize MPI */ > MPI_Init(&argc, &argv); > /* Find this processor number */ > MPI_Comm_rank(MPI_COMM_WORLD, &procnum); > /* Find the number of processors */ > MPI_Comm_size(MPI_COMM_WORLD, &numprocs); > printf ("Hello world! from processor %d out of %d\n", procnum, numprocs); > /* Shut down MPI */ > MPI_Finalize(); > return 0; > } -- ------------- Marco Verlato Istituto Nazionale di Fisica Nucleare - Sez. di Padova Via Marzolo 8 - 35131 Padova - ITALY Phone +39 049 827 7165, Fax +39 049 827 7102, Email: [log in to unmask]