Hi, Vangelis,
Regarding your mail:
1) MPI and MPICH are different; which one do you want to support? Beowulf
normally goes with "MPI".
2) I am not sure exactly what you mean by "have been publishing the MPICH tag
for HG-01-GRNET".
Do you mean you just set up the RunEnvironment of the CE? The following need
to be checked:
a) All worker nodes need to share the CE's /home directory.
b) Set up "PASSWORDLESS" ssh login between the CE and each WN, _AND_
between WN and WN (this is very important for running an MPI job on your
site).
c) Start your PBS Jobmanager.
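
For point b), a rough way to check the passwordless logins is sketched below in Python. It is only an illustration, not an LCG tool: the function names and host names are made up, and it assumes you run it from a machine that can itself log in to every host without a password. It tries every ordered pair of hosts, which covers CE to WN and WN to WN (and also WN to CE, which is more than strictly asked for above). BatchMode=yes makes ssh fail instead of prompting, so a pair that still needs a password, or whose host key has not been accepted yet, is reported as failed.

```python
import itertools
import subprocess


def ssh_check_commands(hosts):
    """Build one nested ssh command per ordered (source, target) host pair."""
    commands = []
    for src, dst in itertools.permutations(hosts, 2):
        # Outer ssh: from the local machine to src.
        # Inner ssh: from src to dst, running the no-op command "true".
        # BatchMode=yes disables password prompts, so ssh exits non-zero
        # whenever key-based login is not working.
        commands.append([
            "ssh", "-o", "BatchMode=yes", src,
            "ssh", "-o", "BatchMode=yes", dst, "true",
        ])
    return commands


def failed_pairs(hosts, timeout=20):
    """Run every check and return the (source, target) pairs that failed."""
    failures = []
    for cmd in ssh_check_commands(hosts):
        src, dst = cmd[3], cmd[7]
        try:
            result = subprocess.run(cmd, capture_output=True, timeout=timeout)
            ok = result.returncode == 0
        except (subprocess.TimeoutExpired, OSError):
            # A hung connection or a missing ssh binary counts as a failure.
            ok = False
        if not ok:
            failures.append((src, dst))
    return failures
```

Calling failed_pairs() with your own CE and WN host names returns the pairs that still need fixing; an empty list means every login worked without a password.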
Regards,
Wei
U. of Cyprus
Vangelis Koukis wrote:
>Hello all,
>
>I have been experimenting with support for MPICH type jobs under LCG
>2.3.0. We have been publishing the MPICH tag for HG-01-GRNET, and
>job-list-match lists our CE in the candidates for execution, when a JDL
>with JobType="MPICH" is provided.
>
>However, job submission fails with:
>
>*************************************************************
>BOOKKEEPING INFORMATION:
>
>Status info for the Job : https://lxn1188.cern.ch:9000/3XP1kH4KzLDepHEbkgWhxg
>Current Status: Aborted
>Status Reason: Cannot plan: JobAdapterHelper: invalid value torque for attribute lrms_type (expecting lsf or pbs)
>reached on: Tue Jan 18 13:56:47 2005
>*************************************************************
>
>which seems to be a result of LCG 2.3.0 using Torque instead of PBS.
>The error message stays the same, when trying to execute the same job on
>other CEs advertising MPICH execution capability (by specifying them
>explicitly in the JDL).
>
>Also, trying to compare the available options for integrating MPICH
>support with PBS/Torque, I came across the following link:
>
>http://www.beowulf.org/archive/2005-January/011535.html
>
>which essentially describes mpiexec as a much better alternative compared
>to mpirun for spawning application instances across worker nodes managed
>by PBS. It uses PBS directly to start them, instead of rsh/ssh, thus
>allowing for better monitoring and resource accounting. Does anyone have
>experience with that kind of configuration?
>
>Thanks in advance.
>
>
>
>------------------------------------------------------------------------
>
>_______________________________________________
>Egee-sa1-tech mailing list
>[log in to unmask]
>https://mailman2.grnet.gr/mailman/listinfo/egee-sa1-tech
>
--
============================================================
Wei Xing, M.Sc.
Research Associate Tel: 00357-22892663
Dept. of Computer Science Fax: 00357-22892701
University of Cyprus email: [log in to unmask]
PO Box 20537
CY1678, Nicosia, CYPRUS