On Wed, 23 Mar 2005, Maarten Litmaath, CERN wrote:
> On Tue, 22 Mar 2005, Sotomayor, Maniel wrote:
>
> > Hello,
> >
> > I'm having problems after submitting jobs to my cluster. The jobs
> > execute successfully through qsub after installing MPICH, but I get
> > errors when reading the jobwrapper output. I checked the gocwiki pages
> > that cover this, but they have not helped me solve it yet. I'm attaching
> > the logging info output. Can you help me solve this?
>
> I verified that all your WNs can globus-url-copy to CE and RB, and that they
> can scp to the CE. A plain "qsub job-script" works, but I got unexpected
> complaints when trying this:
>
> ------------------------------------------------------------------------------
> $ globus-job-run ce.prd.hp.com /usr/bin/qsub -q dteam /tmp/foo.sh.1
> 960.ce.prd.hp.com
> Can't open -q: No such file or directory at /var/spool/pbs/submit_filter.pl
> line 13.
> Can't open dteam: No such file or directory at /var/spool/pbs/submit_filter.pl
> line 13.
> ------------------------------------------------------------------------------
Uh oh, /var/spool/pbs/submit_filter.pl is something new that seems to come only
with Cal's Torque release:
-----------------------------------------------------------------------------
$ globus-job-run ce.prd.hp.com /bin/rpm -qf /var/spool/pbs/submit_filter.pl
file /var/spool/pbs/submit_filter.pl is not owned by any package
$ globus-job-run ce.prd.hp.com /bin/rpm -qf /usr/bin/qsub
torque-clients-1.2.0p1-5.sl3.cl
-----------------------------------------------------------------------------
Either it is misconfigured, or it has broken one valid way to submit jobs...
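The "Can't open -q" wording looks like Perl's diagnostic for the bare <>
operator, which treats every command-line argument as an input file name.
That would fit a filter that receives the qsub options as arguments and the
job script on stdin, but I have not read submit_filter.pl, so take it as a
guess. A minimal sketch of a pass-through filter under that assumption (the
function name is made up for illustration), reading only stdin and ignoring
its arguments:

```shell
#!/bin/sh
# Hypothetical pass-through submit filter (illustration only, not the
# script installed on the CE). Assumption: qsub passes its own options
# as arguments and the job script on stdin.
pass_through_filter() {
    # Deliberately ignore "$@" (e.g. -q dteam) so the options are never
    # opened as files; copy the job script from stdin to stdout unchanged.
    cat
}
```

With this, something like "pass_through_filter -q dteam < /tmp/foo.sh.1"
should echo the job script back unchanged instead of complaining about -q
and dteam.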
> Furthermore, this also fails:
>
> ------------------------------------------------------------------------------
> $ globus-job-run ce.prd.hp.com/jobmanager-pbs -q short /bin/hostname
> Scientific Linux CERN Release 3.0.4 (SL)
> Permission denied, please try again.
> Permission denied, please try again.
> Permission denied (publickey,password).
> ------------------------------------------------------------------------------
>
> In the latter case it is as if the CE tried to ssh into the WN, which should
> not be needed for job submission.
>
> It appears the batch system is not configured correctly. Is there a Torque
> expert who can shed some light here? We will look further...
>