Hello,

I had this problem when I tried the "existing farm" method a while ago. I
didn't manage to solve it before I went on holiday, and I haven't had time
to go back to it since then, but I'll tell you what I did find out in my
case, just in case it helps.

I did install globus-url-copy on the WN (I also tried the ssh composite
route but couldn't make it work). The globus-url-copy process sat there
for a long time, retrying several times before giving up, which resulted
in the "Failure while executing job wrapper" message.
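
For anyone who wants to rule out the simple PATH problem first, here is a
minimal sketch of a check that could be run in the same environment as the
job wrapper. It assumes Python 3 is available on the WN, and the helper name
missing_tools is made up for illustration; it only reports whether the
command can be found, not whether the transfer itself would succeed.

```python
import shutil


def missing_tools(names):
    """Return the commands from `names` that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # globus-url-copy is the command the job wrapper is said to need.
    for tool in missing_tools(["globus-url-copy"]):
        print("not on PATH:", tool)
```

If this prints nothing, the command is at least visible on the PATH of
whichever account ran the check, which may differ from the PATH the batch
system gives to a pool account.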

One odd thing I noticed was that the globus-url-copy process was owned by
user "3000", not "gridpp000", according to "ps", even though I had added
the pool accounts from the CE to /etc/passwd and /etc/shadow on the WN.
Do you see anything similar, and/or does anyone know why this might be,
and whether it could be relevant to the problem?
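
One way to narrow this down is to check whether UID 3000 really maps to a
name in the passwd data the WN sees. The sketch below is only illustrative:
the helper name_for_uid and the sample entry are hypothetical, and on a real
node you would feed it the contents of /etc/passwd instead. Some versions of
ps also fall back to printing the numeric UID when the login name is wider
than the user column, so the length of the name may be worth checking too.

```python
def name_for_uid(passwd_text, uid):
    """Return the login name mapped to `uid` in passwd-format text, or None."""
    for line in passwd_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        # passwd fields: name:password:UID:GID:gecos:home:shell
        if len(fields) >= 3 and fields[2] == str(uid):
            return fields[0]
    return None


# Hypothetical sample entry; the real pool account lines may differ.
sample = (
    "root:x:0:0:root:/root:/bin/sh\n"
    "gridpp000:x:3000:3000::/home/gridpp000:/bin/sh\n"
)

name = name_for_uid(sample, 3000)
if name is None:
    print("UID 3000 has no passwd entry")
else:
    # Report the name length so an over-long name is easy to spot.
    print("UID 3000 is", name, "with", len(name), "characters")
```

If the lookup succeeds against the WN's own passwd file, the numeric owner in
the "ps" output is probably only a display matter rather than a missing
account.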

(The problem I spotted before that was that the wrong home directories
were mounted on the WN, but you would probably have noticed that by now.)

Cheers,

Ben

On Fri, 6 Jun 2003, Rod Walker wrote:

> Hi,
> Could it be that the wrapper script needs to have globus-url-copy in
> the path in order to drag the input sandbox from the broker?
> In short, unless there's some other cleverness, the worker nodes at least
> need globus installed.
>
> There have been attempts to overload this command with some ssh composite
> that runs the globus-url-copy on the headnode, but I don't know if
> anyone got this to work.
>
> Cheers,
> Rod.
>
> D.Kant wrote:
>
> >Hi Steve,
> >
> >We were testing whether the QMUL EDG frontend can successfully submit jobs
> >to a private farm configured according to Andrew McNab's description given
> >at: http://www.gridpp.ac.uk/tb-support/existing/
> >
> >Local job submission using "globus-job-run" works fine.
> >
> >[kant@ui]globus-job-run hepbf4.ph.qmul.ac.uk:2119/jobmanager-pbs /bin/hostname
> >cn001.esc.qmul.ac.uk
> >
> >cn001 is a worker node on this farm running RH9.0 and is completely
> >standalone and has nothing in the way of EDG software installed on it.
> >
> >However, after successful job submission via a RB using the "dg-job-submit"
> >method, the job aborts:
> >
> >dg_JobId                =
> >https://gm03.hep.ph.ic.ac.uk:7846/138.37.50.249/14334867785261?gm03.hep.ph.ic.ac.uk:7771
> >Status                  =    Aborted
> >Last Update Time (UTC)  =    Fri Jun  6 15:36:53 2003
> >Job Destination         =    hepbf4.ph.qmul.ac.uk:2119/jobmanager-pbs-S
> >Status Reason           =    Failure while executing job wrapper.
> >Job Owner               =    /O=Grid/O=UKHEP/OU=ph.qmul.ac.uk/CN=D.Kant
> >Status Enter Time (UTC) =    Fri Jun  6 15:36:53 2003
> >*************************************************************
> >
> >Cheers, Dave.
> >
> >
> >
> >On Fri, 6 Jun 2003, Steve Traylen wrote:
> >
> >
> >
> >>It's kind of up. You don't actually need logging and bookkeeping to
> >>submit a job and get the output. That's interesting bookkeeping.
> >>
> >> Steve
> >>
> >>On Fri, 6 Jun 2003, Stephen Burke wrote:
> >>
> >>
> >>
> >>>On Fri, 6 Jun 2003, D.Kant wrote:
> >>>
> >>>
> >>>>  Is gm03 playing up??
> >>>>
> >>>>
> >>>You might like to check this web page for the status of the three
> >>>general-use brokers - this suggests that IC is the only one which is up,
> >>>but the last update was 50 minutes ago ...
> >>>
> >>>http://marianne.in2p3.fr/datagrid/WEB/RB_STATUS.html
> >>>
> >>>Stephen
> >>>
> >>>
> >>>
> >>--
> >>Steve Traylen
> >>[log in to unmask]
> >>http://www.gridpp.ac.uk/
> >>
> >>
> >>
> >
> >--
> >--------------------------------------------------------------
> >Department of Physics            | Dr Dave Kant
> >Queen Mary College               | TEL/FaX: +44 (0)20 7882 5054
> >Mile End Road  London  E1 4NS    | e-mail : [log in to unmask]
> >--------------------------------------------------------------
> >
> >
>

--
Dr Ben Waugh                                     Tel. +44 (0)20 7679 3783
Dept of Physics and Astronomy                    Internal: 33783
University College London
London WC1E 6BT