Maarten, if I may, I have to ask some additional questions.
As far as I can see from "submit-helper.pl", it requires globus-url-copy at
the path "GLOBUS_LOCATION"/bin/globus-url-copy; otherwise it exits! Having
globus-url-copy on all of the WNs is perfectly OK for us. But I always
thought that if I have home sharing I shouldn't be forced to use
globus-url-copy for output transfer from WN to CE.
In my case the JM always uses submit-helper to execute a job, but how about
home sharing? Is our configuration wrong, or is submit-helper always used
to precede a job in any case?
Actually, your last mail has confirmed my suspicion about the cause of our
(GSI's) problem: not all of the WNs of our farm have globus-url-copy at
that path. This causes "submit-helper.pl" to abort, which results in the
error "Cannot read JobWrapper output...".
Certainly we are going to fix this problem and install globus-url-copy on
all of the machines, but my point is that I am not really willing (let's say
I don't see the point) to allow WNs to send output DIRECTLY to the RB; for
this I have a shared home and I have a CE, I think.
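For reference, a minimal sketch of how the presence of globus-url-copy could
be checked on a WN. It assumes only the path mentioned above; the function
name check_guc is hypothetical and is not part of submit-helper.pl or any
LCG tool:

```shell
#!/bin/sh
# Hypothetical helper: report whether globus-url-copy is present and
# executable under a Globus installation directory.
# The directory defaults to $GLOBUS_LOCATION when no argument is given.
check_guc() {
    loc="${1:-${GLOBUS_LOCATION:-}}"
    if [ -z "$loc" ]; then
        echo "GLOBUS_LOCATION is not set"
        return 2
    fi
    if [ -x "$loc/bin/globus-url-copy" ]; then
        echo "OK: $loc/bin/globus-url-copy"
        return 0
    fi
    echo "MISSING: $loc/bin/globus-url-copy"
    return 1
}
```

Calling check_guc without arguments on each WN would show which nodes lack
the binary at the expected path.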
Maybe I misunderstood something. Please, clarify the situation for me...
1 - Can I use home sharing with LCG 2.3 so that the WN will copy the output
to the CE and then the CE will send it to the RB?
2 - Is "submit-helper.pl" used in any case, with and without home sharing?
Thank you very much in advance!
Cheers,
Anar
> -----Original Message-----
> From: Anar Manafov [mailto:[log in to unmask]]
> Sent: Sunday, February 06, 2005 1:12 PM
> To: [log in to unmask]
> Cc: [log in to unmask]; [log in to unmask]
> Subject: RE: [LCG-ROLLOUT] Cannot read JobWrapper output +
>
> Good day, Maarten!
>
> Thank you very much for your response. First of all I want to say that
> your suggestions have always been VERY useful for me! And I always
> appreciate getting your comments and suggestions!!!
>
>
> Indeed, I came across this page:
> http://goc.grid.sinica.edu.tw/gocwiki/Cannot_read_JobWrapper_output%2e%2e%2e
>
> But didn't find anything suitable for my case :(
>
> > Of course it is possible that the job did finish, but then it must mean
> > that:
> >
> > 1. the WN could not do a globus-url-copy to the RB, *and*
> >
> > 2. Globus could not send back the job wrapper stdout, e.g. because it
> > was not copied back from the WN to the CE, or because globus-url-copy
> > does not work from the CE to the RB.
> >
> > This combined set of problems still can have a single cause: there can
> > be a firewall limiting outgoing connections (to ports 20000-25000),
> > some CRLs can be out of date both on CE and WN, some CA files could be
> > absent altogether, the time (zone) on CE and WN can be wrong, ...
> [Anar Manafov]
> 1 - I also suspect it is an output problem (this I can say after debugging
> the full chain of the job submission process), and globus-url-copy could
> be the reason. I will recheck once more. It seems something is seriously
> wrong with gridftp, but what exactly, I can't understand.
> 2 - This is probably our case. We have home sharing, so hopefully there
> should be no problem to copy stdout back to CE. :) BUT! globus-url-copy
> could be a problem!
>
> As far as I know our FW is not filtering outgoing connections. But I will
> certainly recheck that for the LSF nodes.
> Also I will recheck CA files and globus-url-copy functionality on WN and
> CE.
> Time synchronization is checked and it is OK.
>
> I also saw this page:
> http://goc.grid.sinica.edu.tw/gocwiki/submit-helper_script_%2e%2e%2e_gave_error%3a_cache_export_dir_%2e%2e%2e
> It shouldn't be our case, because we have home sharing... right?
>
>
> Thank you very much Maarten!
>
> Cheers,
> Anar