On Fri, 30 Sep 2005, Steve Traylen wrote:
> On Fri, Sep 30, 2005 at 03:25:36PM +0200 or thereabouts, Ahmed Beriache wrote:
> > Hi all,
> >
> > We have the following problem at CGG-LCG2: when jobs containing
> > lcg-xx commands arrive on our Worker Nodes, some of the lcg-xx commands
> > complete successfully, but very often they hang for a very long time,
> > until the job proxy expires or the job is deleted by the LRMS. But when we
> > log on to the worker node where an lcg command is hanging and rerun it
> > manually, it works correctly and the job continues running (until the
> > next lcg command).
> >
> > We tried to unset the GLOBUS_TCP_PORT_RANGE variable on the worker nodes, but
> > this did not help.
>
> Hi Ahmed,
>
> Are you sure you actually did unset the GLOBUS_TCP_PORT_RANGE?
> If you just unset it in the profile on the WNs this is not enough
> since the profile is carried from the CE and then extended with
> the WN's profile.
>
> Run a real job and check your env. Also find one of the hung lcg- commands
> and look in /proc/<pid>/environ to check it really is unset.
>
> To really unset it we explicitly unset it in the lcgpbs.in job manager
> script.
Or rather create files like these:
-----------------------------------------------------------------------------
$ cat /etc/profile.d/unset_port_range.csh
unsetenv GLOBUS_TCP_PORT_RANGE
$ cat /etc/profile.d/unset_port_range.sh
unset GLOBUS_TCP_PORT_RANGE
-----------------------------------------------------------------------------
YAIM could be changed to do that optionally.
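
To confirm that the variable really is gone from a hung lcg- command, the
/proc/<pid>/environ check mentioned above can be scripted. This is only a
minimal sketch: the function names env_has_var and check_lcg_procs are
made up for illustration, and it assumes that pgrep is installed on the WN
and that the hung commands have process names starting with "lcg-".

```shell
#!/bin/sh
# Hypothetical helper (not part of LCG or YAIM): succeed if variable NAME
# is present in a NUL-separated environment file such as /proc/<pid>/environ.
#   $1 = path to the environ file
#   $2 = variable name to look for
env_has_var() {
    # /proc/<pid>/environ separates entries with NUL bytes; turn them into
    # newlines so grep can anchor on the start of each entry.
    tr '\0' '\n' < "$1" | grep -q "^$2="
}

# Hypothetical wrapper: report the state of GLOBUS_TCP_PORT_RANGE for every
# process whose name starts with "lcg-" (an assumption about the naming).
check_lcg_procs() {
    for pid in $(pgrep '^lcg-' 2>/dev/null); do
        if env_has_var "/proc/$pid/environ" GLOBUS_TCP_PORT_RANGE; then
            echo "pid $pid: GLOBUS_TCP_PORT_RANGE is still set"
        else
            echo "pid $pid: GLOBUS_TCP_PORT_RANGE is unset"
        fi
    done
}
```

Run check_lcg_procs as root (or as the job's user) on a WN while a job is
hanging; reading another user's /proc/<pid>/environ needs the right
permissions.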