Hello Christoph,
> > Did you modify any other parameters besides the debug flag?
> >
> > Does the problem occur for all users or a subset, e.g. "power" users?
> >
> We set the gma parameters a bit more aggressively on one CE:
>
> fileage 43200
> stateage 300
> tout 60
> tick 120
> debug 1
> # debug 0
> logf /var/log/globus/globus-gma.log
>
> On the other CE we run default values, but observed the problem on both
> systems so far.
Can you try setting some parameters to _less_ aggressive values,
according to the advice given here:
https://twiki.cern.ch/twiki/bin/view/EGEE/LcgCE
For example:
stateage 600
tout 120
tick 300
Or go even further:
stateage 600
tout 600
tick 600
This way the batch system query load should go down considerably.
Of course it means the WMS nodes will be updated much less often,
but that should not hurt normal jobs. The idea is that the CE is
better protected when the update frequency is lower.
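As a rough back-of-the-envelope illustration of the effect, here is a
small Python sketch. It assumes that each "tick" interval triggers one
query cycle against the batch system; that is my simplification for the
sake of the estimate, not a documented description of globus-gma. The
function name and labels are made up for the example.

```python
# Rough estimate only. ASSUMPTION: one batch-system query cycle per "tick"
# interval; this is a simplification, not documented globus-gma behaviour.

def cycles_per_hour(tick_seconds: int) -> float:
    """Return the number of query cycles per hour for a given tick (seconds)."""
    return 3600 / tick_seconds

# tick values taken from this thread
settings = {
    "current on one CE": 120,
    "less aggressive": 300,
    "even further": 600,
}

for label, tick in settings.items():
    print(f"tick {tick:>3} s ({label}): {cycles_per_hour(tick):.0f} cycles/hour")
```

Under that assumption, going from tick 120 to tick 600 cuts the number
of query cycles by a factor of five.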
> Concerning your second question I have no firm answer, but my feeling
> is that power users are more affected, since SAM tests and our local
> tests usually start, although with quite some delay.
Power users cannot expect frequent updates on their many jobs:
only throughput and stability should matter for batch activity.
In general the LCG-CE is not suited for real-time jobs.