Hi all,
For the past couple of days our CE log has been full of these error messages:
# grep -c "FAILED during submission to batch system lcgpbs" /var/log/messages
1338
I'm looking for any guide that could give some clues about what's
happening, because this error does not always happen, nor with the same
kind of user... so it seems a "little" spurious (if 1338 errors can be
considered spurious :-) ).
The CEs are not under heavy load, and neither is the torque/maui server.
Our queues are full of jobs (about 1500), but we can submit
(locally) many more.
Any clue on this? Should we start playing with the
globus-job-manager-marshal config params?
TIA,
Arnau