Hi Ricardo,
Ricardo Graciani wrote:
> PIC: Large number of aborted jobs (tonight) probably due to a
> single WN.
Having had a first look at our PBS logs, we see that 87 lhcb jobs finished
between 00:00 today and now, all of them reporting exit_status=0.
Looking at yesterday's logs, we see that between ~12:00 and 12:30 about
50 lhcb jobs exited with exit_status=271, so it looks as if they were
deleted by you.
All of these jobs ran on different WNs, so at first sight there is
nothing pointing to a problematic WN causing LHCb jobs to abort.
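
For reference, the kind of check described above can be sketched roughly as
follows. This is only an illustration, not the procedure actually used at PIC:
the sample records, host names and job ids are made up, the field layout is
assumed to follow the usual Torque/PBS accounting "E" (job end) records, and
the reading of values above 256 as "256 + signal number" is assumed to hold
for the batch system in question (under that reading, 271 would be signal 15,
SIGTERM, which is what a job deletion normally sends).

```python
from collections import Counter, defaultdict

# Hypothetical sample records, loosely modelled on Torque/PBS accounting
# "E" (job end) lines; real records carry many more fields.
SAMPLE_LOG = """\
03/01/2006 12:05:10;E;101.ce.example;user=lhcb001 group=lhcb exec_host=wn01/0 Exit_status=271
03/01/2006 12:10:42;E;102.ce.example;user=lhcb001 group=lhcb exec_host=wn02/0 Exit_status=271
03/02/2006 00:15:03;E;103.ce.example;user=lhcb002 group=lhcb exec_host=wn01/0 Exit_status=0
03/02/2006 00:20:00;E;104.ce.example;user=atlas001 group=atlas exec_host=wn03/0 Exit_status=1
"""


def summarize(log_text, group="lhcb"):
    """Count finished jobs per exit status and collect the WNs they ran on."""
    by_status = Counter()
    nodes_by_status = defaultdict(set)
    for line in log_text.splitlines():
        parts = line.split(";", 3)
        # Keep only well-formed job-end records.
        if len(parts) != 4 or parts[1] != "E":
            continue
        # Field names are lower-cased so "Exit_status" and "exit_status" both match.
        fields = {}
        for item in parts[3].split():
            if "=" in item:
                key, value = item.split("=", 1)
                fields[key.lower()] = value
        if fields.get("group") != group or "exit_status" not in fields:
            continue
        status = int(fields["exit_status"])
        node = fields.get("exec_host", "").split("/")[0]
        by_status[status] += 1
        nodes_by_status[status].add(node)
    return by_status, nodes_by_status


def describe(status):
    """Assumed convention: values above 256 mean 'killed by signal (status - 256)'."""
    if status > 256:
        return f"killed by signal {status - 256}"
    return "exited normally" if status == 0 else "non-zero exit"


if __name__ == "__main__":
    counts, nodes = summarize(SAMPLE_LOG)
    for status, n in sorted(counts.items()):
        print(status, n, describe(status), sorted(nodes[status]))
```

If all jobs with a given exit status share a single entry in the node set, that
would point to one WN; if the set is spread over many nodes, as in our case, it
does not.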
It would be useful for us if you could give us the JM ids for some of
the jobs that failed at PIC, so that we could follow their trace.
Thanks a lot,
Gonzalo
--
Gonzalo Merino
Institut de Física d'Altes Energies (UAB)
08193 Bellaterra (Barcelona) SPAIN
Tel: +34 93 5813322 / Fax: +34 93 5814110