Hi,
The problem appears to have been caused by a recent fetch-crl update
that removed a cron job. This has been corrected, and lcgce02 is now
passing the SAM tests.
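
For anyone who hits the same symptom, here is a minimal sketch of a check for CRL files that have stopped being refreshed. It is not the fix applied here, only an illustration. It assumes the CRLs are stored as *.r0 files under /etc/grid-security/certificates (the usual fetch-crl output directory). The 24-hour threshold and the function name stale_crls are my own assumptions.

```python
import os
import time
from pathlib import Path


def stale_crls(cert_dir, max_age_hours=24.0, now=None):
    """Return the sorted names of *.r0 files older than max_age_hours.

    cert_dir      -- directory holding the CRL files (assumed layout)
    max_age_hours -- age threshold; 24 h is an illustrative default
    now           -- reference time in seconds since the epoch (for testing)
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    return sorted(
        p.name
        for p in Path(cert_dir).glob("*.r0")
        if p.stat().st_mtime < cutoff
    )


if __name__ == "__main__":
    # Assumed default location; adjust for the local installation.
    cert_dir = "/etc/grid-security/certificates"
    if os.path.isdir(cert_dir):
        for name in stale_crls(cert_dir):
            print("stale CRL:", name)
```

If this lists many files on a CE, it is worth checking whether the fetch-crl cron job is still installed and running.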
Cheers,
Catalin
> -----Original Message-----
> From: Testbed Support for GridPP member institutes [mailto:TB-
> [log in to unmask]] On Behalf Of [log in to unmask]
> Sent: 17 February 2010 09:17
> To: [log in to unmask]
> Subject: SAM test issues with lcg-CE at RAL
>
> Hi,
>
> We are experiencing some problems with an lcg-CE at RAL Tier1. It
> appears that jobs coming from WMSes outside RAL are never terminated.
> The errors are like the one below:
>
> *************************************************************
> BOOKKEEPING INFORMATION:
> Status info for the Job : https://wms208.cern.ch:9000/kQY1kKjUIacCjeuCgiMQfA
> Current Status: Aborted
> Logged Reason(s):
> - File not available.Cannot read JobWrapper output, both from Condor and from Maradona.
> - File not available.Cannot read JobWrapper output, both from Condor and from Maradona.
> Status Reason: hit job shallow retry count (1)
> Destination: lcgce02.gridpp.rl.ac.uk:2119/jobmanager-lcgpbs-grid1000M
> Submitted: Wed Feb 17 08:35:51 2010 CET
> *************************************************************
>
> Jobs submitted via the local WMS appear to be OK, so we suspect some
> miscommunication between the external WMSes and the CE or WNs.
>
> In Derek's absence I am trying to solve this problem, so I am asking
> for any hints from this list. One option would be to restart the
> services on (or reboot) that machine...
>
> Many thanks,
> Catalin Condurache
> RAL Tier1 Grid Services