On Wed, 2003-12-17 at 04:52, Daniels, T (Trevor) wrote:
> This will be the last report this year. All times are UTC (GMT).
>
> BNL
> Via Globus:
> GRAM Job submission failed because the connection to the server failed
> (check host and port) (error code 12)
> Last successful globus submission was at 02:00 on 16 Dec
> Via CERNRB:
> Current Status: Aborted
> Status Reason: Cannot plan: BrokerHelper: no compatible resources
> Last successful RB submission was at 22:00 on 9 Dec
As I have said before, this is because the grid-monitor or
globus-job-manager scripts appear to get stuck and may not be
returning the job results. They are also not being cleaned up, which
causes the CE to run out of memory several hours after it is booted. I
do not know what is causing the problem. Someone suggested that it
might be a known Condor-G problem, but that doesn't explain why BNL is
failing the GOC CERN RB monitoring while other sites succeed.
NOTE: The last time a CERN RB submission succeeded was on Dec 9th, the
same day that I upgraded to LCG1-1_1_3. Coincidence?
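
For anyone who wants to check their own CE for the stuck processes
described above, here is a rough sketch (not something that has been run
on the BNL CE) of how leftover grid-monitor or globus-job-manager
processes could be spotted from `ps -eo pid,etime,rss,args` output. The
function names, the substring match on the command line, and the 6 hour
age threshold are all assumptions made for illustration.

```python
import subprocess

# Process names mentioned above; matching by substring is an assumption.
SUSPECT_NAMES = ("grid-monitor", "globus-job-manager")


def parse_etime(etime):
    """Convert the ps 'etime' format ([[dd-]hh:]mm:ss) to seconds."""
    days = 0
    if "-" in etime:
        day_part, etime = etime.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in etime.split(":")]
    while len(parts) < 3:
        parts.insert(0, 0)  # pad missing hours (and minutes) with zero
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def find_stale(ps_output, names=SUSPECT_NAMES, max_age_seconds=6 * 3600):
    """Return (pid, age_seconds, rss_kb, command) for old matching processes.

    The 6 hour default is an arbitrary threshold, not a measured value.
    """
    stale = []
    for line in ps_output.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 4 or not fields[0].isdigit():
            continue  # skip the header line and anything malformed
        pid, etime, rss, args = fields
        age = parse_etime(etime)
        if age > max_age_seconds and any(name in args for name in names):
            stale.append((int(pid), age, int(rss), args))
    return stale


def list_stale_now():
    """Run ps on the local machine and report stale suspect processes."""
    result = subprocess.run(
        ["ps", "-eo", "pid,etime,rss,args"],
        capture_output=True, text=True, check=True,
    )
    return find_stale(result.stdout)
```

Calling `list_stale_now()` on the CE only reports candidates; it does not
kill anything, so whether a listed process is really stuck still has to
be judged by hand.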
~Jason
--
/------------------------------------------------------------------\
| Jason A. Smith Email: [log in to unmask] |
| Atlas Computing Facility, Bldg. 510M Phone: (631)344-4226 |
| Brookhaven National Lab, P.O. Box 5000 Fax: (631)344-7616 |
| Upton, NY 11973-5000 |
\------------------------------------------------------------------/