Hi,
I just received Nagios warnings for two of my CEs. The error reads:
CRITICAL: Job was aborted.
CRITICAL: Job was aborted.
Testing from: gridppnagios.lancs.ac.uk
DN: /C=UK/O=eScience/OU=Oxford/L=OeSC/CN=kashif mohammad/CN=proxy/CN=proxy/CN=proxy/CN=proxy
VOMS FQANs: /ops/Role=lcgadmin/Capability=NULL,
/ops/ROC/Role=NULL/Capability=NULL, /ops/Role=NULL/Capability=NULL
glite-wms-job-status https://lcglb01.gridpp.rl.ac.uk:9000/2CyS0-9e_toUlWW0rh4Y4A
======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:
Status info for the Job :
https://lcglb01.gridpp.rl.ac.uk:9000/2CyS0-9e_toUlWW0rh4Y4A
Current Status: Aborted
Logged Reason(s):
- Cannot take token; reason=1; Timeout waiting for server response.
Closing connection to service. Timeout waiting for server response.
Closing connection to service. Timeout waiting for server response.
Closing connection to service. Cannot take token
- Cannot take token; Timeout waiting for server response. Closing
connection to service. Timeout waiting for server response. Closing
connection to service. Timeout waiting for server response. Closing
connection to service. Cannot take token; reason=1
Status Reason: hit job shallow retry count (1)
Destination: ceprod06.grid.hep.ph.ic.ac.uk:8443/cream-sge-grid.q
Submitted: Tue Oct 2 12:54:10 2012 BST
=========================================================================
glite-wms-job-logging-info -v 2 https://lcglb01.gridpp.rl.ac.uk:9000/2CyS0-9e_toUlWW0rh4Y4A
As far as I can tell there is nothing wrong with the machines: they
are not under load and there is no other hint of trouble.
Does anyone have an inkling of what is going on here?
Cheers,
Daniela
--
Sent from the pit of despair
-----------------------------------------------------------
HEP Group/Physics Dep
Imperial College
Tel: +44-(0)20-75947810
http://www.hep.ph.ic.ac.uk/~dbauer/