Hi Ali,
> At our site PK-CIIT, we are facing the warning messages for the last two days
> on nagios page. The warnings are
>
> **************************************
> cern-56-24-249.comsats.edu.pk: UNKNOWN: METRIC FAILED [org.sam.WN-RepCr]:
> UNKNOWN: failed on LFC prod-lfc-shared-central.cern.ch [ErrDB:[('default',
> 'client', 'UNKNOWN')]]
> UNKNOWN: METRIC FAILED [org.sam.WN-RepCr]: UNKNOWN: failed on LFC
> prod-lfc-shared-central.cern.ch [ErrDB:[('default', 'client', 'UNKNOWN')]]
> Testing from: cern-56-24-249.comsats.edu.pk
> DN: /C=TW/O=AP/OU=GRID/CN=tz ke wu
> 156951/CN=proxy/CN=proxy/CN=proxy/CN=proxy/CN=proxy/CN=proxy/CN=limited proxy
> VOMS FQANs: /ops/Role=lcgadmin/Capability=NULL,
> /ops/ROC/Role=NULL/Capability=NULL,
> /ops/ROC/AsiaPacific/Role=NULL/Capability=NULL, /ops/Role=NULL/Capability=NULL
> Invoking metric: [2012-02-25T14:45:58Z] org.sam.WN-RepISenv
> Invoking metric: [2012-02-25T14:45:58Z] org.sam.WN-RepFree
> Invoking metric: [2012-02-25T14:46:01Z] org.sam.WN-RepCr
> METRIC FAILED [org.sam.WN-RepCr]: UNKNOWN: failed on LFC
> prod-lfc-shared-central.cern.ch [ErrDB:[('default', 'client', 'UNKNOWN')]]
>
> ********************************************
Click on the "org.sam.WN-RepCr-/ops/Role=lcgadmin" link in the Service
column (i.e. the other failing test) to see more details:
-----------------------------------------------------------------------------
[...]
Check if we can write to LFC prod-lfc-shared-central.cern.ch
Couldn't write to given LFC prod-lfc-shared-central.cern.ch.
Reason:
send2nsd: NS002 - send error : Bad credentials
[...]
-----------------------------------------------------------------------------
That error suggests the job's proxy had already expired (or disappeared)
by the time the test tried to write to the LFC. That would point to a
problem with the batch system or the CE rather than with the LFC itself.
I then sent a simple test job to your CE and it failed:
-----------------------------------------------------------------------------
****** JobID=[https://cern-56-24-244.comsats.edu.pk:8443/CREAM289745994]
Status = [DONE-FAILED]
ExitCode = [0]
FailureReason = [reason=1; Cannot move OSB (${globus_transfer_cmd}
file:///home/ops016/home_cream_289745994/CREAM289745994/hello-3920.err
gsiftp://cern-56-24-244.comsats.edu.pk/opt/glite/var/cream_sandbox/ops/
_DC_ch_DC_cern_OU_Organic_Units_OU_Users_CN_litmaath_CN_410032_CN_Maarten
_Litmaath_ops_Role_NULL_Capability_NULL_ops016/28/CREAM289745994/OSB/
hello-3920.err): error: globus_ftp_client: the server responded with an error
500 Command failed. : globus_xio: An end of file occurred]
-----------------------------------------------------------------------------
The job could not move its output sandbox from the WN to the CE.
You need to fix at least that. These documentation links may help:
http://grid.pd.infn.it/cream/field.php?n=Main.CREAMTroubleshooting
https://wiki.egi.eu/wiki/Tools/Manuals/TS63 (the diagnosis applies)
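
If you want to retry the failing transfer by hand from the WN, it may help
to pull the source and destination URLs out of the FailureReason first.
Below is a minimal Python sketch, not an official tool: the function name
osb_transfer_endpoints is made up for illustration, and it assumes the line
breaks in the pasted output fall inside the URLs, so the lines are joined
without spaces before matching.

```python
import re

# Source (file://) and destination (gsiftp://) of the failed sandbox copy.
_OSB_RE = re.compile(r"Cannot move OSB \(.*?(file://\S+?)\s*(gsiftp://[^\s)]+)\)")


def osb_transfer_endpoints(failure_reason):
    """Return (source_url, destination_url), or None if no OSB error is found.

    Hypothetical helper: assumes wrapped lines can simply be glued together.
    """
    text = "".join(line.strip() for line in failure_reason.splitlines())
    match = _OSB_RE.search(text)
    return (match.group(1), match.group(2)) if match else None


# FailureReason copied from the test job output above.
reason = """FailureReason = [reason=1; Cannot move OSB (${globus_transfer_cmd}
file:///home/ops016/home_cream_289745994/CREAM289745994/hello-3920.err
gsiftp://cern-56-24-244.comsats.edu.pk/opt/glite/var/cream_sandbox/ops/
_DC_ch_DC_cern_OU_Organic_Units_OU_Users_CN_litmaath_CN_410032_CN_Maarten
_Litmaath_ops_Role_NULL_Capability_NULL_ops016/28/CREAM289745994/OSB/
hello-3920.err): error: globus_ftp_client: the server responded with an error
500 Command failed. : globus_xio: An end of file occurred]"""

endpoints = osb_transfer_endpoints(reason)
if endpoints:
    print("source:     ", endpoints[0])
    print("destination:", endpoints[1])
```

The two URLs can then be passed to a GridFTP client such as globus-url-copy,
run on the WN with a valid proxy, to see whether the CE's GridFTP server
rejects the transfer in the same way.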