Hello,
I have noticed that jobs submitted through lxn1188 to GR-01-AUTH do not
appear to finish correctly. I have submitted a few tests and saw them
arrive at our cluster, run, and finish, but the RB does not
track this. The job status stays like this:
*************************************************************
BOOKKEEPING INFORMATION:
Status info for the Job :
https://lxn1188.cern.ch:9000/sRMU0mKA9BxP_dAe6U_LBQ
Current Status: Running
Status Reason: unavailable
Destination: node001.grid.auth.gr:2119/jobmanager-torque-dteam
reached on: Fri Mar 18 08:30:36 2005
*************************************************************
The job IDs are:
https://lxn1188.cern.ch:9000/sRMU0mKA9BxP_dAe6U_LBQ
https://lxn1188.cern.ch:9000/K_AQ77CO1-bUPvFCx8eXaw
https://lxn1188.cern.ch:9000/uS5JXD0E6bZMCCoVAXRVUA
https://lxn1188.cern.ch:9000/VpkLbI5cXYNfTNLqbt5MaQ
All of those jobs arrived at the cluster, ran, and exited successfully.
I have also submitted jobs through other RBs, such as lxn1177
https://lxn1177.cern.ch:9000/OqYDdG9FxMgELFoR7BOy-A
and they all finish without any problems, so I am puzzled about what is
wrong. My only hint is that last night, when I submitted a few more
jobs through lxn1188, one failed with a "No route to host" message, but
I was unable to reproduce this. (Unfortunately, I lost the job ID.)
Best regards,
--
============================================================================
Dimitris Zilaskos
Department of Physics @ Aristotle University of Thessaloniki, Greece
PGP key : http://tassadar.physics.auth.gr/~dzila/pgp_public_key.asc
http://egnatia.ee.auth.gr/~dzila/pgp_public_key.asc
MD5sum : de2bd8f73d545f0e4caf3096894ad83f pgp_public_key.asc
============================================================================