Hi all,
We are having problems running SAM tests. About 50% of the SAM tests
fail, even though they all run on the same Worker Node (WN).
We have noticed that the jobs are failing (with error code 1)
on a 'globus-url-copy' command on the Worker Node, trying to copy the
file 'cache_export_dir.tar' from our CE.
E.g. the command that is run on the WN is:
# globus-url-copy \
    gsiftp://grid002.jet.efda.org/home/opssgm/.lcgjm/globus-cache-export.r12606/cache_export_dir.tar \
    file:///home/opssgm/globus-tmp.grid054.19373.0/globus-tmp.grid054.19373.1/cache_export_dir.tar
We get the following error (in the batch.err file that is sent to the CE from
the WN):
*********
submit-helper script running on host grid054 gave error: cache_export_dir
(/home/opssgm/.lcgjm/globus-cache-export.r12606) on gatekeeper did not contain
a cache_export_dir.tar archive
*********
As mentioned above, 'globus-url-copy' works OK for the roughly 50% of
SAM tests that run successfully (on the same worker node).
We have run tens of 'globus-url-copy' tests by hand and they all
work OK.
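For anyone who wants to repeat the manual test, here is a minimal sketch of a loop that reruns a command and counts the non-zero exits. The function name repeat_cmd is hypothetical (it is not part of Globus or SAM), and the sketch assumes a POSIX shell on the WN; it only exercises the copy itself, not the job-manager environment the SAM jobs run in.

```shell
# Run a command COUNT times and print how many attempts exited non-zero.
# Usage: repeat_cmd COUNT COMMAND [ARGS...]
# (repeat_cmd is a hypothetical helper, not a Globus or SAM tool.)
repeat_cmd() {
    count=$1
    shift
    failures=0
    i=1
    while [ "$i" -le "$count" ]; do
        rc=0
        # Discard the command's stdout so that only the final count is
        # printed on stdout; stderr is left alone so errors stay visible.
        "$@" >/dev/null || rc=$?
        if [ "$rc" -ne 0 ]; then
            echo "attempt $i: exit code $rc" >&2
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}
```

It could be called as "repeat_cmd 50 globus-url-copy SRC DST", with SRC and DST replaced by the gsiftp:// and file:// URLs from the command above. The number printed at the end is the count of failed attempts.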
Our CE and WN are both time synchronized.
Does anybody know what could be going wrong here?
Many thanks,
krishan