Hi...
> GridStatusName ABORTED
> GridStatusReason reason=127.
> GridStatusTimeStamp 2013-08-28 08:00:04
> Source tool
> GridEndStatusId ABORTED
> GridEndStatusReasonId reason=127.
> ...
>
> For more details:
> http://dashb-cms-job-dev.cern.ch/dashboard/request.py/detailView?jobId=645830651
>
> As far as I know it indicates a problem 'with staging of files from/to
> the CE node to/from the WN' (according to CREAM troubleshooting twiki).
> Is it also true for GE batch system? Because I've only found
> mentioning LSF or pbs/Torque in archives.
> Is there a detailed description for this error?
To be sure of the problem, try to find the CREAM JobID and search
through the CREAM logs. They might point you in the right direction.
You can also check for leftovers in the pool accounts of the WNs and/or
the CE. They might also indicate the cause of the problem.
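
A minimal sketch of the log search, assuming the CREAM logs live under
/var/log/cream on the CE (adjust the directory to your installation). The
function name cream_job_lines is only illustrative, not part of CREAM:

```shell
#!/bin/sh
# Print every log line that mentions a given CREAM job ID.
# Usage: cream_job_lines JOBID [LOGDIR]
# LOGDIR defaults to /var/log/cream, which is an assumption about the CE layout.
cream_job_lines() {
    jobid="$1"
    logdir="${2:-/var/log/cream}"
    # -r: recurse into the directory
    # -h: do not prefix matches with the file name
    # -F: treat the job ID as a fixed string, not a regular expression
    grep -rhF -- "$jobid" "$logdir"
}
```

For example, "cream_job_lines CREAM888839038" would list the lines for the
job whose wrapper script appears in the notes below.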
In our case, to pass the CMS HammerCloud monitoring jobs, we had to tune
the ssh settings to allow multiple simultaneous connections from the WNs
to the CreamCE. You might check whether that is the problem in your case.
Here is an extract of my notes:
---*---
###
### Problem
###
Some jobs sent in bunches show the following messages in the job's
stderr:
$ cat cream_888839038.e2726099
ssh_exchange_identification: Connection closed by remote host
ssh_exchange_identification: Connection closed by remote host
chmod: cannot access `./CREAM888839038_jobWrapper.sh': No such file or directory
Because the copies between WNs and CEs are done concurrently, we need to
tune ssh to allow a higher number of simultaneous connections.
###
### Tune ssh
###
LoginGraceTime 3m
MaxStartups 200
---*---
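
To see what a CE currently uses for these two keywords, something like the
following sketch could help. It assumes the settings are in the usual
/etc/ssh/sshd_config; the helper name sshd_setting is hypothetical. sshd
uses the first value it finds for a keyword, so the function stops at the
first uncommented match:

```shell
#!/bin/sh
# Print the value sshd would take for a keyword from a config file.
# Usage: sshd_setting FILE KEYWORD
sshd_setting() {
    # Keywords are case-insensitive. Commented lines do not match because
    # their first field is "#" or starts with "#".
    awk -v key="$2" '
        tolower($1) == tolower(key) { print $2; exit }
    ' "$1"
}
```

For example, "sshd_setting /etc/ssh/sshd_config MaxStartups" prints nothing
if the keyword is unset or commented out, meaning sshd falls back to its
built-in default. After changing the file, sshd has to be restarted or
reloaded for the new values to apply.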
>
> Other strange thing is, that a certain user's jobs are always failing
> at one of the CEs, but run fine on the others, though all of the CEs
> are identical (EMI 3, SL6).
>
Again, CREAM logs might indicate the reason why.
Also check whether there is a problem with the environment inherited by
tomcat and blahd. See the following FAQ:
https://wiki.italiangrid.it/twiki/bin/view/CREAM/KnownIssues#CREAM_jobs_are_cancelled_with_st
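
One way to look at the environment a running daemon actually inherited is
to read it from /proc on Linux. This is only a sketch: the helper name
proc_env is made up, and finding the PIDs of the tomcat and blahd
processes on your CE (for example with ps) is left out:

```shell
#!/bin/sh
# Print the environment of a running process, one VAR=value per line.
# Usage: proc_env PID
# Needs Linux /proc and permission to read the process (usually root).
proc_env() {
    # /proc/PID/environ separates entries with NUL bytes; turn them into newlines.
    tr '\0' '\n' < "/proc/$1/environ"
}
```

Comparing the output for the daemons on the failing CE with that of a
working CE may show a variable that differs.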
Cheers
Goncalo