Hi Antonio,
>>> Jul 22 15:17:51 lcg02 dgas-add-record[26141]:
>>> /opt/lcg/sbin/dgas-add-record: cannot open
>>> '/opt/edg/var/gatekeeper/jobs/1279804636:lcgpbs:internal_284714274:25092.1279804632'
>>> for reading: No such file or directory
>> Do you also see a successful processing of that internal job ID?
> No we don't see further processing in the logs (see below) but it might
> be that the job is still sent to the batch system (that's what we've
> seen although I'm not sure if this has happened with this job in
> particular).
The exit status of dgas-add-record is ignored for historical reasons,
so the job could indeed have been submitted to the batch system.
Can you check whether a job was submitted around that time?
> [root@lcg02 ~]# grep
> '1279804636:lcgpbs:internal_284714274:25092.1279804632' /var/log/messages
> Jul 22 15:17:16 lcg02 gridinfo[25092]: JMA 2010/07/22 15:17:16
> GATEKEEPER_JM_ID 2010-07-22.15:17:12.0000025030.0000000000 has
> GRAM_SCRIPT_JOB_ID 1279804636:lcgpbs:internal_284714274:25092.1279804632
> manager type lcgpbs
> Jul 22 15:17:51 lcg02 dgas-add-record[26141]:
> /opt/lcg/sbin/dgas-add-record: cannot open
> '/opt/edg/var/gatekeeper/jobs/1279804636:lcgpbs:internal_284714274:25092.1279804632'
> for reading: No such file or directory
If the job was submitted to the batch system, there should have been
another line like this:
Jul 22 15:17:52 lcg02 gridinfo: [593-10976] Submitted job
1279804636:lcgpbs:internal_284714274:25092.1279804632
to batch system lcgpbs with ID .....
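
To check this for every affected job at once rather than one grep per ID,
something along these lines might help. It is only a rough sketch: the
job ID pattern and the "Submitted job ... to batch system" wording are
taken from the excerpts in this thread and assumed to hold in general,
and classify() is just a name made up for the illustration.

```python
import re

# Job ID shape as seen in the log excerpts of this thread (assumed to be
# general): <epoch>:<manager>:internal_<n>:<pid>.<epoch>
JOB_ID = r"\d+:\w+:internal_\d+:\d+\.\d+"

# dgas-add-record complaining that the job file is missing.
FAILED = re.compile(r"dgas-add-record.*cannot open '.*/(" + JOB_ID + r")'")

# The line expected when the job really went to the batch system
# (wording assumed from the example quoted above).
SUBMITTED = re.compile(r"Submitted job\s+(" + JOB_ID + r")\s+to batch system")


def classify(lines):
    """Map each job ID with a dgas-add-record failure to True if a
    'Submitted job' line exists for the same ID, else False."""
    failed, submitted = set(), set()
    for line in lines:
        m = FAILED.search(line)
        if m:
            failed.add(m.group(1))
        m = SUBMITTED.search(line)
        if m:
            submitted.add(m.group(1))
    return {job: job in submitted for job in sorted(failed)}
```

It can be fed an open file, e.g. classify(open("/var/log/messages")),
since each syslog entry is a single line there (the wrapping above is
only from the mail client).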
I have been trying to understand how the file may get deleted prematurely.
Your CE essentially looks OK...
>> Does this happen for multiple users?
> Yes.
Is there any pattern, e.g. in the time stamps?
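
If it is easier than eyeballing the log, a quick count of the failures
per hour and per minute-of-hour could show whether they cluster around
particular times. Again only a sketch: failure_histograms() is a
hypothetical helper, and it assumes the standard syslog time stamp
format seen in the excerpts above.

```python
import re
from collections import Counter

# Standard syslog prefix, e.g. "Jul 22 15:17:51 lcg02 ..."; captures the
# hour and the minute.
STAMP = re.compile(r"^\w{3}\s+\d+\s+(\d{2}):(\d{2}):\d{2}\s")


def failure_histograms(lines):
    """Count dgas-add-record 'cannot open' failures per hour of day and
    per minute of the hour."""
    by_hour, by_minute = Counter(), Counter()
    for line in lines:
        if "dgas-add-record" not in line or "cannot open" not in line:
            continue
        m = STAMP.match(line)
        if m:
            by_hour[m.group(1)] += 1
            by_minute[m.group(2)] += 1
    return by_hour, by_minute
```

A strong peak at certain minutes would hint at something periodic being
involved, though that is only one of several possible explanations.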