Hi Antonio,
> >>> Jul 22 15:17:51 lcg02 dgas-add-record[26141]:
> >>> /opt/lcg/sbin/dgas-add-record: cannot open
> >>> '/opt/edg/var/gatekeeper/jobs/1279804636:lcgpbs:internal_284714274:25092.1279804632'
> >>> for reading: No such file or directory
> >> Do you also see a successful processing of that internal job ID?
In /opt/edg/var/gatekeeper/grid-jobmap_20100722 there is this evidence:
----------------------------------------------------------------------
"localUser=42078"
"userDN=....."
"userFQAN=/cms/Role=NULL/Capability=NULL"
"jobID=https://lb001.cnaf.infn.it:9000/e_iQiUr1aLEJcKFxY-ZX6Q"
"ceID=lcg02.ciemat.es:2119/jobmanager-lcgpbs-medium"
"lrmsID=1936759.gaebatch.ciemat.es"
"timestamp=2010-07-22 13:17:50"
----------------------------------------------------------------------
The job was submitted successfully at 13:17:50 UTC, i.e. 1 second before
the error (logged at 15:17:51, presumably local time, UTC+2), which
explains why dgas-add-record could no longer open the file.
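
As a quick check of the timing, here is a minimal sketch. It assumes the
syslog timestamp is local time (CEST, UTC+2) and the grid-jobmap
timestamp is UTC; the helper name seconds_between is made up for
illustration and is not part of any grid tool:

```python
from datetime import datetime, timedelta, timezone

# Assumption: syslog on lcg02 writes local time, which is CEST (UTC+2) in July.
CEST = timezone(timedelta(hours=2))


def seconds_between(jobmap_utc: str, syslog_local: str, year: int = 2010) -> float:
    """Seconds from the grid-jobmap timestamp to the syslog error.

    The syslog line carries no year, so it is passed in separately.
    """
    submitted = datetime.strptime(jobmap_utc, "%Y-%m-%d %H:%M:%S").replace(
        tzinfo=timezone.utc
    )
    error = datetime.strptime(
        f"{year} {syslog_local}", "%Y %b %d %H:%M:%S"
    ).replace(tzinfo=CEST)
    return (error - submitted).total_seconds()


# Timestamps taken from the grid-jobmap entry and the syslog line above.
print(seconds_between("2010-07-22 13:17:50", "Jul 22 15:17:51"))
```

A positive result means the error was logged after the submission was
recorded.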
We have at least these questions left:
- Why did lcgpbs not log the successful submission?
- Why did it run the submission twice?
Since it seems to happen only for some jobs, we suspect a race condition.
But no other site has complained about such problems yet...