Hello Jean-Bernard,
> DTEAM_GROUP_ENABLE="
> dteam
> /dteam/ROLE=lcgadmin
> /dteam/ROLE=production
> "
> OPS_GROUP_ENABLE="
> ops
> /ops/ROLE=lcgadmin
> "
> ESR_GROUP_ENABLE="
> esr
> /esr/ROLE=lcgadmin
> /esr/ROLE=production
> "
> EGEODE_GROUP_ENABLE="
> egeode
> /egeode/ROLE=lcgadmin
> /egeode/ROLE=production"
> XRAY_GROUP_ENABLE="
> xray.vo.eu-egee.org
> "
> FORMATION_GROUP_ENABLE="
> vo.formation.idgrilles.fr
> "
>
>
> and the groups.conf file exactly like the YAIM example with addition of
> VO esr xray egeode and formation.
Indeed, I was able to qsub a job directly as "opssgm":
[opssgm@ce1 ~]$ echo date | qsub -q ops
757.ce1.egee.fr.cgg.com
But then the shell reported: You have new mail in /var/mail/opssgm.
Indeed, the last message has this:
-----------------------------------------------------------------------------
PBS Job Id: 757.ce1.egee.fr.cgg.com
Job Name: STDIN
An error has occurred processing your job, see below.
Post job file processing error; job 757.ce1.egee.fr.cgg.com on host
r003n119.private.egee.fr.cgg.com/0
Unable to copy file /var/spool/pbs/spool/757.ce1.ege.OU to
[log in to unmask]:/home/opssgm/STDIN.o757
>>> error from copy
Host key verification failed.
lost connection
>>> end error output
Output retained on that host in: /var/spool/pbs/undelivered/757.ce1.ege.OU
-----------------------------------------------------------------------------
Maybe WN r003n119 still has the old host key of the CE?
Check /etc/ssh/ssh_known_hosts on that node, remove any obsolete keys, and
then run the command listed in /etc/cron.d/edg-pbs-knownhosts manually.
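As a rough sketch of that cleanup: the function name prune_known_host is made up for illustration, and it assumes the entries in the known_hosts file are not hashed. Something like this would drop every line that names the CE and keep a backup copy:

```shell
# Hypothetical helper: remove every known_hosts entry that names HOST.
# Usage: prune_known_host HOST FILE
prune_known_host() {
    host="$1"
    file="$2"
    # Keep a backup next to the original before touching it.
    cp -p "$file" "$file.bak" || return 1
    # The first field of a known_hosts line is a comma-separated list of
    # host names and addresses; skip the line if HOST is one of them.
    awk -v h="$host" '{
        n = split($1, names, ",")
        for (i = 1; i <= n; i++) if (names[i] == h) next
        print
    }' "$file.bak" > "$file"
}
```

On the WN, as root, that would be used roughly like this:

    grep ce1.egee.fr.cgg.com /etc/ssh/ssh_known_hosts
    prune_known_host ce1.egee.fr.cgg.com /etc/ssh/ssh_known_hosts
    cat /etc/cron.d/edg-pbs-knownhosts    # shows the command to re-run by hand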
> but with the SAM submission tool on the page :
> https://cic.gridops.org/index.php?section=roc&page=samadmin
> I do not succeed to submit SAM/OPS test job to the CE
> ce1.egee.fr.cgg.com on the site, it says "no compatible resources"
If a job fails and the WMS finds the job's "token" file still present
(in the job's sandbox area), it means the job already exited before the
WMS job wrapper was started. In that case the WMS can try a shallow
resubmission (as allowed by the JDL), but that fails because of this bug:
https://savannah.cern.ch/bugs/?28235
It is fixed by patch #1841, which has been certified but cannot be
deployed before patch #2562 is certified as well (expected by mid-January).
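For reference, shallow resubmission is allowed from the JDL through the ShallowRetryCount attribute, while RetryCount governs deep resubmission. Here is a minimal sketch of a test job that permits it. The helper name, the file name and the values are only examples:

```shell
# Hypothetical helper: write a small test JDL to the given file.
write_shallow_jdl() {
    cat > "$1" <<'EOF'
Executable = "/bin/date";
StdOutput = "std.out";
StdError = "std.err";
OutputSandbox = {"std.out", "std.err"};
RetryCount = 0;
ShallowRetryCount = 3;
EOF
}

write_shallow_jdl shallow-test.jdl
# Then submit it with the usual WMS client, e.g.:
#   glite-wms-job-submit -a shallow-test.jdl
```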