Hi all,
On the gLite CE (gCE), since production updates 21 and 22, I have a problem:
for all VOs, jobs mapped to the sgm accounts (like dteamsgm001) are executed
fine, but jobs of users who are mapped to the default pool accounts (like
dteam001) are never executed. I have not checked the other non-default
accounts such as dteamprd001.
I have configured users.conf in the new format, as described at:
https://twiki.cern.ch/twiki/bin/view/LCG/YaimGuide301#users_conf
Everything seems to be configured correctly (at the PPS site I have a similar
configuration and everything works there). I have reconfigured several times with:
/opt/glite/yaim/bin/yaim -c -s site-info.def -n gliteCE -n BDII_site -n TORQUE_server
/etc/init.d/gLite restart
without any errors.
- From the gCE, qsub for default accounts like dteam001 works ok.
- Also:
globus-job-run cs-grid1.bgu.ac.il /bin/hostname
from the UI works ok.
- For glite-job-submit, the users are authenticated and their FQANs are mapped
correctly (and all LCMAPS-related configuration is correct). But it looks like
condor-c never submits the job to PBS. The condor-c job_queue.log entries
for dteam001 and dteamsgm001 look the same, except at the end:
job_queue.log for dteam001
-----------------------------
...
103 3.0 StageInFinish 1177744953
103 3.0 ReleaseReason "Data files spooled"
103 3.0 JobStatus 1
103 3.0 LastHoldReason "Spooling input data files"
103 3.0 EnteredCurrentStatus 1177744953
103 3.0 LastSuspensionTime 0
103 3.0 PeriodicHold FALSE
103 3.0 PeriodicRelease FALSE
103 3.0 OnExitHold FALSE
103 3.0 OnExitRemove TRUE
-----------------------------
job_queue.log for dteamsgm001
-----------------------------
...
103 18.0 StageInFinish 1177789007
103 18.0 ReleaseReason "Data files spooled"
103 18.0 JobStatus 1
103 18.0 LastHoldReason "Spooling input data files"
104 18.0 HoldReason
104 18.0 JobStatusOnRelease
103 18.0 EnteredCurrentStatus 1177789007
103 18.0 LastSuspensionTime 0
106
105
103 18.0 Managed TRUE
106
105
103 18.0 RemoteJobId "pbs/20070428/32.cs-grid1.bgu.ac.il"
106
105
103 18.0 JobStatus 2
103 18.0 EnteredCurrentStatus 1177789144
106
105
103 18.0 ExitCode 0
103 18.0 JobStatus 4
103 18.0 EnteredCurrentStatus 1177789504
106
105
103 18.0 Managed FALSE
106
105
103 18.0 CompletionDate 1177789510
106
103 18.0 JobFinishedHookDone 1177789510
105
103 18.0 Managed TRUE
106
105
103 18.0 RemoteJobId "pbs/20070428/32.cs-grid1.bgu.ac.il"
106
105
103 18.0 Managed FALSE
106
105
103 18.0 Managed TRUE
106
105
103 18.0 RemoteJobId "pbs/20070428/32.cs-grid1.bgu.ac.il"
106
105
103 18.0 Managed FALSE
106
105
103 18.0 Managed TRUE
106
105
103 18.0 RemoteJobId "pbs/20070428/32.cs-grid1.bgu.ac.il"
106
105
103 18.0 Managed FALSE
106
103 18.0 StageOutStart 1177789545
103 18.0 StageOutFinish 1177789545
105
103 18.0 Managed TRUE
106
105
103 18.0 RemoteJobId "pbs/20070428/32.cs-grid1.bgu.ac.il"
106
105
103 18.0 Managed FALSE
106
105
103 18.0 Managed TRUE
106
105
103 18.0 RemoteJobId "pbs/20070428/32.cs-grid1.bgu.ac.il"
106
105
103 18.0 Managed FALSE
106
105
103 18.0 LeaveJobInQueue FALSE
106
105
103 18.0 Managed TRUE
106
105
103 18.0 RemoteJobId "pbs/20070428/32.cs-grid1.bgu.ac.il"
106
105
103 18.0 Managed FALSE
106
105
102 18.0
102 018.-1
106
-----------------------------
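In case it helps anyone compare the two excerpts, here is a rough Python sketch that lists the jobs in a job_queue.log that never received a RemoteJobId (as happens with the dteam001 job above). It assumes, going only by the excerpts, that records starting with 103 set an attribute ("103 <job> <attribute> <value>") and records starting with 104 delete one; all other records are ignored. The function names are made up for illustration, and the path to job_queue.log has to be given on the command line.

```python
import sys


def parse_job_queue(lines):
    """Collect the last value seen for each attribute, per job key.

    Assumption: "103 <job> <attr> <value>" sets an attribute and
    "104 <job> <attr>" deletes it. Every other record (102, 105, 106, ...)
    is skipped, so jobs already removed from the queue are still reported.
    """
    jobs = {}
    for line in lines:
        parts = line.strip().split(None, 3)
        if len(parts) < 3:
            continue  # e.g. "105", "106", "102 18.0"
        op, key, attr = parts[0], parts[1], parts[2]
        if op == "103":
            value = parts[3] if len(parts) == 4 else ""
            jobs.setdefault(key, {})[attr] = value
        elif op == "104":
            jobs.setdefault(key, {}).pop(attr, None)
    return jobs


def jobs_never_submitted(jobs):
    """Return the job keys that never got a RemoteJobId."""
    return sorted(key for key, ad in jobs.items() if "RemoteJobId" not in ad)


if __name__ == "__main__":
    # Usage: python check_queue.py /path/to/job_queue.log
    with open(sys.argv[1]) as log:
        parsed = parse_job_queue(log)
    for key in jobs_never_submitted(parsed):
        print(key, "JobStatus =", parsed[key].get("JobStatus", "?"))
```

Run against the two excerpts above, this should flag job 3.0 (dteam001) and not job 18.0 (dteamsgm001), which matches what I see by eye.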
Does anyone have an idea where the problem may be?
P.S. Maybe this is somehow related to the YAIM problem described at
https://gus.fzk.de/ws/ticket_info.php?ticket=21231
Thanks
Alex