Hello all,
I ran into some problems with SGE under gLite which I hope you could help
me with. The submitted jobs finish on WNs, but they remain in Running
state on the WMS.
I set up and configured a separate CE and SGE Qmaster, and some test WNs
as described in the gLite generic install guide:
https://twiki.cern.ch/twiki/bin/view/LCG/GenericInstallGuide310#The_SGE_batch_system
The ldapsearch and lcg-infosites outputs were OK, so I started submitting
test jobs.
According to glite-wms-job-status, every job reached the Running state.
I checked these jobs on the WNs: they were definitely running, and the
globus tmp files were all present in the mapped user's folder.
When a job finished, the qmaster accounted for it, and the stdout and
stderr files for the job were created on the CE in
/home/<mapped user>/.globus.
The stdout file:
lcg-jobwrapper-hook.sh not readable
Take token:
UI=000000:NS=0000000004:WM=000004:BH=0000000000:JSS=000003:LM=000000:LRMS=000004:APP=000000:LBS=000000
job exit status = 0
jw exit status = 0
and the stderr file:
/home/hungrid008/globus-tmp.grid237.21101.0/globus-tmp.grid237.21101.2:
line 61: : No such file or directory
/opt/glite/bin/glite-lb-logevent: edg_wll_LogEvent*(): Invalid argument
(edg_wll_LogEvent():
Invalid argument;; Logging library ERROR:
Invalid argument;; edg_wll_DoLogEvent(): Error code mapped to EINVAL
Message incomplete;; edg_wll_log_read(): answer read from locallogger)
There is another stdout file which is continuously being appended with
lines like:
2009-04-14 16:48:02 OK:
2009-04-14 16:49:02 OK:
2009-04-14 16:50:02 OK:
2009-04-14 16:51:02 OK:
and so on...
Do you have any idea why these jobs never reach a finished state on the
WMS? Everything seems to be working except the notification of the WMS.
Cheers,
Somhegyi Bence