Hi David,
> I have reinstalled the whole wms02 from scratch to
> try to fix the issue but I keep the same problem.
Can you enable the "ops" VO?
> So, if I do the request, the WMS02 is not able to find this job anymore.
OK, is there more output for yesterday's job now:
glite-wms-job-logging-info -v 2 \
https://wms02.begrid.be:9000/VPH54KBWMmx1qo1xYJgFTw
> If I correctly understood the problem:
> 1 - wms02 is sending correctly the job to our CE (ce03.hp.begrid.be) ->
> I suppose that certificates are checked at this step;
> 2 - ce03 found a valid WN and execute the job;
> 3 - the WM finish the execution but the job remain blocked in "running".
Why did you think the job was finished? Did it produce output somewhere?
Anyway, the batch system reported the job as running, which should mean
that it had not fully finished. We would have to run "ps afuxwww" or so
on the WN to see what such a job is doing.
> If we took wms.begrid.be (the "old" wms on the same hardware with same
> OS than wms02), everything is working with the same job, same ce and wns.
>
> For me the problem could then only be WMS related. But, is my intuition
> correct?
It looks like a WMS problem at first glance. What happens if you send
a job via wms02 to a different CE, e.g. at another site?