Hi,
Thanks for your input.
Gonçalo Borges wrote:
> This means that the edg-wl-jc daemon died... Try to restart it but I
> guess it will die again because there is some inconsistency with it...
> If this happens again, I guess you will have to clean the input.fl file
> in /var/edg/job-controller and restart the daemon again...
> But this is something a little bit hard and I'm not sure what will be
> the consequence to all running jobs.
OK, I'll keep that recipe in mind too.
> Which condor version are you using on the LCG-RB?
# rpm -qa | grep -i condor
ncm-condorconfig-1.0.2-1
vdt_globus_jobmanager_condor-VDT1.2.2rh9_LCG-3
condor-lcg-1.1.0-1
condor-lcgrb-1.0.0-3
condor-6.7.10-1
So I believe it is Condor 6.7.10.
> There was an email exchange in LCG-ROLLOUT which may be interesting to
> you with the following thread
> [LCG-ROLLOUT] LCG Ressource Broker and job expiration
> and on Date: Thu, 12 Apr 2007 10:58:41 +0200
I'll read that one.
Something I did that seems to have "solved" the problem (for now, let's
hope), which I got from elsewhere:
- Stop the proxy-renewal daemon
- cd /opt/edg/var/spool/edg-wl-renewd
  rm -f `ls | grep -E '*\.[0-9]+'`
- Start the daemon again.
I don't know why, but it seems to help.
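For anyone repeating this, the cleanup step above can be sketched as a small shell helper with a dry-run mode, so the file list can be checked before anything is deleted. This is only a sketch under assumptions: the function names and the DRY_RUN variable are made up for illustration, it uses a find glob instead of the ls/grep pipeline, it only touches regular files, and -maxdepth assumes GNU or BSD find.

```shell
#!/bin/sh
# Hypothetical helpers (not part of any EDG/LCG package).

# Print regular files directly inside directory $1 whose names contain
# a dot followed by at least one digit.
list_numbered_files() {
    find "$1" -maxdepth 1 -type f -name '*.[0-9]*' | sort
}

# Remove those files. DRY_RUN defaults to 1, which only prints what
# would be removed; set DRY_RUN=0 to actually delete.
clean_numbered_files() {
    list_numbered_files "$1" | while IFS= read -r f; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "would remove: $f"
        else
            rm -f -- "$f"
        fi
    done
}
```

With the proxy-renewal daemon stopped, one would first run `clean_numbered_files /opt/edg/var/spool/edg-wl-renewd` to review the list, then run it again with `DRY_RUN=0` and start the daemon. On my RB the init script is, I believe, /etc/init.d/edg-wl-proxyrenewal, but treat that name as an assumption and check your own installation.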
I also went through all the daemons to see if any were stopped (some were)
and restarted them. Unfortunately I didn't keep track of exactly
what I did...
Thanks for your help.
Yannick
--
Yannick Patois
IPHC - IN2P3 / CNRS - 23 rue du Loess 67037 Strasbourg
Tel: 03 88 10 61 83