The drain time for a farm is too long to be considered for anything but
the most dire interventions.

You don't need to drain the queues, just temporarily stop the jobs.  For
PBS/torque: 

	qsig -s STOP `qselect -q whatever -s R`

then reboot, and resume the jobs:

	qsig -s CONT `qselect -q whatever -s R`
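
The two steps can be wrapped in a small script. This is only a sketch, assuming qsig and qselect behave as in the commands above; QUEUE, JOBLIST and the function names are placeholders, not part of PBS/torque. It records the job IDs before stopping them, so the resume step acts on the same set of jobs rather than on whatever qselect reports after the reboot.

```shell
#!/bin/sh
# Sketch: pause and resume running jobs around a reboot.
# QUEUE and JOBLIST are hypothetical placeholders; adjust for your site.
QUEUE=${QUEUE:-whatever}
JOBLIST=${JOBLIST:-/var/tmp/stopped-jobs.$QUEUE}

stop_jobs() {
    # Save the list of running jobs first, then send STOP to each one.
    qselect -q "$QUEUE" -s R > "$JOBLIST" || return 1
    while read -r job; do
        qsig -s STOP "$job"
    done < "$JOBLIST"
}

cont_jobs() {
    # Send CONT to exactly the jobs recorded by stop_jobs.
    while read -r job; do
        qsig -s CONT "$job"
    done < "$JOBLIST"
}
```

Call stop_jobs before the reboot and cont_jobs once the batch server is back up. The list is kept under /var/tmp here on the assumption that it survives a reboot on your system.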


Martin. 


> -----Original Message-----
> From: LHC Computer Grid - Rollout 
> [mailto:[log in to unmask]] On Behalf Of 
> Maarten Litmaath
> Sent: 27 June 2005 16:17
> To: [log in to unmask]
> Subject: Re: [LCG-ROLLOUT] rebooting a CE
> 
> 
> Jeff Templon wrote:
> 
> > Hi *,
> > 
> > We need to reboot our CE soon (kernel upgrade).  Used to be if you
> > rebooted a CE machine, condor-G on the WMS would decide that your
> > jobs must all be dead, and restart them elsewhere.
> 
> That is a long time ago.  These days jobs in steady state are not
> affected by a reboot of the CE or the RB.  Jobs in transit (e.g.
> just finishing) will fail.
> 
> > What is the situation now?  Do we need to drain queues before
> > rebooting?
> 
> Draining the queues is always a good idea.
>