Hi Martin,
I have cut and pasted your suggestion into the wiki. I hope you don't mind.
cheers
alessandra
On Tue, 28 Jun 2005, Bly, MJ (Martin) wrote:
> The drain time for a farm is too long to be considered for anything but
> the most dire interventions.
>
> You don't need to drain the queues, just temporarily stop the jobs. For
> PBS/torque:
>
> qsig -s STOP `qselect -q whatever -s R`
>
> reboot, and restart the jobs
>
> qsig -s CONT `qselect -q whatever -s R`
>
>
> Martin.
>
>
>> -----Original Message-----
>> From: LHC Computer Grid - Rollout
>> [mailto:[log in to unmask]] On Behalf Of
>> Maarten Litmaath
>> Sent: 27 June 2005 16:17
>> To: [log in to unmask]
>> Subject: Re: [LCG-ROLLOUT] rebooting a CE
>>
>>
>> Jeff Templon wrote:
>>
>>> Hi *,
>>>
>>> We need to reboot our CE soon (kernel upgrade). Used to be if you
>>> rebooted a CE machine, condor-G on the WMS would decide that your jobs
>>> must all be dead, and restart them elsewhere.
>>
>> That is a long time ago. These days jobs in steady state are not
>> affected by a reboot of the CE or the RB. Jobs in transit (e.g.
>> just finishing) will fail.
>>
>>> What is the situation now? Do we need to drain queues before rebooting?
>>
>> Draining the queues is always a good idea.
>>
>
--
********************************************
* Dr Alessandra Forti *
* Technical Coordinator - NorthGrid Tier2 *
* http://www.hep.man.ac.uk/u/aforti *
********************************************