What is the "nice way" to bring down an entire site?
Ideally, I would like to be able to:
On the SE:
1) Check that no one is using the SE for file transfers
2) Stop any further EDG interaction (i.e. take the SE off-line)
3) Do any node-specific updates (probably none)
4) Reboot the SE
On the CE:
1) Check that PBS is clear and, if it is not, either post a "not accepting
any more jobs" message or "nicely" (if there is such a thing) kill any
active jobs (e.g. when the node _must_ be brought down but there are
long-running jobs in the queue).
2) Stop any further EDG interaction (i.e. take the CE off-line)
3) Do any node-specific updates (probably none)
4) Reboot the CE and all WNs.
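The first CE step might be scripted roughly as follows. This is only a sketch under assumptions: it relies on the standard PBS client commands qdisable (stop a queue accepting new jobs) and qstat (list the jobs that remain) being on the PATH of the CE, and the queue names "short" and "long" and the helper names ce_drain_commands and run_all are hypothetical. It only prints the commands unless dry_run is switched off.

```python
import subprocess


def ce_drain_commands(queues):
    """Return the PBS commands (as argv lists) that stop new jobs
    from being accepted and then list whatever is still in the system."""
    # qdisable <queue>: the queue stops accepting new jobs
    commands = [["qdisable", queue] for queue in queues]
    # qstat: show any jobs that are still queued or running
    commands.append(["qstat"])
    return commands


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # "short" and "long" are hypothetical queue names
    run_all(ce_drain_commands(["short", "long"]))
```

Keeping the commands as plain lists means the plan can be reviewed before anything is actually run against the batch system.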
A subset of this is just bringing down individual PBS nodes (i.e. WNs). I
understand that there is a PBS command to do this (probably qdisable), but
when I run it from the CE against one of my WNs, I get:
[root@tbce01 bin]# ./qdisable @wont3.physics.ox.ac.uk
Connection refused
qdisable: could not connect to server wont3.physics.ox.ac.uk (111)
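A possible alternative, sketched under assumptions: as far as the PBS man pages go, qdisable acts on queues and its @host form names a PBS server, whereas a WN would normally run only pbs_mom, which may be why the connection above is refused. The command for marking a single node offline appears to be pbsnodes -o, run on the CE (no new jobs are scheduled on the node, running jobs are left alone), with pbsnodes -c to clear the mark afterwards. The helper name wn_offline_commands is hypothetical, and the node name must match whatever is listed in the server's nodes file, which may be the short host name rather than the full one.

```python
def wn_offline_commands(nodes, clear=False):
    """Return pbsnodes commands (as argv lists) that mark each node
    offline, or clear the offline mark when clear is True."""
    # -o marks a node offline; -c clears the offline mark again
    flag = "-c" if clear else "-o"
    return [["pbsnodes", flag, node] for node in nodes]


if __name__ == "__main__":
    # Node name taken from the session above; adjust it to match the
    # entry in the PBS server's nodes file.
    for argv in wn_offline_commands(["wont3.physics.ox.ac.uk"]):
        print(" ".join(argv))
```

The printed command would then be run as root on the CE, not on the WN itself.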
TIA,
Ian.
--
Ian Stokes-Rees
Particle Physics, Oxford http://www-pnp.physics.ox.ac.uk/~stokes/