Peter Gronbech wrote:
> We use a combination of yum (to load updated rpms), yumit to advise us
> on the patch status, and copy of an ssh key in [snip]
> I can then spot that systems need patching with yumit, check what is
> required (ie which patches are missing, again with yumit) and then type
> on t2nodes yum -y update
> I'm sure there are many other solutions to this problem but this works
> for me.

Wow, that just made me realise "yet another complication of grid
computing".

Pete's system sounds excellent, and very simple to manage. I would
imagine there could be big implications for running jobs, though, if the
software they are using is changing under their feet. Are there risks
that this might throw off software execution? I certainly imagine it
might. We (LHCb) do a bunch of software version checks at the start of
execution (and not just for LHCb/Physics software). Weird failures are
one thing, but outright failures are probably better than silently
producing bad results, with no errors or output inconsistencies, because
the software changed between two steps of a job.

What happens if a library is updated? I don't know enough about how link
resolution and inodes work to know whether dynamic libraries are all
"referenced" at the start of execution, so that the OS holds inode
references to the old library even if the physical file changes later
(but still during the execution of some process which is referencing
it).

I suppose ideally it would be good to "inject" update jobs into the
queue, but then three problems exist:

1. This means syncing on both (or all 4) processors, which will almost
   certainly mean significant wasted CPU (50% of average job length on
   duals, and more on quads, I guess).

2. It is not very nice to have different nodes running different
   software, and it is perhaps even impossible if the update relates to
   the grid/cluster infrastructure. This would imply that certain types
   of update probably require a full site "sync".

3. How do we make sure those update/admin jobs get run exactly once on
   every node? Oh, I suppose cluster software must have a way of doing
   this, as it would be a common problem.

What are people's thoughts on that?

Cheers, Ian

--
Ian Stokes-Rees              [log in to unmask]
Particle Physics, Oxford     http://www-pnp.physics.ox.ac.uk/~stokes
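P.S. On the library question: on Linux, a file that a running process
holds open (or has mmapped, as the dynamic linker does with shared
libraries) keeps its inode alive even after the file is unlinked or
replaced, and rpm/yum install updated files as new inodes under the old
name. So a running process keeps seeing the old library; only newly
started processes pick up the new one. A minimal shell sketch of the
inode behaviour (the filename libfake.so is just an illustration, and a
plain open file descriptor stands in for an mmapped library):

```shell
# Create a fake "library" and hold it open on fd 3, as a running
# process would hold a mapped shared library.
echo "old library contents" > libfake.so
exec 3< libfake.so

# "Update" it the way rpm does: the old inode is unlinked and a new
# file (a new inode) appears under the same name.
rm libfake.so
echo "new library contents" > libfake.so

# The holder of fd 3 still reads the old inode's contents...
cat <&3
# ...while any freshly started process sees the new file.
cat libfake.so

exec 3<&-
```

This suggests a job that loaded all its libraries at startup is safe
from a mid-run update, though anything that dlopen()s or exec()s later
could still see a mixture of old and new.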
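P.P.S. On problem 3, one low-tech way to make a resubmitted admin job
harmless is a per-node stamp file: the job only does the real work if
its marker is absent. A sketch under assumed names (run_once, STAMP_DIR
and the update id are all made up here; real cluster tooling presumably
does this more robustly):

```shell
# run_once: execute a command only if its stamp file is absent, so
# submitting the same update job to a node twice does nothing the
# second time.
run_once() {
    update_id="$1"; shift
    stamp_dir="${STAMP_DIR:-/var/lib/site-updates}"   # assumed location
    mkdir -p "$stamp_dir"
    if [ ! -e "$stamp_dir/$update_id" ]; then
        # Only stamp the node if the update actually succeeded,
        # so a failed run will be retried next time.
        "$@" && touch "$stamp_dir/$update_id"
    fi
}

# e.g. submitted as a job to every node:
#   run_once 2004-06-security-errata yum -y update
```

The flip side is the "exactly once" guarantee then rests on the batch
system delivering the job to every node at least once, which is the
easier half of the problem.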