Ake wrote:
>>My 30 second analysis of that suggests that 2500 node clusters are
>>probably the optimum size, probably dual or quad CPU (so 5000 or 10,000
>>CPUs), meaning a team of 5 sys admins could, in theory, provide 24x7
>>coverage of the cluster, with some overlap for multiple day-time
>
> Then you need sys admins that take care of infrastructure, user support
> and lots of other stuff.
>
> 24x7 at 5 persons = max 1 person at work at any one point in time.

I must have miscalculated. I thought it worked out that 4 people were
required to cover 365 days x 24 hours (taking into account holidays,
sick days, etc.), meaning that with 5, two people would usually be
available during M-F 9-5. Probably my mistake.

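
For what it's worth, here is a rough sketch of that arithmetic. The 40-hour week and 44 working weeks per year (after holidays, sick days, etc.) are my own assumptions, not measured figures, and the function name is just illustrative:

```python
import math

HOURS_PER_YEAR = 365 * 24  # 8760 hours to cover


def staff_needed(on_duty=1, hours_per_week=40, weeks_per_year=44):
    """Minimum headcount to keep `on_duty` people present around the clock.

    hours_per_week and weeks_per_year are assumed values, not site data.
    """
    hours_per_person = hours_per_week * weeks_per_year  # 1760 with the defaults
    return math.ceil(on_duty * HOURS_PER_YEAR / hours_per_person)


print(staff_needed(on_duty=1))  # headcount for one person always present
print(staff_needed(on_duty=2))  # headcount for two people always present
```

Under those assumptions one person works about 1760 hours a year, so 8760 hours of single coverage needs roughly 5 people, which agrees with Ake's figure rather than mine.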
> There is no way 1 person can handle a 2500 node cluster over an 8 hour
> period and get real work done at the same time.

So can 2 people? Or 3 people? How many people *are* required to
maintain a computing centre? In the grid world, aren't the large
centres going to need to have someone on-site 24 hours a day?

I imagine you are right -- there will be a constant migration of old
hardware out and new hardware in, plus software updates, normal
maintenance, diagnosis of problems, and monitoring. One person
needs to be the "front person" who deals with incoming support
requests (phone, email), and there will probably be a constant need for a
team of two people doing "mini-projects": upgrades,
rack-reshuffling, re-wiring, etc. So a "base" of 3 "peak time"
(day-time) staff for at least 40 weeks of the year is required, plus
staff to cover "off peak" shifts to give 365x24 hour coverage. Maybe
the 24 hour coverage can be on-call "close to site" but not "on site"?
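
As a back-of-envelope sketch of what that implies for total headcount: the numbers below assume peak time is M-F 9-5 every week of the year, one person covering all off-peak hours, and each staff member working 40 hours a week for 44 weeks. All of these are guesses for illustration only.

```python
import math

HOURS_PER_YEAR = 365 * 24          # total hours to cover
PEAK_HOURS = 52 * 5 * 8            # assumed: M-F 9-5, every week (2080 hours)
HOURS_PER_PERSON = 40 * 44         # assumed: 40 h/week, 44 working weeks


def total_staff(peak_staff=3, off_peak_staff=1):
    """Headcount for `peak_staff` during peak hours and `off_peak_staff` otherwise."""
    off_peak_hours = HOURS_PER_YEAR - PEAK_HOURS
    person_hours = peak_staff * PEAK_HOURS + off_peak_staff * off_peak_hours
    return math.ceil(person_hours / HOURS_PER_PERSON)


print(total_staff())                   # 3 on site at peak, 1 on site off peak
print(total_staff(off_peak_staff=0))   # peak-time staff only, no off-peak cover
```

With those guesses, 3 day-time staff plus one person off peak comes to about 8 people. If off-peak cover is on-call rather than on-site, the real cost presumably falls somewhere between the two figures printed.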
Anyway, these are all interesting questions.
Cheers,
Ian
--
Ian Stokes-Rees
Particle Physics, Oxford http://www-pnp.physics.ox.ac.uk/~stokes