On Mon, 10 May 2004, Steve Traylen wrote:
> True to a point but sites won't be that willing to have extremely long
> queues. I clearly have to drain nodes to install a kernel.
There needs to be some give-and-take: if VOs have jobs which really need
to take a week, then sites which want to support them will have to live
with that. In theory jobs can do checkpointing in case they get killed,
although I don't know if anyone has tried it. For HEP jobs there's mostly
quite a lot of freedom to adjust times by changing the number of events
per job, except for Alice, where I think it takes of the order of 12 hours
to simulate one event.
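
To illustrate the events-per-job adjustment, here is a rough sketch of the
arithmetic. The function name, the 80% safety margin and the example
numbers are my own assumptions for illustration, not anything taken from
real site or experiment configurations.

```python
def events_per_job(queue_limit_minutes, minutes_per_event, margin_percent=80):
    """Return how many whole events fit in one job for a given queue limit.

    All names and defaults here are hypothetical. margin_percent leaves
    headroom below the queue limit so that a slow node does not get the
    job killed just before it finishes.
    """
    if minutes_per_event <= 0:
        raise ValueError("minutes_per_event must be positive")
    # Integer arithmetic throughout, so the result is an exact whole number.
    usable_minutes = queue_limit_minutes * margin_percent // 100
    return usable_minutes // minutes_per_event


# A 48-hour queue with 12-hour events leaves room for 3 events per job.
print(events_per_job(48 * 60, 12 * 60))
# A 12-hour queue cannot hold even one such event once the margin is applied.
print(events_per_job(12 * 60, 12 * 60))
```

A result of zero is the awkward case: no choice of events per job makes
the job fit, so the queue limit itself has to change.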
> Generally just put in your JDL what you feel you need and don't look
> at the pbs level of things. If you find you are hardly getting any resources
> in a job list match then ask sites why they won't support you.
That's true on an individual job basis, but the queue parameters should be
optimised to match the characteristics of jobs in general, e.g. there
wouldn't be much point in having a 12-hour queue everywhere if most jobs
either take less than one hour or more than 12. I'm not aware that anyone
has really tried to consider this up to now; I suspect the defaults we
have now are just what someone picked in the early days of EDG as looking
reasonable.
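
As a sketch of how one might check this, the snippet below assigns each
job to the shortest queue whose limit covers its run time and counts the
jobs per queue. The function name, the queue limits and the run times are
all made up for illustration; real numbers would have to come from site
accounting records.

```python
from bisect import bisect_left


def queue_usage(job_hours, queue_limits_hours):
    """Count jobs per queue, using the shortest queue that fits each job.

    Hypothetical helper: jobs longer than every limit are counted under
    the key None, meaning no queue can run them.
    """
    limits = sorted(queue_limits_hours)
    usage = {limit: 0 for limit in limits}
    usage[None] = 0
    for hours in job_hours:
        # Index of the first limit that is >= the job's run time.
        i = bisect_left(limits, hours)
        key = limits[i] if i < len(limits) else None
        usage[key] += 1
    return usage


# Invented run times: everything is either under 1 hour or over 12.
jobs = [0.2, 0.5, 0.8, 20, 30, 100]
print(queue_usage(jobs, [1, 12, 72]))
```

With these invented numbers the 12-hour queue ends up with no jobs at
all, which is the situation described above.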
Stephen