Testbed Support for GridPP member institutes
> Ian Stokes-Rees said:
> I understood David Groep (sp?) from NIKHEF wrote a caching replacement
> for this last year some time.
He also subsequently withdrew it, I think because it had problems with a
later version of Torque. There has been quite a lot of effort recently
to improve BDII performance, including a request to move the BDII off
the CE head node.
> In any case, I feel a bit like my original question was unanswered:
> isn't it *necessary* for a CE to be able to support 4000 simultaneous
> jobs (whether running or queued)? How do the large computing centres
> such as CERN, Bologna, FZK, and RAL handle this?
As Steve T says, the problem is that you get one job manager per user *per
RB*, so if a single user submits jobs through many RBs, as LHCb
sometimes do, you can get problems. Obviously that may also be true if
you have large numbers of concurrent users, but so far we haven't really
hit that!
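To put rough numbers on that, here is a minimal sketch of the counting argument only. It is not how the CE itself tracks anything: the function name, user name and RB names are all made up for illustration, and it assumes exactly one job manager per distinct (user, RB) pair as described above.

```python
def count_job_managers(jobs):
    """Count distinct (user, RB) pairs among (user, rb) job records.

    Assumption: the CE runs one job manager per user per RB, so the
    number of distinct pairs stands in for the number of job managers.
    """
    return len({(user, rb) for user, rb in jobs})


# Hypothetical workload: one user, 4000 jobs, all through a single RB.
jobs_one_rb = [("lhcb_user", "rb00")] * 4000

# Same user and job count, but spread across 20 hypothetical RBs.
jobs_many_rbs = [("lhcb_user", f"rb{i % 20:02d}") for i in range(4000)]

print(count_job_managers(jobs_one_rb))    # 1
print(count_job_managers(jobs_many_rbs))  # 20
```

So under that assumption the load on the head node scales with the number of (user, RB) combinations rather than with the raw number of jobs, which is why 4000 jobs from one user through one RB is a much easier case than the same jobs spread over many RBs.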
In the medium term we're moving away from the current CE architecture
anyway, see e.g.
http://grid.pd.infn.it/cream/field.php
Stephen