Maarten,
I think you would need one user id per queued job rather than per job
slot. This is because you cannot predict the order in which the jobs
will run, but the user has to be assigned at the time of submission
from the CE.
CERN's current figures are 2,500 running jobs with 19,000 queued.
Given that LHC is expected to increase these figures by at least a
factor of 3, we can assume 7,500 and 57,000 respectively.
The issue for the system administrators is that many programs and
libraries do not scale well with large password files. Basic library
calls such as getpwnam need to scan the contents of /etc/passwd to find
an entry. Each lookup is O(n) in the number of users.
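To make the cost concrete, here is a minimal Python sketch of a linear
scan compared with a prebuilt index. It assumes synthetic passwd-style
lines; the account names, uid range and function names are made up for
illustration and this is not how any particular libc implements getpwnam.

```python
# Synthetic passwd-style entries (hypothetical names and uids, not a real /etc/passwd).
def make_passwd_lines(n):
    return [
        f"pool{i:05d}:x:{10000 + i}:1000::/home/pool{i:05d}:/bin/sh"
        for i in range(n)
    ]


def lookup_by_scan(lines, name):
    """Linear scan over the file contents: O(n) work for every lookup."""
    for line in lines:
        fields = line.split(":")
        if fields[0] == name:
            return int(fields[2])
    return None


def build_index(lines):
    """One O(n) pass to build a name -> uid map; later lookups are O(1) on average."""
    index = {}
    for line in lines:
        fields = line.split(":")
        index[fields[0]] = int(fields[2])
    return index


lines = make_passwd_lines(57000)   # the projected queued-job figure above
index = build_index(lines)

uid_scan = lookup_by_scan(lines, "pool56999")   # walks all 57,000 entries
uid_index = index.get("pool56999")              # single hash lookup
```

Both lookups return the same uid; the difference is that the scan
touches every entry when the name is near the end of the file or absent.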
As an example, we recently worked with Platform Computing, who write
LSF, to optimise one of their programs for our password file (which is
currently at 15,000 users). One of the operations was taking 8 minutes
because it was doing an O(n**2) lookup on the users. They had not seen
this at other sites, so I assume CERN is already at the top end of the
scale. Rewriting the code brought the time down to 10 seconds, but it
took two weeks' work.
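The general shape of that kind of problem can be sketched in Python as
follows. This is only a generic illustration with made-up data and
hypothetical function names; it is not the LSF code and I am not
claiming this is what their program actually did.

```python
def count_jobs_quadratic(users, job_owners):
    """For every user, rescan the whole owner list.

    With roughly n users and n jobs this is O(n**2) comparisons.
    """
    return {
        user: sum(1 for owner in job_owners if owner == user)
        for user in users
    }


def count_jobs_linear(users, job_owners):
    """Single pass over the jobs using a dictionary keyed by user: O(n) overall."""
    counts = {user: 0 for user in users}
    for owner in job_owners:
        if owner in counts:
            counts[owner] += 1
    return counts


# Small synthetic data set (hypothetical account names).
users = [f"pool{i:04d}" for i in range(500)]
job_owners = [users[i % len(users)] for i in range(1500)]

slow = count_jobs_quadratic(users, job_owners)
fast = count_jobs_linear(users, job_owners)
```

Both functions return the same counts; only the amount of work differs,
and the gap widens quickly as the number of users grows.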
Tim
Maarten Litmaath, CERN wrote:
>Dear site admins,
>up to now LCG-2 releases have suggested that 50 pool accounts
>be created per VO. Recently a CMS RB at CERN has reached 50
>concurrent users, so we added 50 more accounts. In the coming
>months the usage for the LHC experiments is expected to rise
>quite a bit, so we would have to have, say, a few hundred
>accounts per VO.
>
>At the same time we are discussing a different pool account
>scheme for a future version of gLite, viz. leasing an account
>per job instead of per user. In that case we would need at
>least as many accounts as there are job slots on the farm.
>To avoid one VO taking all the accounts, the easiest would
>be to give each VO a complete set. We then would also end up
>with many hundreds of pool accounts, and thousands on sites
>that have large farms and/or support many VOs.
>
>What we would like to know is how far we can take these schemes?
>
>Say we propose that for the next LCG-2 release sites should
>configure 1000 pool accounts per VO: would any of you object,
>and why? Better ideas are welcome too.
>Thanks,
> Maarten
>
>