On Wed, Jun 24, 2009 at 19:40, Christopher J. Walker
<[log in to unmask]> wrote:
> Rob Fay wrote:
>>
>> Out of interest, what's the limiting factor(s) on the number of these jobs
>> running simultaneously at a particular site?
Pilot submission - contingent on them having jobs to pick up.
I'll toss some into ANALY_LIV and see if I can get the rates up for you.
>>
>> The Liverpool cluster has only been around half full over the last day,
>> and nothing is bottlenecked at this end (bandwidth, etc.), so I'm guessing
>> there's another limit set somewhere?
>
> I was slightly worried about the fact that our cluster wasn't (and indeed
> still isn't) full.
>
> Dan (ccd) increased the number of jobs running and at 17:39, we peaked at
> 510 jobs running, though it has now dropped back to 197.
>
> The load on ce01, our old CE, went up to around 18 - could that be causing a
> problem getting jobs into the cluster?
Well, as discussed on the other thread I am throwing a lot of pilots
at ce01 and ce03 now.
>
> If that is the case, it may be worth trying sending some jobs via our new
> CE, ce03. We did have problems with it though, and while it is now passing
> SAM tests, it is perhaps better not to mix up two tests at the same time.
>
> Chris
>
> PS We seem to have overtaken Glasgow :-).
Aggghh, damn you! Of course, we ran out of jobs so we need more work...
What's more interesting, of course, is the analysis throughput we each
get. I want to know if you get substantially better throughput than us
with your whizzy lustre vs. our humble rfcp to local disk.
Cheers
Graeme
--
Dr Graeme Stewart http://www.physics.gla.ac.uk/~graeme/
Department of Physics and Astronomy, University of Glasgow, Scotland
DEATH TO MEETINGS!