> -----Original Message-----
> From: Testbed Support for GridPP member institutes [mailto:TB-
>
> I just looked and the jobs are very poor in CPU efficiency (15-25%).
> Yes, the jobs were reading directly using rfio.
>
If someone could give us a quick idiots' guide to the contents of the
ganga robot status page I think that might help - some of the graphs
aren't as self-explanatory as others. In particular, what are the
axes on the 'CPU/Walltime' plots showing, and what are the slices on
the 'Site Efficiency' pies?
> Although the DPM servers were cruising - low load, excellent data
> output rates, the headnode was suffering very high CPU load. This is
> surprising as the headnode should only be contacted for the open step
> and it hands off to the disk server.
>
> Puzzling...
>
We're also seeing the srmv1 server getting a lot of load on the DPM head
node; 'top' is showing it hitting up to 150% CPU (I'm guessing it's
threaded?) but the machine's not getting into wait states at all - it's
either running or idle.

On one worker node that I was watching earlier that had eight of these
jobs and nothing else running, it was showing a load of ~5-6, so somewhat
but not hopelessly inefficient. At that point 'iftop' showed it pulling in
a fairly steady 400Mbit/s of data from across the DPM system, over a
single gigabit ethernet connection.
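
As a rough back-of-envelope check on those worker node figures, here is a small sketch. It assumes the 400Mbit/s is shared evenly between the eight jobs, that the link's nominal capacity is 1000Mbit/s, and that load divided by job count is a fair proxy for how busy the jobs are; none of that was measured, and the function names are just for illustration.

```python
# Back-of-envelope arithmetic for the worker node figures quoted above.
# Assumptions (not measured): traffic is split evenly across jobs, the
# link is a nominal 1000 Mbit/s, and load/jobs approximates busy fraction.

def per_job_rate_mbit(total_mbit_s, n_jobs):
    """Average network rate per job, assuming an even split."""
    return total_mbit_s / n_jobs


def link_utilisation(total_mbit_s, link_mbit_s=1000.0):
    """Fraction of the nominal link capacity in use."""
    return total_mbit_s / link_mbit_s


def busy_fraction(load, n_jobs):
    """Crude proxy: load average divided by the number of jobs."""
    return load / n_jobs


if __name__ == "__main__":
    total, jobs = 400.0, 8
    rate = per_job_rate_mbit(total, jobs)
    print(f"per job: {rate:.1f} Mbit/s ({rate / 8:.2f} MB/s)")
    print(f"link utilisation: {link_utilisation(total):.0%}")
    for load in (5.0, 6.0):
        print(f"load {load:.0f} of {jobs} jobs: {busy_fraction(load, jobs):.1%}")
```

On those assumptions each job is averaging about 50Mbit/s, and the link is well under half of its nominal capacity.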

We'll be interested to see how these tests run on our soon-to-be-commissioned
new kit, since it should have better networking for this sort of activity.

Ewan