Hi Kostas,
Quoting Kostas Georgiou <[log in to unmask]>:
> A google search for "dpm readahead2" returns a CHEP'07 paper saying:
> The default RFIO buffer size is 128kB, but this can be configured by
> setting RFIO IOBUFSIZE in /etc/shift.conf on the client.
> It's not clear (and I know next to nothing about dpm) if it is enabled
> by default or not though.
Eh, that will be my CHEP'07 paper then... I had forgotten that I had those
instructions in there. I'm fairly certain that read-ahead is not enabled by
default, but I am checking with the developers to make sure.
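For reference, the client-side setting mentioned in the quoted paper would look something like the line below (a sketch only -- I'm assuming the usual "CATEGORY KEY VALUE" layout of /etc/shift.conf, and the 131072 value is just an example of overriding the 128 kB default; check the paper/docs before deploying):

```
# /etc/shift.conf on the RFIO client (example value, site-dependent)
RFIO IOBUFSIZE 131072
```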
> The big question is how the atlas jobs are accessing the data, if it's
> not a serial IO pattern a bigger readahead buffer is only going to load
> your network/disk servers with no gain as you can imagine.
The analysis access pattern on disk will be random. The ATLAS (and LHCb)
data is stored in ROOT-based files, just like CMS's. Although the
application processes the physics events in a file sequentially, the
column-wise format that ROOT uses means that you end up jumping about the
file to gather all of the information for each event. This matches what
was reported in the link you sent round earlier.
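To see why sequential event processing still produces random I/O, here is a toy sketch (not real ROOT I/O -- the branch names, sizes, and pure column-wise layout are hypothetical simplifications) showing how reading every attribute of each event in turn touches widely separated byte offsets:

```python
# Toy model: event data stored column-wise (all values of one branch
# together), so reading one whole event means seeking between columns.

N_EVENTS = 1000
ATTR_SIZE = 8                      # bytes per stored value (hypothetical)
BRANCHES = ["pt", "eta", "phi"]    # hypothetical branch names

def offset(branch_index, event_index):
    # Column-wise layout: branch b occupies one contiguous region,
    # and event e's value sits ATTR_SIZE * e bytes into that region.
    return branch_index * N_EVENTS * ATTR_SIZE + event_index * ATTR_SIZE

# Byte offsets touched while processing event 0, then event 1:
reads = [offset(b, e) for e in range(2) for b in range(len(BRANCHES))]
print(reads)  # consecutive reads jump by N_EVENTS * ATTR_SIZE bytes
```

Even though the events are visited in order, the file offsets are not monotonic, which is why a large dumb read-ahead buffer mostly fetches data that is thrown away.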
As you say, in this case read-ahead doesn't buy you anything, but this
should be OK for DPM sites as it isn't enabled by default (I think).
> We are already overriding the dCache readahead for some time now for CMS
> which are more or less our only clients that use dcap heavily.
Do you use gsidcap or plain dcap? Plain dcap has the benefit of not
needing all of the expensive GSI machinery, which loads the servers, but
it obviously has its disadvantages as well.
Greig
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.