Duncan Rand wrote:
> Graeme Stewart wrote:
>> On Thu, Jun 4, 2009 at 20:40, Burke, S (Stephen)
>> <[log in to unmask]> wrote:
>>> Testbed Support for GridPP member institutes
>>>> [mailto:[log in to unmask]] On Behalf Of Christopher J.Walker
>>> said:
>>>> Do grid jobs (and specifically atlas jobs) use
>>>> lcg-cp to copy files to the worker node?
>>> It's a thing that jobs may do, yes ...
>>>
>>>> If so, presumably I could knock together a replacement lcg-cp script
>>>> that did
>>>>
>>>> #!/bin/pseudocode
>>>>
>>>> If (source is a SURL)
>>>> ln -s SURL destination
>>>>
>>>> else
>>>> call the real lcg-cp
>>> Maybe, but it sounds like a bit of a recipe for disaster - lcg-cp is
>>> quite a complex command so the parsing in your script would need to be
>>> very robust - e.g. "source is a SURL" is certainly not enough because it
>>> might point to a remote SE! Also a link isn't exactly the same as a
>>> local copy of a file.
>>
>> Likewise I really advise against this. However, there are enough
>> lustre/gpfs systems out there to now merit the development effort
>> (it's not much, for sure) to use file:/// access. But STEP09 is
>> happening and there's just no way we can get it done now. There will
>> be plenty of opportunity to stress and validate QMUL over the summer.
>>
>> I spoke to Dan about support for file:/// in ganga (it's the ganga
>> jobs which are using rfio, BTW, because they're expecting to find a
>> DPM) and he is not aware of this being used anywhere at the moment.
>
> rfio now seems to be working as a file stager (rfcp?) for one of the
> hammer cloud tests:
>
> HammerCloud v0.2: Scheduled Test 427 Summary
> Input Type: FILE_STAGER
>
> http://atlas-ganga-storage.cern.ch/test_427/
>
> The script is picking up rfio from the list of advertised protocols:
>
> LFC_HOST: lfc.gridpp.rl.ac.uk
> resolving physical locations of replicas
> {'se_path': '/atlas/atlasmcdisk/', 'token': 'ATLASMCDISK', 'se_host':
> 'se03.esc.qmul.ac.uk', 'endpt':
> 'srm://se03.esc.qmul.ac.uk:8444/srm/managerv2?SFN=/atlas/atlasmcdisk/'}
> resolving SE protocols with default BDII: lcg-bdii.gridpp.ac.uk:2170 |
> lcg-info --list-se --vo atlas --query SE='se03.esc.qmul.ac.uk' --attr
> Protocol --sed | ['se03.esc.qmul.ac.uk%gsiftp&root&file&rfio']
> detected transfer protocols: ['rfio', 'root', 'gsiftp', 'file']
> picked transfer protocol: rfio
>
> ...
>
> Py:Athena INFO leaving with code 0: "successful run"
>
> http://atlas-ganga-storage.cern.ch/test_427/gangadir/workspace/gangarbt/LocalXML/260/1/output/stdout.gz
>
>
> Performance isn't great (cpu efficiency: 35%, 3.4 Events/s) presumably
> because it goes via se03. As Chris says we have now removed rfio from
> the list of advertised protocols.
>
> [drand@lx07 ~]$ lcg-info --list-se --vo atlas --query
> SE='se03.esc.qmul.ac.uk' --attr Protocol
> - SE: se03.esc.qmul.ac.uk
> - Protocol gsiftp
> file
>
> Let's see whether the file protocol works and if so any better.
>
It should work considerably better. But isn't Graeme saying they won't
use it for STEP09?
Or is my confusion that there are two ways it could be used:
a) Use it to copy a file to the local hard disk
b) Actually open the file on the server
and that it is only the second that isn't currently supported?
I would expect both to give enormous speedups compared with copying data
via se03.
In tests on 60 of our machines I got an aggregate bandwidth close to
maxing out the 4 x 10 Gbit links to those machines, which is much better
than the 1 Gbit link to se03.
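
For reference, the protocol choice shown in Duncan's log can be sketched
roughly as below. This is only an illustration, not the actual ganga code:
the function names are made up, and the preference order is an assumption
taken from the order of the "detected transfer protocols" line in the log.

```python
def parse_sed_line(line):
    """Split one `lcg-info --sed` record of the form 'host%proto1&proto2&...'."""
    host, _, protos = line.partition("%")
    return host, [p for p in protos.split("&") if p]


def pick_protocol(advertised, preference):
    """Return the first protocol in `preference` that the SE advertises, else None."""
    for proto in preference:
        if proto in advertised:
            return proto
    return None


# ASSUMPTION: this order simply mirrors the "detected transfer protocols"
# line in the log; the real ordering used by the job scripts may differ.
PREFERENCE = ("rfio", "root", "gsiftp", "file")

# Record as it appeared in the log while rfio was still advertised.
host, protocols = parse_sed_line("se03.esc.qmul.ac.uk%gsiftp&root&file&rfio")
print(host, pick_protocol(protocols, PREFERENCE))

# Protocols advertised after rfio was removed.
print(pick_protocol(["gsiftp", "file"], PREFERENCE))
```

If the preference order really were as assumed here, the second call would
return gsiftp rather than file, so it is worth checking which protocol the
next test reports as picked.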
Chris