Hi Martin,

I didn't want to spam everyone with strace output, so I have sent you the traces in a direct email.
The top-level BDII we use is the RAL one in the UK: lcg-bdii.gridpp.ac.uk

Cheers,

Dug

On 22 March 2010 21:51, <[log in to unmask]> wrote:
Hi Dug,

> We have had some issues with file transfers from our WNs back to remote SEs, so I have been
> carrying out some testing to try to replicate the issue.
> I am now seeing strange results from local and remote grid sites with various SE SRM
> implementations:
>
> The test is a simple simultaneous lcg-cp of a 10M file from a varying number of WNs.
>
> Summary:
> lcg-cp transfers from Glasgow to DPM sites, local and remote, are 100% successful.
> lcg-cp transfers from Glasgow to CASTOR/StoRM sites are 50% successful.
> lcg-cp transfers from Glasgow to dCache sites are 20-50% successful.
>
> Our WNs have lcg_util-1.7.6-1.sl5 installed.
>
> We initially thought this was network related, but now I am not so sure, and I think it definitely
> has something to do with the SRM implementation being used.  Does anyone know of any current
> issues with lcg-cp and the various SRM implementations, or has anyone seen anything similar?

Can you repeat the tests like this:

   strace -o /tmp/trace-$i lcg-cp .....

where $i should be unique per file.  Then we will be able to see which
connection attempts led to timeouts.
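
For what it is worth, the per-run tracing above could be scripted roughly as follows. This is only a sketch: the helper names trace_file and run_traced and the TRACE_DIR variable are made up for illustration, the -f and -tt options are additions to the command above (follow child processes, add timestamps), and the transfer command with its arguments is passed in by the caller rather than spelled out here.

```shell
#!/bin/sh
# Sketch only: trace_file, run_traced and TRACE_DIR are hypothetical names.

# Print a trace path that is unique per node and per run index.
trace_file() {
    printf '%s/trace-%s-%s\n' "${TRACE_DIR:-/tmp}" "$(uname -n)" "$1"
}

# run_traced N CMD [ARGS...]
# Start N simultaneous copies of CMD, each under its own strace,
# then wait for all of them to finish.
run_traced() {
    n=$1
    shift
    i=1
    while [ "$i" -le "$n" ]; do
        # -o: write the trace to a file; -f and -tt are assumed extras.
        strace -f -tt -o "$(trace_file "$i")" "$@" &
        i=$((i + 1))
    done
    wait
}
```

After a run, something like `grep -l ETIMEDOUT /tmp/trace-*` should list the traces in which a system call timed out, since strace prints the errno name next to the failing call.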

Which top BDII are you using?



--
ScotGrid, Room 481, Kelvin Building, University of Glasgow
tel: +44(0)141 330 6439