Heya guys,
One of our users has had trouble with his transfers to Lancaster. They
seem to fail often, and successful transfers to Lancaster are much,
much slower than transfers from Lancaster. A preliminary look at
the network doesn't show any obvious problems. Two things I did notice
however are that a) The DPM headnode is under high (but not
super-heavy) load, and b) The pool the user is trying to copy onto is
currently down to 170GB of free space.
Looking in the logs shows no obvious signs of failure around the
timestamp of the failed user transfer, just a lot of SRM_SUCCESSs.
The user is using lcg-cp and srmv2 but no specified space tokens, and
the failures have the form:
globus_xio: System error in read: Connection reset by peer
globus_xio: A system call failed: Connection reset by peer
I'm a little stumped by the lack of glaring errors in the DPM logs. My
first instinct is that this is a pool/space problem, but whilst I'm
looking into that (by giving the VO more space) I thought I'd hand
this over to you guys for your valued advice.
cheers,
Matt