Hi,
I am trying to debug a problem that Willem van Leeuwen intermittently
sees: edg-rm cp fails with a 425 error, "no route to host". It fails,
for example, when he runs it from a WN at SARA, but it works fine from
WNs at some other sites (e.g. Wuppertal). I have confirmed this behavior
in test jobs.
I found the following information:
https://savannah.cern.ch/bugs/?func=detailitem&item_id=2934
and
http://goc.grid.sinica.edu.tw/gocwiki/425_425_Can't_open_data_connection%2e
Both seem to indicate that a firewall on the WN side is the problem. If
the WN is firewalled (I assume a NAT system has the same effect), there
are several possible workarounds. One of them is to set the number of
streams to 1 (one) for both large and small files.
Funny thing is, both NIKHEF and SARA have the number of streams for
large-file transfers set to 3, so we (NIKHEF) should see this error too.
Yet the SFT is not picking it up. I think it should, and I suspect the
reason it does not is that the SFT only transfers small files, not
large ones.
Can we have a test added to the SFT for this behavior? And Willem, see
the savannah link for a possible solution: use gbf for the third-party
replication, and then edg-rm cp to get the file again. I hope this will
work. The alternative is to set the number of streams to one on the
command line; I hope this is still possible in LCG 2.3.1.
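
For the SFT test requested above, a minimal sketch in Python of what
the check might look like. The names shows_425_error and check_transfer
are hypothetical, and the transfer itself is left to a caller-supplied
function, because the exact edg-rm cp options are not assumed here. The
idea is only to run one small-file and one large-file transfer and flag
any 425 reply in the output.

```python
import re

# Matches the FTP reply code 425 as a standalone number, so that
# byte counts such as "4250" are not mistaken for the error.
_ERR_425 = re.compile(r"\b425\b")


def shows_425_error(output: str) -> bool:
    """Return True if any line of the transfer output contains a 425 code."""
    return any(_ERR_425.search(line) for line in output.splitlines())


def check_transfer(run_transfer, size_label: str) -> str:
    """Run one transfer and summarise the result.

    run_transfer is a hypothetical caller-supplied callable that performs
    the copy (e.g. by invoking edg-rm cp) and returns (exit_code, output).
    """
    exit_code, output = run_transfer()
    if shows_425_error(output):
        return f"{size_label}: FAIL (425, data connection blocked?)"
    if exit_code != 0:
        return f"{size_label}: FAIL (exit code {exit_code})"
    return f"{size_label}: OK"
```

Calling check_transfer once with a small file and once with a file big
enough to trigger the large-file stream setting would show whether only
the multi-stream case fails.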
J "happy easter" T