On 8 April 2011 09:49, Christopher J. Walker <[log in to unmask]> wrote:
> On 08/04/11 09:24, [log in to unmask] wrote:
>>
>> 40s of overhead per file sounds standard to me. You will be
>> glad/disheartened
>> to here it applies to all sites. There may be a case that transfers
>> between
>> different types of SE are quicker, but needs to be investigated.
>
> Well if that is the case, eliminating that 40s there's an easy way to get s
>
>> There is a
>> plan to use a feature in FTS which allows you to overcommit parts of the
>> FTS
>> cycle.
>
> I don't think one needs overcommit, one just needs to avoid the 40s that the
> FTS spends thinking between the srm call and actually trying to start
> gridftp.
>
A lot of this is also asynchronous callbacks within SRM. (For example,
the get requests sent to both the source and destination are serviced
almost instantly by the servers there, but the callback to check that
they've serviced them actually waits for several seconds. This is, of
course, the justification for Brian/Jens's synchronous SRM stuff.)
Sam
>> Since you only have a single gridftp server, doe se this mean all
>> connections are loaded serially and hence if a file transfer takes less
>> than 40 seconds you actually end up with dead time?
>
> Looking at the numbers I quoted below, on a transfer that takes something
> like 40s, I end up with 40s of dead time (so 80s total time). There's
> presumably only a single fts channel for cms from QMUL - hence the
> serialisation (which is much easier to spot on one gridftp machine).
>
> Chris
>
>
>>
>> Brian
>>
>> -----Original Message-----
>> From: GRIDPP2: Deployment and support of SRM and local storage management
>> [mailto:[log in to unmask]] On Behalf Of Christopher J. Walker
>> Sent: 07 April 2011 22:05
>> To: [log in to unmask]
>> Subject: Small files and fts
>>
>> I've just had a look at the fts graph for QMUL.
>>
>> Previously, I thought that gridftp pipelining was the answer to speeding
>> small file transfers up. However, it looks like we may be able to get an
>> appreciable speedup even without it.
>>
>> I've attached graphs showing the transfer rate at about 30MB/s and the
>> fact that there was a small queue of files ready to transfer.
>>
>> grepping for fts in the gridftp-session.log, I get:
>>
>> grep fts gridftp-session.log.1
>>>
>>> [3593] Wed Apr 6 06:13:08 2011 :: New connection from:
>>> lcgfts01.gridpp.rl.ac.uk:43002 [3593] Wed Apr 6 06:13:40 2011 ::
>>> Closed connection from lcgfts01.gridpp.rl.ac.uk:43002 [4502] Wed Apr
>>> 6 06:14:21 2011 :: New connection from: lcgfts01.gridpp.rl.ac.uk:44839
>>> [4502] Wed Apr 6 06:14:47 2011 :: Closed connection from
>>> lcgfts01.gridpp.rl.ac.uk:44839 [4547] Wed Apr 6 06:15:29 2011 :: New
>>> connection from: lcgfts01.gridpp.rl.ac.uk:46410 [4547] Wed Apr 6
>>> 06:15:52 2011 :: Closed connection from lcgfts01.gridpp.rl.ac.uk:46410
>>> [5488] Wed Apr 6 06:16:33 2011 :: New connection from:
>>> lcgfts01.gridpp.rl.ac.uk:58170 [5488] Wed Apr 6 06:16:58 2011 ::
>>> Closed connection from lcgfts01.gridpp.rl.ac.uk:58170 [5519] Wed Apr
>>> 6 06:17:39 2011 :: New connection from: lcgfts01.gridpp.rl.ac.uk:59548
>>> [5519] Wed Apr 6 06:18:11 2011 :: Closed connection from
>>> lcgfts01.gridpp.rl.ac.uk:59548 [6408] Wed Apr 6 06:18:52 2011 :: New
>>> connection from: lcgfts01.gridpp.rl.ac.uk:60688 [6408] Wed Apr 6
>>> 06:19:16 2011 :: Closed connection from lcgfts01.gridpp.rl.ac.uk:60688
>>> [6441] Wed Apr 6 06:19:57 2011 :: New connection from:
>>> lcgfts01.gridpp.rl.ac.uk:33426 [6441] Wed Apr 6 06:20:27 2011 ::
>>> Closed connection from lcgfts01.gridpp.rl.ac.uk:33426 [7328] Wed Apr
>>> 6 06:21:08 2011 :: New connection from: lcgfts01.gridpp.rl.ac.uk:54153
>>> [7328] Wed Apr 6 06:21:34 2011 :: Closed connection from
>>> lcgfts01.gridpp.rl.ac.uk:54153 [8206] Wed Apr 6 06:22:15 2011 :: New
>>> connection from: lcgfts01.gridpp.rl.ac.uk:56067 [8206] Wed Apr 6
>>> 06:22:41 2011 :: Closed connection from lcgfts01.gridpp.rl.ac.uk:56067
>>> [8237] Wed Apr 6 06:23:22 2011 :: New connection from:
>>> lcgfts01.gridpp.rl.ac.uk:57762 [8237] Wed Apr 6 06:23:52 2011 ::
>>> Closed connection from lcgfts01.gridpp.rl.ac.uk:57762 [9103] Wed Apr
>>> 6 06:24:31 2011 :: New connection from: lcgfts01.gridpp.rl.ac.uk:59344
>>> [9103] Wed Apr 6 06:25:09 2011 :: Closed connection from
>>> lcgfts01.gridpp.rl.ac.uk:59344 [9132] Wed Apr 6 06:25:49 2011 :: New
>>> connection from: lcgfts01.gridpp.rl.ac.uk:60820 [9132] Wed Apr 6
>>> 06:26:14 2011 :: Closed connection from lcgfts01.gridpp.rl.ac.uk:60820
>>> [10082] Wed Apr 6 06:26:55 2011 :: New connection from:
>>> lcgfts01.gridpp.rl.ac.uk:55347 [10082] Wed Apr 6 06:27:17 2011 ::
>>> Closed connection from lcgfts01.gridpp.rl.ac.uk:55347 [10107] Wed Apr
>>> 6 06:28:00 2011 :: New connection from: lcgfts01.gridpp.rl.ac.uk:56105
>>> [10107] Wed Apr 6 06:28:24 2011 :: Closed connection from
>>> lcgfts01.gridpp.rl.ac.uk:56105 [10998] Wed Apr 6 06:29:06 2011 :: New
>>> connection from: lcgfts01.gridpp.rl.ac.uk:56761 [10998] Wed Apr 6
>>> 06:29:31 2011 :: Closed connection from lcgfts01.gridpp.rl.ac.uk:56761
>>> [11026] Wed Apr 6 06:30:12 2011 :: New connection from:
>>> lcgfts01.gridpp.rl.ac.uk:58265 [11026] Wed Apr 6 06:30:37 2011 ::
>>> Closed connection from lcgfts01.gridpp.rl.ac.uk:58265 [11947] Wed Apr
>>> 6 06:31:18 2011 :: New connection from: lcgfts01.gridpp.rl.ac.uk:57632
>>> [11947] Wed Apr 6 06:32:20 2011 :: Closed connection from
>>> lcgfts01.gridpp.rl.ac.uk:57632 [12828] Wed Apr 6 06:33:01 2011 :: New
>>> connection from: lcgfts01.gridpp.rl.ac.uk:60042 [12828] Wed Apr 6
>>> 06:33:24 2011 :: Closed connection from lcgfts01.gridpp.rl.ac.uk:60042
>>> [12860] Wed Apr 6 06:34:07 2011 :: New connection from:
>>> lcgfts01.gridpp.rl.ac.uk:33401 [12860] Wed Apr 6 06:34:32 2011 ::
>>> Closed connection from lcgfts01.gridpp.rl.ac.uk:33401 [13730] Wed Apr
>>> 6 06:35:12 2011 :: New connection from: lcgfts01.gridpp.rl.ac.uk:34569
>>> [13730] Wed Apr 6 06:36:17 2011 :: Closed connection from
>>> lcgfts01.gridpp.rl.ac.uk:34569 [14607] Wed Apr 6 06:36:57 2011 :: New
>>> connection from: lcgfts01.gridpp.rl.ac.uk:34013 [14607] Wed Apr 6
>>> 06:37:24 2011 :: Closed connection from lcgfts01.gridpp.rl.ac.uk:34013
>>> [14644] Wed Apr 6 06:38:07 2011 :: New connection from:
>>> lcgfts01.gridpp.rl.ac.uk:35836 [14644] Wed Apr 6 06:38:32 2011 ::
>>> Closed connection from lcgfts01.gridpp.rl.ac.uk:35836 [15519] Wed Apr
>>> 6 06:39:18 2011 :: New connection from: lcgfts01.gridpp.rl.ac.uk:37533
>>> [15519] Wed Apr 6 06:39:43 2011 :: Closed connection from
>>> lcgfts01.gridpp.rl.ac.uk:37533 [16245] Wed Apr 6 06:40:25 2011 :: New
>>> connection from: lcgfts01.gridpp.rl.ac.uk:38635 [16245] Wed Apr 6
>>> 06:40:50 2011 :: Closed connection from lcgfts01.gridpp.rl.ac.uk:38635
>>> [16417] Wed Apr 6 06:41:34 2011 :: New connection from:
>>> lcgfts01.gridpp.rl.ac.uk:39256 [16417] Wed Apr 6 06:41:59 2011 ::
>>> Closed connection from lcgfts01.gridpp.rl.ac.uk:39256 [17289] Wed Apr
>>> 6 06:42:41 2011 :: New connection from: lcgfts01.gridpp.rl.ac.uk:39976
>>> [17289] Wed Apr 6 06:43:05 2011 :: Closed connection from
>>> lcgfts01.gridpp.rl.ac.uk:39976 [17312] Wed Apr 6 06:43:45 2011 :: New
>>> connection from: lcgfts01.gridpp.rl.ac.uk:40657 [17312] Wed Apr 6
>>> 06:44:09 2011 :: Closed connection from lcgfts01.gridpp.rl.ac.uk:40657
>>> [18181] Wed Apr 6 06:44:51 2011 :: New connection from:
>>> lcgfts01.gridpp.rl.ac.uk:41690 [18181] Wed Apr 6 06:45:13 2011 ::
>>> Closed connection from lcgfts01.gridpp.rl.ac.uk:41690 [18211] Wed Apr
>>> 6 06:45:58 2011 :: New connection from: lcgfts01.gridpp.rl.ac.uk:43328
>>> [18211] Wed Apr 6 06:46:26 2011 :: Closed connection from
>>> lcgfts01.gridpp.rl.ac.uk:43328 [19094] Wed Apr 6 06:47:09 2011 :: New
>>> connection from: lcgfts01.gridpp.rl.ac.uk:43963 [19094] Wed Apr 6
>>> 06:47:45 2011 :: Closed connection from lcgfts01.gridpp.rl.ac.uk:43963
>>> [19117] Wed Apr 6 06:48:26 2011 :: New connection from:
>>> lcgfts01.gridpp.rl.ac.uk:46069 [19117] Wed Apr 6 06:48:52 2011 ::
>>> Closed connection from lcgfts01.gridpp.rl.ac.uk:46069 [19991] Wed Apr
>>> 6 06:49:34 2011 :: New connection from: lcgfts01.gridpp.rl.ac.uk:47666
>>> [19991] Wed Apr 6 06:49:56 2011 :: Closed connection from
>>> lcgfts01.gridpp.rl.ac.uk:47666 [20885] Wed Apr 6 06:50:38 2011 :: New
>>> connection from: lcgfts01.gridpp.rl.ac.uk:49219 [20885] Wed Apr 6
>>> 06:51:04 2011 :: Closed connection from lcgfts01.gridpp.rl.ac.uk:49219
>>> [20912] Wed Apr 6 06:51:48 2011 :: New connection from:
>>> lcgfts01.gridpp.rl.ac.uk:39175 [20912] Wed Apr 6 06:52:10 2011 ::
>>> Closed connection from lcgfts01.gridpp.rl.ac.uk:39175 [21775] Wed Apr
>>> 6 06:52:53 2011 :: New connection from: lcgfts01.gridpp.rl.ac.uk:40474
>>> [21775] Wed Apr 6 06:53:18 2011 :: Closed connection from
>>> lcgfts01.gridpp.rl.ac.uk:40474 [21813] Wed Apr 6 06:54:01 2011 :: New
>>> connection from: lcgfts01.gridpp.rl.ac.uk:41845 [21813] Wed Apr 6
>>> 06:54:24 2011 :: Closed connection from lcgfts01.gridpp.rl.ac.uk:41845
>>
>> So there is a huge delay - about 40s between closing one connection and
>> starting the next.
>>
>> Looking at one file in particular, I see from the srm logs (and omitting
>> the details of the exact file transferred):
>>
>> 06:13:45.202 - INFO [XML-RPC-20] - srmLs:
>> 06:13:45.202 - INFO [XML-RPC-20] - srmLs:
>> 06:13:45.550 - INFO [pool-2-thread-7] - Handling PtG chunk for user DN:
>> 06:13:45.626 - INFO [pool-2-thread-7] - Finished handling PtG chunk for
>> user DN:
>> 06:14:47.270 - INFO [XML-RPC-6] - srmReleaseFiles:
>> 06:14:47.270 - INFO [XML-RPC-6] - srmReleaseFiles:
>>
>>
>> And from the gridftp-session.log for the same file, I see:
>>
>> [4502] Wed Apr 6 06:14:21 2011 :: New connection from:
>> lcgfts01.gridpp.rl.ac.uk:44839
>> [4502] Wed Apr 6 06:14:22 2011 :: Starting to transfer
>> [4502] Wed Apr 6 06:14:46 2011 :: Finished transferring
>>
>>
>> So there's a delay of 30s between doing the PtG and actually opening the
>> gridftp connection.
>>
>> Do other sites see this?
>>
>> Is this apparent in the fts logs?
>>
>> Is there anything that can be done to avoid this delay in FTS?
>>
>> Chris
>
|