Hi,
I had to face the same problem. For stage-in, I solved it with the
attached script, and it seems to be working. You can just replace the
"python dqlcg.py" command with the one you want to monitor.
Note that I don't need to estimate the time needed for the transfer,
which can depend on several uncontrolled variables. I monitor the
incoming file and kill the transfer command if it doesn't grow for a
given number of seconds.
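
For readers without the attachment, the idea can be sketched roughly as
follows. This is not the attached script, only a minimal Python
illustration of the same approach: the function name
run_with_stall_watchdog and its parameters are hypothetical, and the
default intervals are arbitrary.

```python
import os
import subprocess
import time


def run_with_stall_watchdog(cmd, path, stall_seconds=300, poll_seconds=5):
    """Run cmd and kill it if the file at path stops growing.

    Hypothetical helper, not the attached script. Returns the exit code
    of cmd, or None if it was killed because the file did not grow for
    more than stall_seconds.
    """
    proc = subprocess.Popen(cmd)
    last_size = -1                      # -1 means "file not seen yet"
    last_growth = time.monotonic()      # time of the last observed growth

    while proc.poll() is None:          # transfer command still running
        try:
            size = os.path.getsize(path)
        except OSError:
            size = -1                   # file does not exist (yet)

        if size > last_size:
            # The file grew: remember the new size and reset the timer.
            last_size = size
            last_growth = time.monotonic()
        elif time.monotonic() - last_growth > stall_seconds:
            # No growth for too long: assume the transfer is blocked.
            proc.kill()
            proc.wait()
            return None

        time.sleep(poll_seconds)

    return proc.returncode
```

It would be called with the transfer command as an argument list and the
path of the file being written, for example
run_with_stall_watchdog(["python", "dqlcg.py"], incoming_path, stall_seconds=600),
where incoming_path is a placeholder for wherever the transfer writes its
output. Note that kill() only stops the direct child process; a command
that spawns its own children may need extra handling.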
Cheers,
David
david bouvet wrote:
> Hi Ahmed,
>
> Ahmed Beriache wrote:
>
>> Hi all.
>>
>> We have some jobs running at CGG-LCG2 from the VO biomed. Each one must
>> download about 300 MBytes of data from SEs and then runs for a few minutes
>> (between 5 and 10 minutes). With several simultaneous downloads we
>> obtain a very low data transfer rate and the CPU doesn't do anything
>> for a long time (more than 20 hours). I noticed today on Worker Nodes
>> that the process lcg-cp is running but the size of downloaded files is
>> not growing at all.
>> Does this mean that the lcg-cp process is blocked ?
>
> Yes, you're right. At IN2P3-CC we have encountered the same thing. After
> asking the user, the jobs have been canceled by the user or by us.
>
>> Is there a timeout for lcg-cp after which it stops downloading data from
>> the grid ?
>
> No, but the user can put a time out on that command for the stage-in
> period of files, as he knows the size of these stage-in files.
> For the stage-out, it can be more complicated.
>
>> Do you think it is a good thing to allocate the CPU for a job even if
>> the data will be in the node 20 hours later ?
>
> Not really.
>
> Cheers,
> David.
>
>>
>> Thanks in advance for your help
>>
>> Regards
>>
>> --
>> -----------------------------------------------------------------------
>> Ahmed Beriache phone: +33 01 64 47 35 18
>> (direct)
>> Compagnie Generale de Geophysique (CGG) +33 01 64 47 30 00
>> 1, rue Leon Migaux 91341 Massy fax: +33 01 64 47 30 98
>> web site: http://www.cgg.com e-mail: [log in to unmask]
>> -----------------------------------------------------------------------
>>
>
> --
> *David BOUVET*
> /Applications Support Coordinator - EGEE Project team/
> IN2P3/CNRS Computing Centre - Lyon (FRANCE)
> http://grid.in2p3.fr
> Tel. : +33 4 72 69 41 62 | Fax. : +33 4 72 69 41 70 | e-mail :
> [log in to unmask] <mailto:[log in to unmask]>
>
--
David Rebatto
I.N.F.N. - Sezione di Milano
Via Celoria, 16 - 20133 Milano ITALY
tel: +39 02503.17623 e-mail: [log in to unmask]
URL: http://www.mi.infn.it/~rebatto
"There are 10 kinds of people in the world:
those who understand binary and those who don't..."