Phil Roffe wrote:
> I ticketed the VO/user to inform him of our SE bandwidth issues and got
> a quick response.
> https://gus.fzk.de/ws/ticket_info.php?ticket=43489
>
> The user created more replicas... which did solve our initial problem,
> however it just reappeared when he next ran a set of jobs (and affected
> more sites from the look of it).
>
> As for efficiency I think we should be encouraging users to look at this
> and improve it - particularly if it's because it's waiting for transfers
> from a non-local SE. The scale of this user's jobs suggests he should
> be prepared to invest time in improving efficiency and file transfers.
>
I contacted the biomed VO themselves to inform them of the generally low
efficiency of their jobs which do this kind of remote copy, and they
seemed surprised that this behaviour led to such low efficiency.
I also informed them of the issues at Edinburgh, and they did apologise,
but seemed to think that it wasn't something they could fix
themselves (that it was, ultimately, a middleware issue).
Sam
> Phil
>
>
> Davies, BGE (Brian) wrote:
>> Has anyone actually informed the user/VO that his jobs are causing
>> problems? The low efficiency is probably caused by the fact that the
>> job is idle while it waits for input files. Ignoring the file
>> transfer issue, since biomed are also a smaller VO, and therefore their
>> jobs basically run when our main VOs are not (they have low fairshare),
>> does it matter if their jobs have low efficiency?
>> Brian
>>
>> -----Original Message-----
>> From: Testbed Support for GridPP member institutes
>> [mailto:[log in to unmask]] On Behalf Of Alex Martin
>> Sent: 24 November 2008 11:04
>> To: [log in to unmask]
>> Subject: Re: Biomed data transfer volumes & usage issue
>>
>> I should add that this same user has submitted ~11K jobs to our HTC in
>> the last 2 weeks with a cpu/wall clock efficiency
>> of ~4.6K hours / 146K hours =~ 3%:
>>
>> Username   njob   %   wall   user  system   cpu
>> biomed032  7288  20  90949   1466     224  2907
>> biomed032  2741  16  56169   1310     100  1751
>>
>> cheers,
>> Alex
>>
>> On Monday 24 November 2008, Alex Martin wrote:
>>
>>> A biomed user managed to start ~1000 gridftp processes on our
>>> old SE node here last week.
>>>
>>> cheers,
>>> Alex
>>>
>>> On Monday 24 November 2008, Coles, J (Jeremy) wrote:
>>>
>>>> Dear All
>>>>
>>>> In the site reports for last week Durham report:
>>>>
>>>> "A biomed user has been transferring huge amounts of data from our
>>>> SE (>500 requests of the same 2.8GB file to a variety of worker nodes
>>>> across Europe). Unfortunately the high bandwidth has revealed
>>>> instabilities when transferring at close to the gigabit limit. I
>>>> ticketed the user and they have distributed more replicas - but they
>>>> are not following the grid data-to-cpu model and therefore will cause
>>>> severe bandwidth issues to all sites."
>>>>
>>>> Has any other site seen such a seeding exercise taking place or
>>>> anything related?
>>>>
>>>> Thanks,
>>>>
>>>> Jeremy
>>>>
>>
>>
>> --
>> Dr. Alex Martin
>> Queen Mary, University of London
>> Mile End Road, London, UK E1 4NS
>> e-Mail: [log in to unmask]
>> Phone : +44-(0)20-7882-5033
>> Fax   : +44-(0)20-8981-9465
>>
>