On 18 May 2007, at 10:42, Gordon, JC (John) wrote:
> Note it is not the full SAM tests that are being run in the job wrapper,
> just a subset. The aim being that these tests are run as every VO so
> they find instances where something works as ops but not as atlas.

But surely that's why we have VO specific SAM tests? It's madness to  
test things in every job wrapper!

>
> When we raised this issue originally Piotr published details via the
> weekly operations meeting of how to switch off the tests until the
> problem was addressed. It wasn't planned that these tests took 15 mins
> and my recollection is that this was fixed within days.

Only by disabling it - which everyone still has to do by hand.

Do current versions of YAIM still set these tests up?


On 18 May 2007, at 10:44, Gordon, JC (John) wrote:

> I'm planning to have monitoring of all sorts as a major item at June's
> GDB so one of you should be prepared to raise the issue of wrapper
> tests.

Anything that adds more than 10s to a job's wallclock time will be  
unacceptable.
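
To make that limit concrete, here is a rough sketch of how a wrapper could cap the wallclock spent on any monitoring step. It is not how the current job wrapper works: the function name run_with_budget and the BUDGET_SECONDS constant are made up for illustration, and the 10 s figure is simply the ceiling suggested above.

```python
import subprocess
import time

BUDGET_SECONDS = 10.0  # assumption: the 10 s ceiling suggested above


def run_with_budget(cmd, budget=BUDGET_SECONDS):
    """Run cmd, giving up once it has used `budget` seconds of wallclock.

    Returns (status, elapsed): status is the exit code, or None if the
    command was killed for exceeding the budget.
    """
    start = time.monotonic()
    try:
        status = subprocess.run(cmd, timeout=budget).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        status = None
    return status, time.monotonic() - start
```

A wrapper built this way would log a status of None as "test skipped, over budget" and carry on with the user's payload, so a slow monitoring service could cost a job at most the budget rather than several minutes.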

Cheers

Graeme



>
> John
>
>> -----Original Message-----
>> From: Testbed Support for GridPP member institutes
>> [mailto:[log in to unmask]] On Behalf Of Graeme Stewart
>> Sent: 18 May 2007 10:02
>> To: [log in to unmask]
>> Subject: Re: SAME tests run in job wrapper
>>
>> Hi Steve
>>
>> The new WAR file has improved the CLOSE_WAIT a bit, but it's
>> not entirely gone:
>>
>> svr019:~# netstat -t | grep rgma | wc -l
>>      168
>> svr019:~# netstat -t | grep rgma | grep CLOSE_WAIT | wc -l
>>       39
>>
>> And in addition, the problem was monitoring in the job
>> wrapper adding unnecessary wallclock to jobs. I don't see how
>> this will be dramatically improved, even if CLOSE_WAIT goes
>> away entirely.
>>
>> I hope this is not being re-introduced!
>>
>> Cheers
>>
>> Graeme
>>
>> On 18 May 2007, at 09:52, Fisher, SM (Steve) wrote:
>>
>>> Stephen,
>>>
>>> Please note that patch 1144 is on the PPS and will be available on the
>>> PS very shortly. It is already deployed on the CERN MON box and solves
>>> the problem.
>>>
>>> Steve
>>>
>>>> -----Original Message-----
>>>> From: Testbed Support for GridPP member institutes
>>>> [mailto:[log in to unmask]] On Behalf Of Stephen Childs
>>>> Sent: 18 May 2007 09:40
>>>> To: [log in to unmask]
>>>> Subject: Re: SAME tests run in job wrapper
>>>>
>>>> Graeme Stewart wrote:
>>>>> Yes, I've noticed this at Glasgow - and in particular R-GMA is adding
>>>>> 5-15 minutes of wallclock time to every job, which is a terrible waste
>>>>> of resources (particularly for our GRAM GT2 user groups, who run some
>>>>> pretty short jobs).
>>>>>
>>>>> Something I really need to look into...
>>>>
>>>> I've just remembered about this (as I'm trying to do simple tests
>>>> using globus-job-run which aren't simple any more). Graeme, did you
>>>> ever find out where this was being done in order to disable it?
>>>>
>>>> Jeremy, could you comment on when this was introduced and whether
>>>> there was input from sites on it?
>>>>
>>>> Stephen
>>>>
>>>> --
>>>> Dr. Stephen Childs,
>>>> Research Fellow, EGEE Project,    phone: +353-1-8961797
>>>> Computer Architecture Group,      email: Stephen.Childs @ cs.tcd.ie
>>>> Trinity College Dublin, Ireland   web:   http://www.cs.tcd.ie/Stephen.Childs
>>>>
>>
>> --
>> Dr Graeme Stewart - http://wiki.gridpp.ac.uk/wiki/User:Graeme_stewart
>> ScotGrid - http://www.scotgrid.ac.uk/ http://scotgrid.blogspot.com/
>>

--
Dr Graeme Stewart - http://wiki.gridpp.ac.uk/wiki/User:Graeme_stewart
ScotGrid - http://www.scotgrid.ac.uk/ http://scotgrid.blogspot.com/