Well, in the last official cream-sge version I have to restart the
blah parser every 8 h. Though this is fixed in the next one, if it
ever sees the light of day ....
In the previous (next-to-last) release I have to restart tomcat every
1-2 weeks, but I think that's a known tomcat issue, not cream.
Cheers,
Daniela
On 22 September 2011 07:49, John Gordon <[log in to unmask]> wrote:
> Marvin, the situation in other countries may be irrelevant but if it is similar it makes escalating this a lot easier. If it is just the UK then there will be pressure on us (you:-) ) to help debug.
>
> John
>
> Just because you are paranoid, it doesn't mean, etc.
>
> -----Original Message-----
> From: Alessandra Forti [mailto:[log in to unmask]]
> Sent: 21 September 2011 17:55
> To: Testbed Support for GridPP member institutes
> Cc: Gordon, John (STFC,RAL,ESC)
> Subject: Re: CREAM problems (not just SGE)
>
> How about: most UK sites have to restart the cream services with a
> frequency that goes from once a week to once every month/quarter (?)
> depending on the size of the cluster and its load. Nobody wants to debug
> it because it's not easy to pinpoint when exactly the services start
> hanging and because it requires an intimate knowledge of the inner
> workings of cream that nobody has (or wants to have).
>
> Whether other countries see the same should be irrelevant. Let's start
> by counting how many sites in the UK have this problem: more than 5/10/15?
>
> And this is setting aside my Marvin-like [1] attitude towards tickets for
> EGI services.
>
> cheers
> alessandra
>
> [1] http://en.wikipedia.org/wiki/Marvin_the_Paranoid_Android
>
> On 21/09/2011 16:25, John Gordon wrote:
>> So how should we articulate the feeling that something is a bit iffy? I think it is worth doing just to see if others feel the same. Either everyone else says it seems fine for us, in which case we shut up and stop whining, OR everyone else says 'yes we were just thinking the same', in which case WLCG (or EGI) could escalate the issue.
>>
>> John
>>
>> -----Original Message-----
>> From: Testbed Support for GridPP member institutes [mailto:[log in to unmask]] On Behalf Of Ewan MacMahon
>> Sent: 21 September 2011 14:10
>> To: [log in to unmask]
>> Subject: Re: CREAM problems (not just SGE)
>>
>>> -----Original Message-----
>>> From: Testbed Support for GridPP member institutes [mailto:TB-
>>> [log in to unmask]] On Behalf Of Matt Doidge
>>>
>>> My gut feeling is that
>>> the batch/blah/cream communication isn't quite up to scratch.
>>>
>> That's generally our impression too, but that's really all
>> it is. We certainly can submit tickets complaining that it's
>> a bit crap, feels dodgy and has a distinct tendency to go
>> squiffy on occasion, but I'm not sure that's useful.
>>
>> Tickets and bug reports and the like work well for the case
>> of a basically decent thing with some specific problems. It
>> just doesn't suit this sort of case where something appears
>> to just be of generally poor quality.
>>
>> I think Stuart Purdie probably has the most intimate awareness
>> of the varied ways in which CREAM/blah seem to be bad and why,
>> so I suspect if anyone's going to be able to write coherent
>> tickets it's him.
>>
>> Ewan
>
--
-----------------------------------------------------------
[log in to unmask]
HEP Group/Physics Dep
Imperial College
Tel: +44-(0)20-75947810
http://www.hep.ph.ic.ac.uk/~dbauer/