Thanks,
I should say that we use nordugrid-arc-5.1.2-1.el6.x86_64
(although I think it was occurring on an earlier version too,
4.somethingorother).
It would be a lot of work to build a compatible binary full of trace
statements, and then it would be days of waiting for the lock-up to
happen again. So, bad as it sounds, I really hope someone else has the
same problem (since misery loves company ...).
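
For reference, the stopgap mentioned in the quoted mail below ("detect
when the gm-heartbeat file is stale, then restart") could be sketched
roughly as follows. This is only an illustration in Python 3, not
something we run: the heartbeat path, the 10-minute threshold and the
restart command are all assumptions and would need checking against the
local arc.conf and init scripts.

```python
import os
import subprocess
import sys
import time

# All three values are assumptions for illustration; adjust per site.
HEARTBEAT = "/var/spool/arc/jobstatus/gm-heartbeat"  # assumed <controldir>/gm-heartbeat
MAX_AGE_SECONDS = 600                                 # assumed staleness threshold
RESTART_CMD = ["service", "a-rex", "restart"]         # assumed init script name


def heartbeat_age(path, now=None):
    """Return seconds since the file was last modified, or None if it cannot be read."""
    if now is None:
        now = time.time()
    try:
        return now - os.stat(path).st_mtime
    except OSError:
        # Missing or unreadable file: report "no age available".
        return None


def is_stale(path, max_age, now=None):
    """True if the heartbeat is older than max_age seconds or cannot be read."""
    age = heartbeat_age(path, now)
    return age is None or age > max_age


def check_and_restart():
    """Restart the service if the heartbeat looks stale; return an exit code."""
    if is_stale(HEARTBEAT, MAX_AGE_SECONDS):
        print("gm-heartbeat stale or missing; restarting", file=sys.stderr)
        return subprocess.call(RESTART_CMD)
    return 0
```

Something like this would be run from cron every few minutes, ending
the script with sys.exit(check_and_restart()). Treating an unreadable
heartbeat file as stale is a design choice; a site that prefers not to
restart in that case would change is_stale accordingly.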
Cheers,
Ste
On 03/07/2017 10:14 AM, Andrew Lahiff wrote:
> Hi Steve,
>
> We haven't ever experienced that problem, however we're still using the 5.0.5 RPMs.
>
> Regards,
> Andrew.
>
> ________________________________________
> From: Testbed Support for GridPP member institutes [[log in to unmask]] on behalf of Stephen Jones [[log in to unmask]]
> Sent: Tuesday, March 07, 2017 10:10 AM
> To: [log in to unmask]
> Subject: Re: ARC Brainstorming Camp Summary
>
> All,
>
> I've just realised I gave that feedback myself! So I guess the question
> is: does anyone else notice this problem? Is it just us?
>
> Ste
>
> On 03/07/2017 10:08 AM, Stephen Jones wrote:
>> Hi Andrews (Washbrook, Lahiff, ...), all
>>
>> Re: ARC Brainstorming Camp Summary
>>
>> In that talk by Andrew W, on ARC, I found this statement:
>>
>>> Sometimes a-rex or grid-manager locks up. We have to detect when the
>>> gm-heartbeat file is stale, then restart by hand.
>>
>> This is also the case at Liverpool.
>>
>> Until recently, we had ~1000 slots, and it happened (say) every month
>> or two.
>>
>> Lately, I added some nodes that put it up to 1330 slots.
>>
>> Now it happens every couple of days.
>>
>> So I'll have to also "detect when the gm-heartbeat file is stale, then
>> restart".
>>
>> It's becoming a pest. What do people know about this problem?
>>
>> Cheers,
>>
>> Ste
>>
>> That talk:
>> https://indico.cern.ch/event/594508/attachments/1387782/2112742/ajw-ARC-131216.pdf
>>
>>
>
> --
> Steve Jones [log in to unmask]
> Grid System Administrator office: 220
> High Energy Physics Division tel (int): 43396
> Oliver Lodge Laboratory tel (ext): +44 (0)151 794 3396
> University of Liverpool http://www.liv.ac.uk/physics/hep/
--
Steve Jones [log in to unmask]
Grid System Administrator office: 220
High Energy Physics Division tel (int): 43396
Oliver Lodge Laboratory tel (ext): +44 (0)151 794 3396
University of Liverpool http://www.liv.ac.uk/physics/hep/