Hi Winnie,
Someone who understands this better than me might give a better answer, but as I understand it, some part of the Nagios probe is trying to talk to the EGI message queue hosted at mq.afroditi.hellasgrid.gr via the STOMP protocol, and that connection is failing. Something logging under the name "stomp.py" (presumably the stomp.py library) then tries to report the error, but no handler has been configured for that logger, so the Python logging module doesn’t know where to write the message. The result is a generic "I don’t know how to deal with this" complaint rather than a nice "can’t talk to server" message.
At least I think so; it’s been a long time since I’ve dealt with the Python logging package, so I could be wrong :)
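For what it's worth, here is a rough sketch of what I mean. It is not the probe's actual code: the logger name is taken from the warning in gridjob.out, while the error text and the in-memory buffer are made up for illustration. The wording of the warning suggests Python 2, where a logger without a handler prints "No handlers could be found for logger ..." once and drops the message; the sketch below is written for Python 3, where the same situation falls back to a bare print on stderr instead.

```python
import io
import logging

# Same name as in the warning; a library would normally create this logger itself.
log = logging.getLogger("stomp.py")

# Nothing has attached a handler to this logger yet, which is the
# situation the warning in gridjob.out complains about.
before = list(log.handlers)

# Attach a handler so the message has somewhere to go.
# An in-memory buffer is used here purely for illustration; a real
# script would more likely log to a file or to stderr.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
log.addHandler(handler)

# Hypothetical error text, standing in for whatever the library reports.
log.error("could not connect to broker")

captured = buffer.getvalue()
print(captured, end="")
```

In a real script the usual way to get the same effect is a single logging.basicConfig() call near the start, which attaches a handler to the root logger so that messages from library loggers like this one are written out rather than lost.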
Thanks,
Gareth
On 14 Jul 2014, at 10:49, Winnie Lacesso wrote:
> Dr Kashif wrote:
>> If any site has an ops jobs running for long time, can you please run
>> netstat on the WN where job is running to see whether it is hanging on
>> some connection? One of the Greek sites hosting message broker had a
>> power cut on Saturday
>
> Well I didn't run netstat, but found the process tree end at python
> hung on contacting host
>
> connect(3, {sa_family=AF_INET, sin_port=htons(6163), sin_addr=inet_addr("195.251.55.91")}, 16
>
> which is mq.afroditi.hellasgrid.gr
>
> Could you be persuaded to answer what does
> No handlers could be found for logger "stomp.py"
> (in gridjob.out) mean?
>
> Winnie Lacesso / Bristol University Particle Physics Computing Systems
> HH Wills Physics Laboratory, Tyndall Avenue, Bristol, BS8 1TL, UK