Hi Dimitri,
the error message makes me think they have at least some SL6 WNs,
because, looking at the history or notifications of the service

https://ngi-de-nagios.gridka.de/nagios/cgi-bin/status.cgi?host=vgn003.hep.physik.uni-siegen.de

the problem "WARNING: job submission OK - 
problem on WN [Done (Exit Code !=0)]" occurs from time to time with the probe 
"org.sam.CREAMCE-JobSubmit 
<https://ngi-de-nagios.gridka.de/nagios/cgi-bin/extinfo.cgi?type=2&host=vgn003.hep.physik.uni-siegen.de&service=org.sam.CREAMCE-JobSubmit-%2Fops%2FNGI%2FGermany>".

cheers,
Alessandro
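
PS: to see which WNs are affected, one option is to scan saved copies of
the job output for the WN name and the segfault line. This is only a
minimal sketch: it assumes the output has been saved as plain text, and
the function name and the sample string are illustrative, not part of
the SAM/Nagios tooling.

```python
import re

# Patterns taken from the job output quoted in this thread.
WN_RE = re.compile(r"^=== WN: (\S+)", re.MULTILINE)
SEGV_RE = re.compile(r"Segmentation fault")
NO_STATUS_RE = re.compile(r"No Nagios status\.dat file")


def summarize_job_output(text):
    """Return (wn_hostname_or_None, segfault_seen, status_dat_missing)."""
    match = WN_RE.search(text)
    wn = match.group(1) if match else None
    return wn, bool(SEGV_RE.search(text)), bool(NO_STATUS_RE.search(text))


# Hypothetical sample built from three lines of the quoted output.
sample = """\
=== WN: vgn020.hep.physik.uni-siegen.de
./nagrun.sh: line 423: 2234 Segmentation fault $NAGROOT/bin/nagios
WARNING: No Nagios status.dat file after 10 sec.
"""

print(summarize_job_output(sample))
```

Running this over the outputs of several failed and successful tests
would show whether the failures are tied to particular WNs.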

On 17/10/2012 16:28, Dimitri Nilsen wrote:
> No, SL5.
>
> Those are not our WNs; they belong to uni-siegen.
>
> On 10/17/2012 03:21 PM, Alessandro Paolini wrote:
>> Hi Dimitri,
>> your WNs are on SL6, aren't they?
>> In the past weeks some problems were noticed between Nagios and SL6 WNs at
>> some sites; see for example this ticket:
>> https://ggus.eu/ws/ticket_info.php?ticket=86451
>>
>> cheers,
>> Alessandro
>>
>> On 12/10/2012 16:21, Dimitri Nilsen wrote:
>>> Hi,
>>>
>>> we have a problem with a CREAM CE (EMI-2): the test jobs submitted by
>>> Nagios fail with an error (Exit Code !=0).
>>> Looking into the logs we see:
>>> Message Broker destination:
>>> /queue/grid.probe.metricOutput.EGEE.ngi-de-nagios_gridka_de
>>> 2012-10-12 16:04:52,643 ConnStomp INFO slept in message dispatch loop.
>>> ./nagrun.sh: line 423: 2234 Segmentation fault $NAGROOT/bin/nagios
>>> $NAGCONF>  $FNAGPID 2>&1
>>>
>>> Any ideas what could be the problem? The submission of simple
>>> "/bin/hostname" jobs works fine.
>>>
>>> Regards
>>> Dimitri
>>>
>>>
>>> ************************
>>>
>>> Complete job output here:
>>>
>>> ----------------
>>>
>>>
>>> WARNING: job submission OK - problem on WN [Done (Exit Code !=0)]
>>> WARNING: job submission OK - problem on WN [Done (Exit Code !=0)]
>>>
>>> Testing from: ngi-de-nagios.gridka.de
>>> DN: /C=DE/O=GermanGrid/OU=KIT/CN=Robot: grid client - Dimitri
>>> Nilsen/CN=proxy
>>> VOMS FQANs: /ops/NGI/Germany/Role=NULL/Capability=NULL,
>>> /ops/NGI/Role=NULL/Capability=NULL, /ops/Role=NULL/Capability=NULL,
>>> /ops/NGI/Switzerland/Role=NULL/Capability=NULL
>>> glite-wms-job-status
>>> https://ngi-de-lbmon-1.scc.kit.edu:9000/QDRXbxwleZroxrGOgu4Ptg
>>>
>>>
>>> ======================= glite-wms-job-status Success
>>> =====================
>>> BOOKKEEPING INFORMATION:
>>>
>>> Status info for the Job :
>>> https://ngi-de-lbmon-1.scc.kit.edu:9000/QDRXbxwleZroxrGOgu4Ptg
>>> Current Status: Done (Exit Code !=0)
>>> Exit code: 1
>>> Status Reason: Job Terminated Successfully
>>> Destination: vgn003.hep.physik.uni-siegen.de:8443/cream-pbs-ops
>>> Submitted: Fri Oct 12 16:04:37 2012 CEST
>>> ========================================================================== 
>>>
>>>
>>> Getting job output: OK.
>>>
>>>
>>> Job output:
>>>
>>> Launched with parameters: -v ops -f /ops/NGI/Germany -d
>>> /queue/grid.probe.metricOutput.EGEE.ngi-de-nagios_gridka_de -n PROD -t
>>> 600 -w 1 -l prod-lfc-shared-central.cern.ch -s
>>> dcache.fz-juelich.de,ophelia.zih.tu-dresden.de,se.bfg.uni-freiburg.de
>>> === [Fri Oct 12 16:04:48 CEST 2012] ===
>>> === Running on ===
>>> === Site: UNI-SIEGEN-HEP
>>> === CE: vgn003.hep.physik.uni-siegen.de:8443/cream-pbs-ops
>>> === WN: vgn020.hep.physik.uni-siegen.de
>>> === WN arch: x86_64.
>>> === [Fri Oct 12 16:04:48 CEST 2012] ===
>>> === Running on ===
>>> === Site: UNI-SIEGEN-HEP
>>> === CE: vgn003.hep.physik.uni-siegen.de:8443/cream-pbs-ops
>>> === WN: vgn020.hep.physik.uni-siegen.de
>>> === WN arch: x86_64
>>> Check Python version:
>>> /usr/bin/python
>>> Python 2.6.6
>>> Can we import Python LDAP ...
>>> YES.
>>> Launching MTA.
>>> /home/ops03/home_cream_296803620/CREAM296803620/nagios/bin/mta-simple
>>> --dirq /tmp/sam.2198.687/msg-outgoing --destination
>>> /queue/grid.probe.metricOutput.EGEE.ngi-de-nagios_gridka_de
>>> --broker-network PROD --pidfiledir
>>> /home/ops03/home_cream_296803620/CREAM296803620/nagios/var/ -v info
>>> --bdii-uri bdii-fzk.gridka.de:2170
>>> Setting Nagios configuration.
>>>
>>> Nagios Core 3.3.1
>>> Copyright (c) 2009-2011 Nagios Core Development Team and Community
>>> Contributors
>>> Copyright (c) 1999-2009 Ethan Galstad
>>> Last Modified: 07-25-2011
>>> License: GPL
>>>
>>> Website: http://www.nagios.org
>>> Reading configuration data...
>>> Read main config file okay...
>>> Processing object config directory
>>> '/home/ops03/home_cream_296803620/CREAM296803620/nagios/etc/common.d'... 
>>>
>>> Processing object config file
>>> '/home/ops03/home_cream_296803620/CREAM296803620/nagios/etc/common.d/commands.cfg'... 
>>>
>>> Processing object config file
>>> '/home/ops03/home_cream_296803620/CREAM296803620/nagios/etc/common.d/host.cfg'... 
>>>
>>> Processing object config file
>>> '/home/ops03/home_cream_296803620/CREAM296803620/nagios/etc/common.d/timeperiods.cfg'... 
>>>
>>> Processing object config file
>>> '/home/ops03/home_cream_296803620/CREAM296803620/nagios/etc/common.d/contacts.cfg'... 
>>>
>>> Processing object config directory
>>> '/home/ops03/home_cream_296803620/CREAM296803620/nagios/etc/wn.d'...
>>> Processing object config directory
>>> '/home/ops03/home_cream_296803620/CREAM296803620/nagios/etc/wn.d/cadist'... 
>>>
>>> Processing object config file
>>> '/home/ops03/home_cream_296803620/CREAM296803620/nagios/etc/wn.d/cadist/commands.cfg'... 
>>>
>>> Processing object config file
>>> '/home/ops03/home_cream_296803620/CREAM296803620/nagios/etc/wn.d/cadist/services.cfg'... 
>>>
>>> Processing object config directory
>>> '/home/ops03/home_cream_296803620/CREAM296803620/nagios/etc/wn.d/org.sam'... 
>>>
>>> Processing object config file
>>> '/home/ops03/home_cream_296803620/CREAM296803620/nagios/etc/wn.d/org.sam/commands.cfg'... 
>>>
>>> Processing object config file
>>> '/home/ops03/home_cream_296803620/CREAM296803620/nagios/etc/wn.d/org.sam/services.cfg'... 
>>>
>>> Read object config files okay...
>>>
>>> Running pre-flight check on configuration data...
>>>
>>> Checking services...
>>> Checked 11 services.
>>> Checking hosts...
>>> Checked 1 hosts.
>>> Checking host groups...
>>> Checked 0 host groups.
>>> Checking service groups...
>>> Checked 0 service groups.
>>> Checking contacts...
>>> Checked 1 contacts.
>>> Checking contact groups...
>>> Checked 1 contact groups.
>>> Checking service escalations...
>>> Checked 0 service escalations.
>>> Checking service dependencies...
>>> Checked 0 service dependencies.
>>> Checking host escalations...
>>> Checked 0 host escalations.
>>> Checking host dependencies...
>>> Checked 0 host dependencies.
>>> Checking commands...
>>> Checked 9 commands.
>>> Checking time periods...
>>> Checked 1 time periods.
>>> Checking for circular paths between hosts...
>>> Checking for circular host and service dependencies...
>>> Checking global event handlers...
>>> Checking obsessive compulsive processor commands...
>>> Checking misc settings...
>>>
>>> Total Warnings: 0
>>> Total Errors: 0
>>>
>>> Things look okay - No serious problems were detected during the
>>> pre-flight check
>>> Launching Nagios: Fri Oct 12 16:04:51 CEST 2012
>>> Launch Nagios as background process.
>>> Message Broker URI: stomp://msg.cro-ngi.hr:6163/
>>> Message Broker destination:
>>> /queue/grid.probe.metricOutput.EGEE.ngi-de-nagios_gridka_de
>>> 2012-10-12 16:04:52,643 ConnStomp INFO slept in message dispatch loop.
>>> ./nagrun.sh: line 423: 2234 Segmentation fault $NAGROOT/bin/nagios
>>> $NAGCONF>  $FNAGPID 2>&1
>>>
>>> Nagios Core 3.3.1
>>> Copyright (c) 2009-2011 Nagios Core Development Team and Community
>>> Contributors
>>> Copyright (c) 1999-2009 Ethan Galstad
>>> Last Modified: 07-25-2011
>>> License: GPL
>>>
>>> Website: http://www.nagios.org
>>> Nagios 3.3.1 starting... (PID=2234)
>>> Local time is Fri Oct 12 16:04:51 CEST 2012
>>> 2012-10-12 16:04:53,645 ConnStomp INFO slept in message dispatch loop.
>>> 2012-10-12 16:04:54,647 ConnStomp INFO slept in message dispatch loop.
>>> 2012-10-12 16:04:55,649 ConnStomp INFO slept in message dispatch loop.
>>> 2012-10-12 16:04:56,650 ConnStomp INFO slept in message dispatch loop.
>>> 2012-10-12 16:04:57,664 ConnStomp INFO slept in message dispatch loop.
>>> 2012-10-12 16:04:58,666 ConnStomp INFO slept in message dispatch loop.
>>> 2012-10-12 16:04:59,667 ConnStomp INFO slept in message dispatch loop.
>>> 2012-10-12 16:05:00,669 ConnStomp INFO slept in message dispatch loop.
>>> 2012-10-12 16:05:01,670 ConnStomp INFO slept in message dispatch loop.
>>> 2012-10-12 16:05:02,672 ConnStomp INFO slept in message dispatch loop.
>>> WARNING: No Nagios status.dat file after 10 sec.
>>> WARNING:
>>> /home/ops03/home_cream_296803620/CREAM296803620/nagios/var/status.dat
>>> WARNING: Cannot proceed. Check WN. Bailing out.
>>> 2012-10-12 16:05:02,692 ConnStomp INFO Statistics:
>>> Connections: 0
>>> Messages sent: 0
>>> Receipts received: 0
>>> Messages received: 0
>>> Errors: 0
>>
>>
>
>


-- 
Dr. Alessandro Paolini
INFN - CNAF
Viale Berti Pichat 6/2
40127 Bologna
Italy
tel: +39 051 6092723
fax: +39 051 6092916
ICQ: 192172027
skype: alex.paolini
**********************
"I believe in the power of laughter and tears"
    "as an antidote to hatred and terror"
         "a day without a smile"
              "is a day wasted" >>> Charlie Chaplin