Hi,
The error referenced in this ticket has two problems:
1) It is a different error message from the "Submissions are disabled"
seen at Wuppertal.
2) I expect it would cause all submissions to fail every time,
whereas the problem at Wuppertal is intermittent (albeit with more
submissions failing than working).
Torsten demonstrated that the load monitor is not the cause, unless
its result flip-flops:
[root@cream-ce security]# /usr/bin/glite_cream_load_monitor /etc/glite-ce-cream-utils/glite_cream_load_monitor.conf --test && echo "job ok"
job ok
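One way to test the flip-flop theory would be to run the same check
repeatedly and log the exit code each time. A minimal sketch, assuming
the exit code of --test is what matters (as Massimo suggested);
poll_check is a hypothetical helper name, and the count and interval
are arbitrary:

```shell
# Hypothetical helper: run a command COUNT times, INTERVAL seconds apart,
# and print a timestamped exit code for every run plus a failure summary.
poll_check() {
    count=$1
    interval=$2
    shift 2
    i=0
    failures=0
    while [ "$i" -lt "$count" ]; do
        # Run the check itself; only its exit code is of interest here.
        "$@" >/dev/null 2>&1
        rc=$?
        echo "$(date '+%Y-%m-%d %H:%M:%S') exit=$rc"
        if [ "$rc" -ne 0 ]; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
        # Do not sleep after the final run.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
    echo "failures=$failures/$count"
}
```

For example, this would sample every 10 seconds for about an hour:

    poll_check 360 10 /usr/bin/glite_cream_load_monitor /etc/glite-ce-cream-utils/glite_cream_load_monitor.conf --test

Any nonzero exit in that log would point back at the load monitor.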
Is there anything else that could cause the "disabled" message?
Cheers,
Rod.
On 13 December 2012 14:06, Eric Frizziero <[log in to unmask]> wrote:
> Hi Torsten,
>
>
> On 12/13/2012 01:59 PM, Torsten Harenberg wrote:
>>
>> Hi Massimo,
>>
>> On 13.12.2012 at 13:40, Massimo Sgaravatto
>> <[log in to unmask]> wrote:
>>
>>> /usr/bin/glite_cream_load_monitor
>>> /etc/glite-ce-cream-utils/glite_cream_load_monitor.conf --test
>>>
>>> and check the exit code
>>
>> [root@cream-ce security]# /usr/bin/glite_cream_load_monitor /etc/glite-ce-cream-utils/glite_cream_load_monitor.conf --test && echo "job ok"
>> job ok
>>
>> You parse df -P which looks like
>>
>> [root@cream-ce security]# df -P
>> Dateisystem                                     1024‐Blöcke  Benutzt Verfügbar Kapazit. Eingehängt auf
>> /dev/mapper/vg_creamce-lv_root                     51606140 10383496  38601204      22% /
>> tmpfs                                               5044420        0   5044420       0% /dev/shm
>> /dev/xvda1                                           495844    44835    425409      10% /boot
>> /dev/mapper/vg_creamce-lv_home                     26423572   181640  24899676       1% /home
>> sge-master.pleiades.uni-wuppertal.de:/sge-root     76147744  6802912  65414336      10% /sge-root
>>
>> if you come in through an ssh session from my (German) laptop. The system
>> itself is set to
>>
>> [root@cream-ce security]# cat /etc/sysconfig/i18n
>> LANG="en_US.UTF-8"
>> SYSFONT="latarcyrheb-sun16"
>>
>> So:
>>
>> [root@cream-ce ~]# unset LC_MONETARY LC_NUMERIC LC_MESSAGES LC_COLLATE LANG LC_CTYPE LC_TIME
>>
>> [root@cream-ce ~]# /usr/bin/glite_cream_load_monitor /etc/glite-ce-cream-utils/glite_cream_load_monitor.conf --show
>> Threshold for Load Average(1 min): 40 => Detected value for Load Average(1 min): 2.25
>> Threshold for Load Average(5 min): 40 => Detected value for Load Average(5 min): 2.38
>> Threshold for Load Average(15 min): 20 => Detected value for Load Average(15 min): 3.21
>> Threshold for Memory Usage: 95 => Detected value for Memory Usage: 76.51%
>> Threshold for Swap Usage: 95 => Detected value for Swap Usage: 20.37%
>> Threshold for Free FD: 500 => Detected value for Free FD: 989245
>> Threshold for tomcat FD: 800 => Detected value for Tomcat FD: 336
>> Threshold for FTP Connection: 30 => Detected value for FTP Connection: 1
>> Threshold for Number of active jobs: -1 => Detected value for Number of active jobs: 1020
>> Threshold for Number of pending commands: -1 => Detected value for Number of pending commands: 1
>> Threshold for Disk Usage: 95% => Detected value for Partition / : 22%
>> [root@cream-ce ~]#
>>
>> This looks OK, and I hope it is also what the process sees (without the
>> locale settings passed in through ssh).
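To confirm what the running service actually sees, its environment can
be read from /proc on Linux. A rough sketch: locale_vars_from_environ
is a hypothetical helper name, and finding the right PID (for instance
with pgrep) depends on how tomcat is started on the CE, which is an
assumption here:

```shell
# Hypothetical helper: print the locale-related variables (LANG, LANGUAGE,
# LC_*) found in a NUL-separated environ file such as /proc/<pid>/environ.
locale_vars_from_environ() {
    # environ entries are NUL-terminated; turn them into one per line.
    # "|| true" keeps the exit status at 0 when no locale variable is set.
    tr '\0' '\n' < "$1" | grep -E '^(LANG|LANGUAGE|LC_[A-Z_]+)=' || true
}
```

Run as root against /proc/<pid>/environ of the tomcat process. An empty
result would mean no locale variables are set, so child commands such as
df would fall back to the C/POSIX locale.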
>>
>> BTW: I don't think it could work at all if the system locale is
>> anything other than English, as
>>
>>
>> push (@list,`df -P / |grep -v Filesystem|awk -F" " '{ print \$6 }'`);
>> push (@list,`df -P /tmp |grep -v Filesystem|awk -F" " '{ print \$6 }'`);
>> push (@list,`df -P /var/lib/mysql |grep -v Filesystem|awk -F" " '{ print \$6 }'`);
>> push (@list,`df -P /opt |grep -v Filesystem|awk -F" " '{ print \$6 }'`);
>>
>> searches for "Filesystem" explicitly.
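A locale-independent variant would not match on the header text at all.
With a German locale the header starts with "Dateisystem", so
grep -v Filesystem leaves it in and awk prints "Eingehängt" as if it
were a mount point. A minimal shell sketch of the alternative, assuming
the only goal of those lines is the mount point in column 6; both
function names are hypothetical:

```shell
# Hypothetical helper: read `df -P` output on stdin, skip the header by
# position (first line) rather than by its text, and print column 6.
parse_mount_points() {
    awk 'NR > 1 { print $6 }'
}

# Hypothetical helper: force the C locale for df so that the output does
# not depend on the caller's LANG/LC_* settings.
mount_point_of() {
    LC_ALL=C df -P "$1" | parse_mount_points
}
```

For example, mount_point_of /var/lib/mysql prints the mount point of the
filesystem holding that directory. Like the original awk call, this
truncates mount points that contain spaces.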
>>
>>
>> Earlier I disabled these checks in cream-config.xml completely and the
>> error changed into:
>>
>> 012 (3143048.018.000) 12/13 09:17:29 Job was held.
>> CREAM error: CREAM_Job_Register Error: Received NULL fault; the error is due to another cause: FaultString=[CREAM service not available: configuration failed!] - FaultCode=[SOAP-ENV:Server] - FaultSubCode=[SOAP-ENV:Server]
>> Code 0 Subcode 0
>>
>> After several restarts of tomcat, it worked again, but this condition
>> comes back quite regularly.
>>
>> Maybe this tells you something?
>
> Please see: https://savannah.cern.ch/bugs/?98144
>
> Cheers,
> E r i c.
>
>
>
>> Thanks
>>
>> Torsten
>>
>> --
>> <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
>> <> <>
>> <> Dr. Torsten Harenberg [log in to unmask] <>
>> <> Bergische Universitaet <>
>> <> FB C - Physik Tel.: +49 (0)202 439-3521 <>
>> <> Gaussstr. 20 Fax : +49 (0)202 439-2811 <>
>> <> 42097 Wuppertal <>
>> <> <>
>> <><><><><><><>< Of course it runs NetBSD http://www.netbsd.org ><>
>>
>>
>
--
Tel. +49 89 289 14152