Hi Maarten,


Thanks for the prompt reply. I will apply the patch on the CE later and 
monitor the status for another few days. To save time while fixing the 
problem, I have already force-killed all the defunct processes.

I will keep you posted.

Thanks

BR,
J

Maarten Litmaath wrote:
> Hi Jason,
> 
>> On one of the CEs in our region, we found around 32k defunct 
>> globus-gma processes, which cause continuous job submission failures. Any 
>> idea how we can reduce the number of defunct processes? Or is this a 
>> known issue with a patch already issued?
>>
>> thanks
>>
>>
>> # ps axuww |grep globus-gma | wc -l
>> 31878
>>
>> # ps axuww |grep globus-gma | head -5
>> cms175     300  0.0  0.0     0    0 ?        Z    Feb04   0:00 [globus-gma] <defunct>
>> cms175     301  0.0  0.0     0    0 ?        Z    Feb09   0:00 [globus-gma] <defunct>
>> cms175     302  0.0  0.0     0    0 ?        Z    Feb01   0:00 [globus-gma] <defunct>
>> cms175     303  0.0  0.0     0    0 ?        Z    Feb05   0:00 [globus-gma] <defunct>
>> cms150     304  0.0  0.0     0    0 ?        Z    Feb09   0:00 [globus-gma] <defunct>
> 
> This was recently discussed on LCG-ROLLOUT.  I include the summary here.
> 
> Andrey Kiryanov wrote:
> 
>  > Hi Torsten,
>  >
>  > Torsten Harenberg wrote:
>  >
>  >>> yes, it is a known issue :-(
>  >>> the patch suggested in the following ticket
>  >>> https://gus.fzk.de/ws/ticket_info.php?ticket=42981 should fix the
>  >>> problem
>  >>
>  >>
>  >> thanks for the quick reply. I installed the patch.
>  >
>  >
>  > Despite this, it means that your CE suffers from a high load and job poll
>  > processes hang. There may be various reasons for this.
>  > Please check if you have "Killing hung process" lines in
>  > /opt/globus/var/log/globus-gma.log
>  > If they are there, try to add a 'tout 120' line in
>  > /opt/globus/etc/globus-gma.conf and see if it helps.
> 
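
For monitoring over the next few days, the two checks mentioned in this thread (the number of defunct globus-gma processes and the "Killing hung process" lines in the log) could be wrapped up roughly as below. This is only a sketch: the function names count_defunct and count_hung_kills are made up for illustration, and it assumes a ps that supports "-eo stat=,comm=" (as procps on Linux does). The log path is the one quoted above.

```shell
#!/bin/sh

# Count defunct (zombie) processes whose command name contains a string.
# Reads "STAT COMMAND" pairs on stdin, so it can be fed from ps.
# Unlike "ps | grep | wc -l", the grep process itself is not counted.
count_defunct() {
    # $1: command name to look for, e.g. globus-gma
    awk -v name="$1" '$1 ~ /^Z/ && index($2, name) { n++ } END { print n + 0 }'
}

# Count the "Killing hung process" lines in a log file.
count_hung_kills() {
    # $1: path to the log file
    grep -c "Killing hung process" "$1"
}
```

Possible usage on the CE, after sourcing the functions:

    ps -eo stat=,comm= | count_defunct globus-gma
    count_hung_kills /opt/globus/var/log/globus-gma.log

If the second number keeps growing, that would be the case where the 'tout 120' line suggested above may be worth trying.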

-- 
Jason Shih
ASGC/OPS
Tel: +886-2-2789-8311
Fax: +886-2-2783-7653