On 02/29/2012 06:45 AM, Eygene Ryabinkin wrote:
> Tomas, good day.
>
> Tue, Feb 28, 2012 at 10:49:41AM +0100, Tomas Kouba wrote:
>> Since last week we have been experiencing problems with the machine
>> being overloaded. I have not been able to find the cause; can
>> anybody please help?
>>
>> What I have checked so far:
>> - there was no package update (according to /var/log/yum.log)
>
> Which top-level BDII version are you running? gLite or UMD-1?
Both instances run glite-BDII_top-3.2.12-1.sl5 from
http://glitesoft.cern.ch/EGEE/gLite/R3.2/glite-BDII_top/
>
>> - the network load is not significantly higher than it used to be
>> (according to munin plugin for eth0 and according to netflow data)
>> - the most "greedy" process is slapd2.4, top output with threads on:
>>   PID USER PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
>> 10601 ldap 16  0 6185m 3.4g 977m S 27.0 43.5 497:27.90 slapd2.4
>>   853 ldap 15  0 6185m 3.4g 977m R 24.3 43.5 492:51.77 slapd2.4
>> 16252 ldap 16  0 6185m 3.4g 977m R 22.6 43.5 486:52.14 slapd2.4
>>  7930 ldap 16  0 6185m 3.4g 977m R 20.3 43.5 467:33.52 slapd2.4
>>  6928 ldap 16  0 6185m 3.4g 977m S 19.6 43.5 496:14.12 slapd2.4
>> 10604 ldap 16  0 6185m 3.4g 977m R 19.3 43.5 468:15.89 slapd2.4
>>  4827 ldap 16  0 6185m 3.4g 977m S 18.3 43.5 472:26.45 slapd2.4
>> 16253 ldap 15  0 6185m 3.4g 977m S 15.0 43.5 476:32.90 slapd2.4
>
> Are there any ldapadd/ldapmodify processes?
No, but there are a lot of ldapsearch processes from time to time; they
usually finish in a few seconds.
>
>> - /etc/init.d/bdii restart helps for a few hours
>
> The first thing that I'd check is if it's a problem that comes from
> the bdii-update (that modifies LDAP tree) or the problem that appears
> when LDAP serves its clients (although you see no significant increase
> in the traffic, there can be problems with the guts of OpenLDAP or
> a component near it). For this, I'd first try to raise BDII_LOG_LEVEL
> to DEBUG and check bdii-update.log.
>
> It will also be interesting to see how reply latency behaves after
> restart: you can try to run 'ldapsearch -s one -b
> mds-vo-name=local,o=grid' every minute and check the latency over
> time: this might give some hints.
Thank you for the hints, I will try it.
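
A minimal sketch of how the once-a-minute latency check could be
scripted, assuming Python 3 is available on the host. The names
time_command, format_sample and monitor are made up for illustration,
and the "-x" and "-H ldap://localhost:2170" arguments are assumptions
about the local setup (the thread gives only "-s one -b
mds-vo-name=local,o=grid"):

```python
import subprocess
import time

# The search itself is the one suggested in the thread. "-x" (simple bind)
# and the "-H" URL with host and port are ASSUMPTIONS, adjust as needed.
LDAPSEARCH_CMD = [
    "ldapsearch", "-x", "-H", "ldap://localhost:2170",
    "-s", "one", "-b", "mds-vo-name=local,o=grid",
]


def time_command(cmd, timeout=55):
    """Run cmd, discard its output, and return (seconds, exit_code).

    exit_code is None if the command did not finish within timeout.
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL, timeout=timeout)
        code = proc.returncode
    except subprocess.TimeoutExpired:
        code = None
    return time.monotonic() - start, code


def format_sample(timestamp, seconds, code):
    """Format one measurement as a single log line."""
    status = "timeout" if code is None else f"exit={code}"
    return f"{timestamp} latency={seconds:.3f}s {status}"


def monitor(cmd, interval=60, samples=None):
    """Time cmd every `interval` seconds; run forever if samples is None."""
    count = 0
    while samples is None or count < samples:
        seconds, code = time_command(cmd)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(format_sample(stamp, seconds, code), flush=True)
        count += 1
        if samples is None or count < samples:
            # Keep roughly one sample per interval regardless of latency.
            time.sleep(max(0.0, interval - seconds))
```

Calling monitor(LDAPSEARCH_CMD) right after a bdii restart and
redirecting the output to a file would give one timestamped latency
line per minute, which can then be compared with the time at which the
load starts to climb.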
Best regards,
--
Tomas Kouba
Institute of Physics, Academy of Sciences of the Czech Republic