Hello,
We have two instances of a top-level BDII at different sites, and we use DNS round robin to
balance the load and to provide a backup during an outage of one site.
Since last week we have been experiencing problems with the machine being overloaded.
I have not been able to find the cause. Can anybody please help?
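For reference, the set of addresses published under a round-robin alias can be listed with a short script. This is only a minimal sketch: the function name `round_robin_addresses` and the `resolver` parameter are made up for illustration, and port 2170 is assumed to be the port the BDII listens on.

```python
import socket


def round_robin_addresses(hostname, port=2170, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated addresses a DNS name resolves to.

    `resolver` is injectable only so the function can be exercised
    without network access; by default the system resolver is used.
    """
    # family=0 asks for both IPv4 and IPv6; SOCK_STREAM avoids getting
    # one duplicate entry per socket type.
    results = resolver(hostname, port, 0, socket.SOCK_STREAM)
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({sockaddr[0] for *_, sockaddr in results})
```

Calling the function with the alias name (for example a placeholder such as `bdii.example.org`) should return one address per instance that is currently published in DNS.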
What I have checked so far:
- there was no package update (according to /var/log/yum.log)
- the network load is not significantly higher than it used to be (according to the munin plugin
for eth0 and according to netflow data)
- the most "greedy" process is slapd2.4; top output with threads shown:
    PID USER PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
  10601 ldap 16  0 6185m 3.4g 977m S 27.0 43.5 497:27.90 slapd2.4
    853 ldap 15  0 6185m 3.4g 977m R 24.3 43.5 492:51.77 slapd2.4
  16252 ldap 16  0 6185m 3.4g 977m R 22.6 43.5 486:52.14 slapd2.4
   7930 ldap 16  0 6185m 3.4g 977m R 20.3 43.5 467:33.52 slapd2.4
   6928 ldap 16  0 6185m 3.4g 977m S 19.6 43.5 496:14.12 slapd2.4
  10604 ldap 16  0 6185m 3.4g 977m R 19.3 43.5 468:15.89 slapd2.4
   4827 ldap 16  0 6185m 3.4g 977m S 18.3 43.5 472:26.45 slapd2.4
  16253 ldap 15  0 6185m 3.4g 977m S 15.0 43.5 476:32.90 slapd2.4
- /etc/init.d/bdii restart helps for a few hours
- memory consumption is not excessive [1]
- we use a 3G tmpfs for /var/run/bdii/db, and it is not full:
  Filesystem  Size  Used Avail Use% Mounted on
  tmpfs       3.0G  1.8G  1.2G  61% /var/run/bdii/db
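
To put a single number on the slapd2.4 load, the per-thread %CPU values from the top output above can be added up. The following is only a rough sketch in Python, assuming the 12-column layout shown above; the names `sum_cpu_by_command` and `SAMPLE` are made up for illustration, and `SAMPLE` simply repeats the rows already listed.

```python
from collections import defaultdict


def sum_cpu_by_command(top_text):
    """Sum the %CPU column per COMMAND from thread-mode top output."""
    totals = defaultdict(float)
    for line in top_text.splitlines():
        fields = line.split()
        # A thread row has 12 columns and starts with a numeric PID;
        # this skips the header row and blank lines.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        cpu = float(fields[8])    # %CPU is the 9th column
        command = fields[11]      # COMMAND is the 12th column
        totals[command] += cpu
    return dict(totals)


# The rows below are copied from the top output quoted in this message.
SAMPLE = """\
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10601 ldap 16 0 6185m 3.4g 977m S 27.0 43.5 497:27.90 slapd2.4
853 ldap 15 0 6185m 3.4g 977m R 24.3 43.5 492:51.77 slapd2.4
16252 ldap 16 0 6185m 3.4g 977m R 22.6 43.5 486:52.14 slapd2.4
7930 ldap 16 0 6185m 3.4g 977m R 20.3 43.5 467:33.52 slapd2.4
6928 ldap 16 0 6185m 3.4g 977m S 19.6 43.5 496:14.12 slapd2.4
10604 ldap 16 0 6185m 3.4g 977m R 19.3 43.5 468:15.89 slapd2.4
4827 ldap 16 0 6185m 3.4g 977m S 18.3 43.5 472:26.45 slapd2.4
16253 ldap 15 0 6185m 3.4g 977m S 15.0 43.5 476:32.90 slapd2.4
"""

if __name__ == "__main__":
    for command, cpu in sum_cpu_by_command(SAMPLE).items():
        print(f"{command}: {cpu:.1f}% CPU")
```

For the eight threads listed, the total comes to roughly 166% CPU, i.e. more than one and a half cores busy in slapd2.4 alone.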
[1] http://monitor.farm.particle.cz/munin/farm.particle.cz/bdii1.farm.particle.cz.html#System
Thank you for any hint,
--
Tomas Kouba
Institute of Physics, Academy of Sciences of the Czech Republic