Dear Catalin,
I think I haven't understood the problem. Are you saying that the bdii-update process is hanging?
Could we follow this up in a GGUS ticket? Please specify, as usual, the OS, the bdii package version you are running, the contents of /etc/bdii/bdii.conf, and the first 6 lines of the output of the top command.
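For reference, the details requested above could be gathered roughly as follows. This is only a sketch: the function name collect_bdii_info and the report layout are made up for illustration, and it assumes an RPM-based host with the procps version of top.

```shell
# Sketch: print the details requested above as one report for the ticket.
# collect_bdii_info is a hypothetical helper name, not part of the bdii package.
collect_bdii_info() {
    conf="${1:-/etc/bdii/bdii.conf}"   # default path taken from this thread

    echo "== OS =="
    # /etc/redhat-release exists on RHEL-like systems; fall back to uname.
    cat /etc/redhat-release 2>/dev/null || uname -sr

    echo "== bdii package =="
    rpm -q bdii 2>/dev/null || echo "rpm query failed or bdii not installed"

    echo "== $conf =="
    cat "$conf" 2>/dev/null || echo "cannot read $conf"

    echo "== top (first 6 lines) =="
    # -b is batch mode, -n 1 takes a single snapshot.
    top -b -n 1 2>/dev/null | head -n 6
}
```

Running collect_bdii_info > bdii-info.txt would then leave a single file to attach to the ticket. Note that bdii.conf may contain site-specific settings, so it is worth reading the file before posting it.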
Thanks a lot,
Maria
> -----Original Message-----
> From: LHC Computer Grid - Rollout [mailto:[log in to unmask]] On
> Behalf Of Catalin Condurache
> Sent: 28 August 2013 15:23
> To: [log in to unmask]
> Subject: Re: [LCG-ROLLOUT] topBDII issues
>
> As an update to my issue:
>
> While the node appears to be stuck at the 'Logging Errors' step, the only 'active'
> related process was:
>
> /usr/bin/python /usr/sbin/bdii-update -c /etc/bdii/bdii.conf -d
>
>
> [root@lcgbdii03 ~]# ps axfww|grep bdii
> 5523 pts/0 S+ 0:00 \_ grep bdii
> 31271 ? Ssl 0:32 /usr/sbin/slapd -f /etc/bdii/bdii-top-slapd.conf -h ldap://0.0.0.0:2170 -u ldap
> 31278 ? S 1:00 /usr/bin/python /usr/sbin/bdii-update -c /etc/bdii/bdii.conf -d
> 4470 ? S 0:00 \_ sh -c ldapadd -d 256 -x -c -h localhost -p 2170 -D o=glue -w d3VY8cwlr >/dev/null 2>/var/lib/bdii/add.err
>
> [root@lcgbdii03 ~]# ps axfww|grep slap
> 5526 pts/0 S+ 0:00 \_ grep slap
> 31271 ? Ssl 0:32 /usr/sbin/slapd -f /etc/bdii/bdii-top-slapd.conf -h ldap://0.0.0.0:2170 -u ldap
>
> [root@lcgbdii03 ~]# strace -p 31271
> Process 31271 attached - interrupt to quit
> futex(0x7f96f28bd9d0, FUTEX_WAIT, 31283, NULL^C <unfinished ...>
> Process 31271 detached
>
> [root@lcgbdii03 ~]# strace -p 31278
> Process 31278 attached - interrupt to quit
> write(4, "ronmentappname: VO-atlas-AtlasPh"..., 4096) = 4096
> write(4, "info: InfoProviderHost=creamce.i"..., 4096) = 4096
> write(4, "therinfo: InfoProviderHost=baaf0"..., 4096) = 4096
> write(4, "n-16.6.2.2-i686-slc5-gcc43-opt_l"..., 4096) = 4096
> write(4, "as-production-17.2.0.4-i686-slc5"..., 4096) = 4096
> write(4, "08-28T12:34:53Z\nglue2entityother"..., 4096) = 4096
> write(4, "Z\nglue2entityotherinfo: InfoProv"..., 4096) = 4096
> write(4, "7383,GLUE2GroupID=resource,GLUE2"..., 4096) = 4096
> write(4, "ntity\nobjectclass: GLUE2Applicat"..., 4096) = 4096
> write(4, "vironment\nglue2entitycreationtim"..., 4096) = 4096
> write(4, "E2ResourceID=clrccece02.in2p3.fr"..., 4096) = 4096
> write(4, "lement_Manager\nglue2applicatione"..., 4096) = 4096
> write(4, "-lcg.cr.cnaf.infn.it\nglue2applic"..., 4096) = 4096
> write(4, "2013-08-28T12:48:57Z\nglue2entity"..., 4096^C <unfinished ...>
> Process 31278 detached
>
> [root@lcgbdii03 ~]# strace -p 4470
> Process 4470 attached - interrupt to quit
> wait4(-1, ^C <unfinished ...>
> Process 4470 detached
>
> [root@lcgbdii03 ~]#
>
>
> Regards,
> Catalin
>
>
>
> > -----Original Message-----
> > From: LHC Computer Grid - Rollout [mailto:[log in to unmask]]
> > On Behalf Of Catalin Condurache
> > Sent: 28 August 2013 13:23
> > To: [log in to unmask]
> > Subject: [LCG-ROLLOUT] topBDII issues
> >
> > Hi,
> >
> > I am experiencing problems with the topBDII service at RAL. Two out of
> > three nodes (part of the lcgbdii.gridpp.rl.ac.uk alias) are apparently
> > hanging while 'logging errors' (/var/log/bdii/bdii-update.log) and are
> > not accessible for ldap queries.
> >
> > 2013-08-28 13:09:50,772: [DEBUG] Doing Fix
> > 2013-08-28 13:10:09,882: [DEBUG] Writing new_ldif to disk
> > 2013-08-28 13:10:10,486: [INFO] Reading old LDIF file ...
> > 2013-08-28 13:10:10,486: [DEBUG] Starting Diff
> > 2013-08-28 13:10:29,248: [DEBUG] Finished Diff
> > 2013-08-28 13:10:29,249: [DEBUG] Sorting Add Keys
> > 2013-08-28 13:10:30,551: [DEBUG] Writing ldif_add to disk
> > 2013-08-28 13:10:32,184: [DEBUG] Adding New Entries
> > 2013-08-28 13:10:32,520: [DEBUG] Logging Errors
> >
> >
> > Restarting the service (or even rebooting the nodes) didn't improve
> > the situation.
> >
> > In the past we correlated similar behaviour to network disruptions,
> > but no such thing today (as far as we know), and also a 'bdii restart'
> > used to work in the past.
> >
> > I am running bdii-5.2.17-2.el6.noarch
> >
> > Any help or ideas would be much appreciated.
> >
> > Many thanks,
> > Catalin Condurache
> > RAL Tier1 Grid Services
> >
> > --
> > Scanned by iCritical.