Heya,
All my version numbers seem correct on those packages. I removed the
lcg-info packages and when I tried to rerun config_gip I get errors:
INFO: Executing function: config_gip
Adding user edginfo to group infosys
Adding user edguser to group infosys
Can't open static ldif file: /opt/lcg/etc/GlueService.template
Can't open static ldif file: /opt/lcg/etc/GlueSE.template
So it's still looking in /opt/lcg. config_bdii ran without errors, as
did config_gip_dcache, so some things look good. The yaim
versions we're using are:
[root@fal-pygrid-20 ~]# rpm -qa | grep yaim
glite-yaim-dcache-4.0.0-10
glite-yaim-core-4.0.3-13
glite-yaim-bdii-4.0.2-2
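
For anyone wanting to double-check the info packages rather than eyeball them, here is a minimal Python sketch. It assumes plain `rpm -qa` style names (name-version-release) and purely numeric dotted versions, which is a simplification of how rpm really compares versions. The minimums are the ones from Greig's list quoted below; the function names are made up for illustration.

```python
# Minimum versions taken from Greig's list in this thread.
MINIMUMS = {
    "bdii": "3.9.1",
    "glite-info-generic": "2.0.1",
    "glite-info-provider-ldap": "1.0.0",
    "glite-info-templates": "1.0.0",
    "glue-schema": "1.3.0",
}


def split_nvr(package):
    """Split an 'rpm -qa' style 'name-version-release' string into 3 parts."""
    name, version, release = package.rsplit("-", 2)
    return name, version, release


def version_tuple(version):
    """Turn '2.0.2' into (2, 0, 2). Assumption: numeric dotted versions only."""
    return tuple(int(part) for part in version.split("."))


def check_packages(installed):
    """Return {name: 'ok' | 'too old' | 'missing'} for each entry in MINIMUMS."""
    found = {}
    for package in installed:
        name, version, _release = split_nvr(package)
        found[name] = version
    report = {}
    for name, minimum in MINIMUMS.items():
        if name not in found:
            report[name] = "missing"
        elif version_tuple(found[name]) >= version_tuple(minimum):
            report[name] = "ok"
        else:
            report[name] = "too old"
    return report


if __name__ == "__main__":
    # Example input: a few package names as they appear in this thread.
    sample = [
        "glite-info-provider-ldap-1.1.0-1",
        "glite-info-templates-1.0.0-8",
        "glite-info-generic-2.0.2-3",
    ]
    for name, status in sorted(check_packages(sample).items()):
        print(name, status)
```

Feeding it the real output of `rpm -qa` (one package per line) would flag anything missing or older than the listed minimum. It says nothing about stray lcg-info packages, which still have to be removed by hand.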
However, after these changes we still have the dodgy-looking slapd
process that Peter found:
ldap 12108 0.0 0.0 13500 2544 ? Ssl 13:47 0:00
/usr/sbin/slapd -u ldap -h ldap:///
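
To make that comparison less of a squint at `ps` output, here is a rough Python sketch that pulls the owner and listening port out of each slapd command line. It assumes `ps aux` style lines (user in the first column, each process on a single line) and that an `ldap://` URL with no port means the default LDAP port 389. The helper name is hypothetical.

```python
import re

DEFAULT_LDAP_PORT = 389  # what slapd uses when the ldap:// URL names no port
BDII_PORT = 2170         # the port the BDII is expected to answer on (per this thread)


def slapd_ports(ps_lines):
    """Return a list of (user, port) for each slapd command line in ps output."""
    results = []
    for line in ps_lines:
        if "slapd" not in line:
            continue
        user = line.split()[0]
        # Look for the '-h ldap://host:port' listener argument.
        match = re.search(r"-h\s+ldap://([^\s/:]*)(?::(\d+))?", line)
        if match is None:
            continue
        port = int(match.group(2)) if match.group(2) else DEFAULT_LDAP_PORT
        results.append((user, port))
    return results


if __name__ == "__main__":
    # Example input: process lines from this thread, joined onto single lines.
    sample = [
        "ldap 12108 0.0 0.0 13500 2544 ? Ssl 13:47 0:00 "
        "/usr/sbin/slapd -u ldap -h ldap:///",
        "edguser 11689 0.3 0.2 74732 5004 ? S 13:08 0:00 /usr/sbin/slapd "
        "-f /opt/bdii//var/2173/bdii-slapd.conf -h ldap://localhost:2173 -u edguser",
    ]
    found = slapd_ports(sample)
    for user, port in found:
        print(user, port)
    if BDII_PORT not in {port for _user, port in found}:
        print("no slapd command line mentions port", BDII_PORT)
```

This only reads command lines, so it cannot see a forwarder such as bdii-fwd holding port 2170. It is just a quick way to spot a slapd that was not started from the BDII configuration.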
cheers,
Matt
On 05/03/2008, Greig Alan Cowan wrote:
> Matt,
>
> Can you remove all of the lcg-info packages and re-run YAIM? There's
> clearly been some confusion between /opt/lcg and /opt/glite . Which
> version of the BDII are you running? Make sure that it is the SL3 one.
> You should have this stuff:
>
> bdii >= 3.9.1
> glite-info-generic >= 2.0.1
> glite-info-provider-ldap >= 1.0.0
>
> glite-info-templates >= 1.0.0
>
> glue-schema >= 1.3.0
> openldap-servers
>
>
> Greig
>
>
> On 05/03/08 13:18, Matt Doidge wrote:
> > Thanks Greig and Chris,
> >
> > Ldapsearching against our SE gives us a "No ldap server" type error.
> > Looking at slapd processes on my SE I see nothing on port 2170 but
> > lots listening near it:
> >
> > edguser 11689 0.3 0.2 74732 5004 ? S 13:08 0:00
> > /usr/sbin/slapd -f /opt/bdii//var/2173/bdii-slapd.conf -h
> > ldap://localhost:2173 -u edguser
> > edguser 12303 0.6 0.2 76772 5000 ? S 13:08 0:00
> > /usr/sbin/slapd -f /opt/bdii//var/2171/bdii-slapd.conf -h
> > ldap://localhost:2171 -u edguser
> > edguser 13307 2.0 0.2 76780 4996 ? S 13:09 0:00
> > /usr/sbin/slapd -f /opt/bdii//var/2172/bdii-slapd.conf -h
> > ldap://localhost:2172 -u edguser
> >
> > Looking at /opt/glite/etc/gip/glite-info-generic.conf I see:
> > temp_dir = /opt/glite/var/tmp/gip
> > cache_dir = /opt/glite/var/cache/gip
> > lock_dir = /opt/glite/var/lock/gip
> > plugin_dir = /opt/glite/etc/gip/plugin
> > static_dir = /opt/glite/etc/gip/ldif
> > provider_dir = /opt/glite/etc/gip/provider
> > freshness = 60
> > cache_ttl = 300
> > response = 110
> > timeout = 150
> >
> > and looking at my packages I see:
> > glite-info-provider-ldap-1.1.0-1
> > glite-info-templates-1.0.0-8
> > glite-info-update-endpoints-1.0.0-5
> > glite-info-generic-2.0.2-3
> > glite-info-plugin-fcr-1.0.0-3
> > lcg-info-1.11.0-1
> > lcg-info-templates-1.0.15-1
> > lcg-info-provider-software-1.0.6-1
> > lcg-info-dynamic-software-1.0.3-3
> >
> > I've run all the yaim config scripts that I thought were relevant:
> > config_gip, config_bdii and config_gip_dcache.
> >
> > cheers,
> > Matt
> >
> > On 05/03/2008, Greig Alan Cowan wrote:
> >> What are the contents of: /opt/glite/etc/gip/glite-info-generic.conf
> >>
> >> What packages do you have installed?
> >>
> >> glite-info-generic-2.0.2-2.noarch
> >> glite-info-templates-1.0.0-4.noarch
> >> lcg-info-dynamic-dpm-2.2-2.noarch
> >> lcg-info-generic-1.0.22-1_sl3.noarch
> >> lcg-info-templates-1.0.15-1.noarch
> >>
> >> You should also run the YAIM config_gip and config_bdii functions on the
> >> dCache head node.
> >>
> >>
> >> Greig
> >>
> >>
> >> On 05/03/08 12:55, Matt Doidge wrote:
> >> > Hello,
> >> > /opt/glite/var/tmp/gip/infoDynamicSE-plugin-dcache.ldif.5079.3861
> >> > doesn't exist. The directory /opt/glite/var/tmp/gip/ contains more
> >> > empty directories; ldif plugin provider
> >> >
> >> > In /opt/bdii/var/tmp there's a GIP.ldif which seems to contain the
> >> > site information and the error file:
> >> > [root@fal-pygrid-20 ~]# cat /opt/bdii/var/tmp/stderr.log
> >> > bdb_initialize: Sleepycat Software: Berkeley DB 4.2.52: (December 3, 2003)
> >> > bdb_initialize: Sleepycat Software: Berkeley DB 4.2.52: (December 3, 2003)
> >> > could not open config file "/opt/lcg/schema/ldap/SiteInfo.schema": No
> >> > such file or directory (2)
> >> > slapadd: bad configuration file!
> >> >
> >> > Thanks for the help,
> >> > Matt
> >> > On 05/03/2008, Greig Alan Cowan wrote:
> >> >> So, does
> >> >>
> >> >>
> >> >> /opt/glite/var/tmp/gip/infoDynamicSE-plugin-dcache.ldif.5079.3861
> >> >>
> >> >>
> >> >> exist? If so, what are its contents? What else is in that directory?
> >> >>
> >> >> I meant to say /opt/bdii/var/tmp. There should be a GIP.ldif and
> >> >> sterr.out file in there.
> >> >>
> >> >>
> >> >> Greig
> >> >>
> >> >>
> >> >> On 05/03/08 12:10, Matt Doidge wrote:
> >> >> > Things look like they're running but not properly. From the bdii logs:
> >> >> > ----------------------------------------------------
> >> >> > Wed Mar 5 12:01:46 GMT 2008
> >> >> > Sleeping for 30
> >> >> >
> >> >> > Updating DB on port 2172
> >> >> > Waiting 115 s for query results.
> >> >> >
> >> >> > Time for searches: 0 s
> >> >> > current port: 47196 - OK
> >> >> > skipping 'dn: GlueServiceUniqueID=httpg://fal-pygrid-20.lancs.ac.uk:8443/srm/managerv2,mds-vo-name=resource,o=grid'
> >> >> > Time to update DB: 1 s
> >> >> > current port: 47197 - OK
> >> >> > ldap_bind: Can't contact LDAP server (-1)
> >> >> > Time to load DB: 60 s
> >> >> > Grabbing port 2170 for 2172
> >> >> > Error for sh: /opt/glite/var/tmp/gip/infoDynamicSE-plugin-dcache.ldif.5079.3861:
> >> >> > No such file or directory
> >> >> > ==> slapadd: bad configuration file!
> >> >> > -------------------------------------------------
> >> >> >
> >> >> > I don't seem to have a /opt/bdii/tmp dir. I also looked in the
> >> >> > bdii-fwd.log and saw entries like:
> >> >> > ----------------------------------
> >> >> > remote server: IO::Socket::INET: connect: Connection refused at
> >> >> > /opt/bdii//sbin/bdii-fwd line 167.
> >> >> > 20080305_120520 [Connecting to localhost...20080305_120520 Forked
> >> >> > process 4094 -> 2171
> >> >> > 20080305_120520 Reaped process 4094 (genNr 27)
> >> >> > 20080305_120551 [Connect from 194.80.35.23:53175]
> >> >> > -----------------------------------
> >> >> > 194.80.35.23 is our site-bdii (fal-pygrid-17.lancs.ac.uk).
> >> >> >
> >> >> > cheers,
> >> >> > Matt
> >> >> >
> >> >> > On 05/03/2008, Greig Alan Cowan wrote:
> >> >> >> Hi Matt,
> >> >> >>
> >> >> >> Are you sure that the BDII is running on the dCache node?
> >> >> >>
> >> >> >> $ ldapsearch -LLL -x -H ldap://fal-pygrid-20.lancs.ac.uk:2170 -b mds-vo-name=resource,o=grid
> >> >> >> ldap_bind: Can't contact LDAP server (-1)
> >> >> >>
> >> >> >> $ telnet fal-pygrid-20.lancs.ac.uk 2170
> >> >> >> Trying 194.80.35.12...
> >> >> >> Connected to fal-pygrid-20.lancs.ac.uk (194.80.35.12).
> >> >> >> Escape character is '^]'.
> >> >> >> Connection closed by foreign host.
> >> >> >>
> >> >> >> Compare with:
> >> >> >>
> >> >> >> $ telnet srm.glite.ecdf.ed.ac.uk 2170
> >> >> >> Trying 129.215.95.134...
> >> >> >> Connected to srm.glite.ecdf.ed.ac.uk (129.215.95.134).
> >> >> >> Escape character is '^]'.
> >> >> >> ^]
> >> >> >> telnet> Connection closed.
> >> >> >>
> >> >> >> What do the BDII logs in /opt/bdii/var/bdii.log and in the bdii/tmp dir say?
> >> >> >>
> >> >> >> Cheers,
> >> >> >>
> >> >> >> Greig
> >> >> >>
> >> >> >>
> >> >> >> On 05/03/08 11:48, Matt Doidge wrote:
> >> >> >> > Hello,
> >> >> >> >
> >> >> >> > Thanks for all the pointers, it really clarified things. I'm almost
> >> >> >> > there now. Running /opt/glite/libexec/glite-info-wrapper gives me
> >> >> >> > a lot of text that looks right, plus all the bdii gubbins seem to be
> >> >> >> > running. We just don't seem to be pushing info to the site bdii. I
> >> >> >> > poked relevant holes in both firewalls, and edited the
> >> >> >> > bdii-update.conf on the site bdii to have the line:
> >> >> >> > SRM ldap://fal-pygrid-20.lancs.ac.uk:2170/mds-vo-name=resource,o=grid
> >> >> >> >
> >> >> >> > One clue from the logs on our site bdii:
> >> >> >> > SRM: ldap_bind: Can't contact LDAP server
> >> >> >> >
> >> >> >> > Is there something else I should be running on the SRM node? Have I
> >> >> >> > missed something obvious?
> >> >> >> >
> >> >> >> > cheers,
> >> >> >> > Matt
> >> >> >> > On 04/03/2008, Greig Alan Cowan wrote:
> >> >> >> >> Hi Matt,
> >> >> >> >>
> >> >> >> >> Yes, things have all changed. Most things should be under /opt/glite now
> >> >> >> >> and you should be using the BDII on port 2170 as your information
> >> >> >> >> provider; globus-mds has now been retired. You should be using
> >> >> >> >> mds-vo-name=resource (making sure that your site BDII now points to this
> >> >> >> >> new endpoint).
> >> >> >> >>
> >> >> >> >> To understand the new system, you just need to follow the configuration
> >> >> >> >> files. Everything is controlled by
> >> >> >> >> /opt/glite/libexec/glite-info-wrapper. This calls all of the information
> >> >> >> >> providers and plugins. The configuration file that the info-wrapper uses
> >> >> >> >> contains pointers to where all of the plugins and providers live.
> >> >> >> >>
> >> >> >> >> Hope that all makes sense.
> >> >> >> >>
> >> >> >> >> Cheers,
> >> >> >> >>
> >> >> >> >> Greig
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> On 04/03/08 21:21, Matt Doidge wrote:
> >> >> >> >> > 11 hours after putting the SL4 CD in the tray... (okay I did treat
> >> >> >> >> > myself to a long lunch break)
> >> >> >> >> >
> >> >> >> >> > And I still haven't cracked it. I *think* we just have the infosystem
> >> >> >> >> > to crack now. But I've been trying to crack it for an age. I think
> >> >> >> >> > I've got similar problems to what the chaps at Liverpool had; this
> >> >> >> >> > SL4 beast feels very different from the SL3 critter that died
> >> >> >> >> > yesterday. What happened to globus-mds? Should I have edguser or
> >> >> >> >> > edginfo? Why did I end up installing a bdii? Why is nothing in
> >> >> >> >> > /opt/lcg/ anymore? I don't like change.
> >> >> >> >> >
> >> >> >> >> > I'm suffering from staring at the same problem for too long, but I
> >> >> >> >> > sure could do with some help tomorrow. But first food, and precious
> >> >> >> >> > sleep.
> >> >> >> >> >
> >> >> >> >> > cheers,
> >> >> >> >> > Matt
> >> >> >> >> >
> >> >> >> >> > On 04/03/2008, Greig Alan Cowan wrote:
> >> >> >> >> >> Arrrgh. Too bad Matt - there's always something with you guys!
> >> >> >> >> >>
> >> >> >> >> >> At least things were backed up so nothing much will have been lost. Let
> >> >> >> >> >> me know if you need a hand.
> >> >> >> >> >>
> >> >> >> >> >> Cheers,
> >> >> >> >> >>
> >> >> >> >> >> Greig
> >> >> >> >> >>
> >> >> >> >> >>
> >> >> >> >> >> On 04/03/08 10:48, Matt Doidge wrote:
> >> >> >> >> >> > Hello,
> >> >> >> >> >> >
> >> >> >> >> >> > This is just a heads up to the powers that be- we had an unexpected
> >> >> >> >> >> > power cut yesterday that ended up crunching the system disk of our
> >> >> >> >> >> > dcache SRM. We've tried to bring back the lost data from beyond the
> >> >> >> >> >> > grave but without success. Luckily the majority of our configs are
> >> >> >> >> >> > backed up, but we're likely to be down a fair chunk of the day.
> >> >> >> >> >> >
> >> >> >> >> >> > On the plus side, we finally got a long overdue upgrade to SL4 for our
> >> >> >> >> >> > head node... You've got to think positive in this business!
> >> >> >> >> >> >
> >> >> >> >> >> > cheers,
> >> >> >> >> >> > Matt
> >> >> >> >> >>
> >> >> >> >>
> >> >> >>
> >> >>
> >>
>