Hi Andrew,
There may be a number of reasons, so we need to debug this (maybe off
TB-SUPPORT?), for example:
Does
/opt/lcg/libexec/lcg-info-wrapper | grep GlueCEStateWaitingJobs
return the same info when run as root and as edginfo?
As user edginfo, what does 'diagnose -g' return?
...
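
The root versus edginfo comparison could be scripted roughly as below. This
is only a sketch: it assumes it is run as root, that "su -s /bin/sh" works
on the CE, and the helper names waiting_jobs and same_output are made up
for illustration.

```shell
#!/bin/sh
# Sketch only: compare what root and edginfo get from the info wrapper.
# Assumptions: run as root; "su -s /bin/sh" is usable on this host.

WRAPPER=/opt/lcg/libexec/lcg-info-wrapper

# Keep only the waiting-job lines from GIP output read on stdin.
waiting_jobs() {
    grep '^GlueCEStateWaitingJobs' | sort
}

# Succeed if the two files given as arguments are byte-identical.
same_output() {
    cmp -s "$1" "$2"
}

# Only attempt the comparison where the wrapper is actually installed.
if [ -x "$WRAPPER" ]; then
    tmpdir=$(mktemp -d)
    "$WRAPPER" | waiting_jobs > "$tmpdir/root"
    su -s /bin/sh edginfo -c "$WRAPPER" | waiting_jobs > "$tmpdir/edginfo"
    if same_output "$tmpdir/root" "$tmpdir/edginfo"; then
        echo "root and edginfo see the same values"
    else
        # Show which lines differ between the two accounts.
        diff "$tmpdir/root" "$tmpdir/edginfo" || true
    fi
    rm -rf "$tmpdir"
fi
```

If the two outputs differ, that would point towards a permissions or
environment problem for edginfo rather than a problem with the batch system.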
Thank you,
Yves
On Thu, 18 Jan 2007, Andrew Beresford wrote:
> Hello Yves,
>
> I've run that and it seems to have improved things, but now for each VO
> we appear to have;
>
> GlueCEStateWaitingJobs: 4444
>
> That seems unlikely!
>
> Cheers,
>
> Andrew
>
> On Thu, 2007-01-18 at 12:30 +0000, Yves Coppens wrote:
> > Hi Andrew,
> >
> > You can recreate the static-file-CE.ldif from your lcg-info-static-ce.conf
> > as follows:
> >
> > /opt/lcg/sbin/lcg-info-static-create -c \
> > /opt/lcg/var/gip/lcg-info-static-ce.conf -t \
> > /opt/lcg/etc/GlueCE.template > \
> > /opt/lcg/var/gip/ldif/static-file-CE.ldif
> >
> > It looks as if you've overwritten the lcg-info-static-ce.conf
> > in /opt/lcg/var/gip created by yaim with the template
> > in /opt/lcg/share/doc/lcg-info-templates, hence your 999999 values for the
> > maximum number of jobs,...
> >
> > I think running the yaim config_gip function will solve your problem?
> >
> > Yves
> >
> >
> > On Thu, 18 Jan 2007, Andrew Beresford wrote:
> >
> > > Burke, S (Stephen) wrote:
> > > > Andrew Beresford [mailto:[log in to unmask]] said:
> > > >
> > > >> Can you identify which ones? How can I fix that?
> > > >>
> > > >
> > > > GlueCEInfoLRMSVersion: torque_1.0.1p5
> > > > GlueCEInfoLRMSVersion: torque_1.0.1p5
> > > > GlueCEInfoTotalCPUs: 158
> > > > GlueCEInfoTotalCPUs: 158
> > > > GlueCEStateFreeCPUs: 154
> > > > GlueCEStateFreeCPUs: 154
> > > > GlueCEStateStatus: Production
> > > > GlueCEStateStatus: Production
> > > > GlueCEPolicyMaxCPUTime: 2880
> > > > GlueCEPolicyMaxCPUTime: 2880
> > > > GlueCEPolicyMaxRunningJobs: 999999
> > > > GlueCEPolicyMaxTotalJobs: 999999
> > > > GlueCEPolicyMaxWallClockTime: 5760
> > > > GlueCEPolicyMaxWallClockTime: 5760
> > > >
> > > > At a guess the items are probably duplicated in the ldif file created by
> > > > the config. I think the relevant files are somewhere like
> > > > /opt/lcg/var/gip/ldif.
> > > >
> > > > Stephen
> > > >
> > > I can see the files in that directory, but I'm unsure how they get generated.
> > >
> > > Both lcg-info-static-ce.ldif and static-file-CE.ldif seem to define
> > > these entries;
> > >
> > > [root@lcgce0 ldif]# grep -ir LRMSVersion *
> > > lcg-info-static-ce.ldif:GlueCEInfoLRMSVersion: torque-1.0.1b
> > > lcg-info-static-ce.ldif:GlueCEInfoLRMSVersion: torque-1.0.1b
> > > lcg-info-static-ce.ldif:GlueCEInfoLRMSVersion: torque-1.0.1b
> > > lcg-info-static-ce.ldif:GlueCEInfoLRMSVersion: torque-1.0.1b
> > > lcg-info-static-ce.ldif:GlueCEInfoLRMSVersion: torque-1.0.1b
> > > lcg-info-static-ce.ldif:GlueCEInfoLRMSVersion: torque-1.0.1b
> > > lcg-info-static-ce.ldif:GlueCEInfoLRMSVersion: torque-1.0.1b
> > > lcg-info-static-ce.ldif:GlueCEInfoLRMSVersion: torque-1.0.1b
> > > lcg-info-static-ce.ldif:GlueCEInfoLRMSVersion: torque-1.0.1b
> > > static-file-CE.ldif:GlueCEInfoLRMSVersion: not defined
> > > static-file-CE.ldif:GlueCEInfoLRMSVersion: not defined
> > > static-file-CE.ldif:GlueCEInfoLRMSVersion: not defined
> > > static-file-CE.ldif:GlueCEInfoLRMSVersion: not defined
> > > static-file-CE.ldif:GlueCEInfoLRMSVersion: not defined
> > > static-file-CE.ldif:GlueCEInfoLRMSVersion: not defined
> > > static-file-CE.ldif:GlueCEInfoLRMSVersion: not defined
> > > static-file-CE.ldif:GlueCEInfoLRMSVersion: not defined
> > > static-file-CE.ldif:GlueCEInfoLRMSVersion: not defined
> > >
> > > The entries get defined under DNs like...
> > >
> > > dn: GlueCEUniqueID=lcgce0.shef.ac.uk:2119/jobmanager-lcgpbs-ops,mds-vo-name=local,o=grid
> > >
> > > Is that correct?
> > >
> > > Should I just remove one of these files - or are they both required?
> > >
> > > Cheers,
> > >
> > > Andrew
> > >
>