Some time ago there was a thread,
"GlueVOViewLocalID in config_gip",
on LCG-ROLLOUT
about a problem in the information system caused by
the initial versions of lcg-yaim in 2_6_0 (bug #10295).
It seems to me that many sites that upgraded before this problem was fixed still publish
unnecessary entries in the information system, putting extra load on the top-level BDIIs and on their own GIISes.
A site that would normally publish about 50 entries publishes more than 100, because the old version of
config_gip generates an O(n^2) number of entries.
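To make the quadratic growth concrete, here is a minimal Python sketch. It assumes a CE with one queue per VO, and it assumes the old config_gip emitted a GlueVOView for every VO under every queue; both are my reading of the symptom, not something taken from the config_gip source. The host ce.example.org, the VO list and the function name are made up for illustration.

```python
def voview_dns(vos, ce_host, per_queue_only=True):
    """Return the GlueVOView DN prefixes a CE with one queue per VO would publish.

    per_queue_only=True  -> one VOView per queue, for the VO that owns it (n entries)
    per_queue_only=False -> a VOView for every VO under every queue (n*n entries),
                            which is the assumed behaviour of the old config_gip
    """
    dns = []
    for queue_vo in vos:
        ce_id = f"{ce_host}:2119/jobmanager-lcgpbs-{queue_vo}"
        for vo in vos:
            if per_queue_only and vo != queue_vo:
                continue
            dns.append(f"GlueVOViewLocalID={vo},GlueCEUniqueID={ce_id}")
    return dns


# Hypothetical site supporting four VOs
vos = ["atlas", "cms", "dteam", "biomed"]
expected = voview_dns(vos, "ce.example.org")
inflated = voview_dns(vos, "ce.example.org", per_queue_only=False)
print(len(expected), len(inflated))
```

With four VOs this gives 4 VOView entries in the expected case and 16 in the inflated one, so the surplus grows quickly with the number of supported VOs.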
By running
ldapsearch -x -H ldap://lcg-bdii.cern.ch:2170 -b o=grid|egrep "dteam.*atlas"
and deleting some entries that do not seem to be relevant to the problem, I got the
list of sites that seem to be affected and support ATLAS:
ce.phy.bg.ac.yu
ce.ui.savba.sk
ce-iep-grid.saske.sk
ce01.pic.es
ce01-lcg.projects.cscs.ch
fal-pygrid-18.lancs.ac.uk
fornax-ce.itwm.fhg.de
gdsuf.phys.ufl.edu
gw39.hep.ph.ic.ac.uk
ifaece01.pic.es
ingvar.nsc.liu.se
lcgce.ijs.si
lcgce01.phy.bris.ac.uk
lcg00125.grid.sinica.edu.tw
node001.grid.auth.gr
pcncp04.ncp.edu.pk
t2-ce-01.to.infn.it
Testing for ESR, biomed and CMS additionally gave me:
ce.grid.tuke.sk
egeece.ifca.org.es
ekp-lcg-ce.physik.uni-karlsruhe.de
hephygr.oeaw.ac.at
xg009.inp.demokritos.gr
The unnecessary entries look like this:
# dteam, ce01.pic.es:2119/jobmanager-lcgpbs-atlas, pic, local, grid
dn: GlueVOViewLocalID=dteam,GlueCEUniqueID=ce01.pic.es:2119/jobmanager-lcgpbs-atlas,mds-vo-name=pic,mds-vo-name=local,o=grid
objectClass: GlueCETop
objectClass: GlueVOView
objectClass: GlueCEInfo
objectClass: GlueCEState
objectClass: GlueCEAccessControlBase
objectClass: GlueCEPolicy
objectClass: GlueKey
objectClass: GlueSchemaVersion
GlueVOViewLocalID: dteam
GlueCEAccessControlBaseRule: VO:dteam
GlueCEStateRunningJobs: 0
GlueCEStateWaitingJobs: 0
GlueCEStateTotalJobs: 0
GlueCEStateFreeJobSlots: 0
GlueCEStateEstimatedResponseTime: 0
GlueCEStateWorstResponseTime: 0
GlueCEInfoDefaultSE: castorsrm.pic.es
GlueCEInfoApplicationDir: /nfs/sw/dteam/pic
GlueCEInfoDataDir: /castor/pic.es/grid/dteam
GlueChunkKey: GlueCEUniqueID=ce01.pic.es:2119/jobmanager-lcgpbs-atlas
GlueSchemaVersionMajor: 1
GlueSchemaVersionMinor: 2
That is why I searched for something like:
# dteam, gdsuf.phys.ufl.edu:2119/jobmanager-condor-usatlas, local, grid
# dteam, lcgce.ijs.si:2119/jobmanager-pbs-atlas, SiGNET, local, grid
# dteam, ifaece01.pic.es:2119/jobmanager-lcgpbs-atlas, ifae, local, grid
# dteam, ingvar.nsc.liu.se:2119/jobmanager-lcgpbs-atlas, nsc, local, grid
# dteam, fornax-ce.itwm.fhg.de:2119/jobmanager-lcgpbs-atlas, ITWM, local, gri
# dteam, pcncp04.ncp.edu.pk:2119/jobmanager-lcgpbs-atlas, NCP-LCG2, local, gr
# dteam, ce.phy.bg.ac.yu:2119/jobmanager-pbs-atlas, AEGIS01-PHY-SCL, local, g
# dteam, ce.ui.savba.sk:2119/jobmanager-pbs-atlas, IISAS-Bratislava, local, g
# dteam, gw39.hep.ph.ic.ac.uk:2119/jobmanager-lcgpbs-atlas, IC-LCG2, local, g
# dteam, node001.grid.auth.gr:2119/jobmanager-lcgpbs-atlas, GR-01-AUTH, local
# dteam, t2-ce-01.to.infn.it:2119/jobmanager-lcgpbs-atlas, INFN-TORINO, local
# dteam, ce-iep-grid.saske.sk:2119/jobmanager-lcgpbs-atlas, IEPSAS-Kosice, lo
# dteam, ce01-lcg.projects.cscs.ch:2119/jobmanager-lcgpbs-atlas, CSCS-LCG2, l
# dteam, fal-pygrid-18.lancs.ac.uk:2119/jobmanager-lcgpbs-atlas, Lancs-LCG2,
# dteam, lcgce01.phy.bris.ac.uk:2119/jobmanager-lcgpbs-atlas, BRISTOL-PP-LCG,
# dteam, lcg00125.grid.sinica.edu.tw:2119/jobmanager-lcgpbs-atlas, Taiwan-LCG
.............
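The manual filtering of such comment lines could be partly automated. The following Python sketch reads ldapsearch output and reports the CE hosts that publish a VOView whose VO name does not appear in the queue name. This is only a heuristic and assumes per-VO queue names like the ones above; a site with shared queues (for example one simply named "short") would be flagged wrongly, so the result still needs a manual look. The function name and regular expression are my own.

```python
import re

# Matches ldapsearch comment lines such as
# "# dteam, ce01.pic.es:2119/jobmanager-lcgpbs-atlas, pic, local, grid"
COMMENT_RE = re.compile(
    r"^#\s*(?P<vo>[^,\s]+),\s*(?P<host>[^:,\s]+):\d+/jobmanager-(?P<queue>[^,\s]+)"
)


def suspicious_hosts(lines):
    """Return sorted CE hosts with a VOView for a VO not named in the queue."""
    hosts = set()
    for line in lines:
        m = COMMENT_RE.match(line)
        # Heuristic: the entry is suspicious if the VO name is not part of the queue name
        if m and m.group("vo") not in m.group("queue"):
            hosts.add(m.group("host"))
    return sorted(hosts)


sample = [
    "# dteam, ce01.pic.es:2119/jobmanager-lcgpbs-atlas, pic, local, grid",
    "# atlas, ce01.pic.es:2119/jobmanager-lcgpbs-atlas, pic, local, grid",
    "# dteam, gdsuf.phys.ufl.edu:2119/jobmanager-condor-usatlas, local, grid",
]
print(suspicious_hosts(sample))
```

Since only the part of the comment line up to the queue name is used, the truncation of long lines in the ldapsearch output does not matter here.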
I believe this does not cause a problem for job submission or replica management, but
I think the increase in the number of entries in the site GIISes, and consequently in the top-level BDIIs,
should be avoided.
I wonder how this can be fixed, though?
Emanouil Atanassov