Rod Walker wrote:
> Hi,
> I suspect that sg01-lcg.cr.cnaf.infn.it:2170 is stuck, since it is
> publishing zero queued jobs despite many of my jobs queuing there (it's
> lsf so I can't check for sure - where is qstat?).
> I've mailed the admins directly but if this is due to the bad bdii version
> distributed with 2.4.0 then I would think many sites still use this. It's
> a particularly nasty bug as it can attract thousands of jobs to the
> affected site. As such I would say it's a candidate for an "urgent patch",
> if such a thing exists.
For YAIM-based installations one can update the BDII as follows:

    apt-get update && apt-get -y install lcg-BDII

or

    yum install lcg-BDII

For LCFGng-based installations, fetch the rpm manually with "wget":

    http://grid-deployment.web.cern.ch/grid-deployment/RpmDir/external/lcg-bdii-3.2.6-1.noarch.rpm

Copy it to /opt/local/linux/7.3/RPMS/external/ and run "make" in that
directory. Update /opt/local/linux/7.3/rpmcfg-LCG-2_4_0/LCG-BDII-rpm.h
(or wherever that file sits) so that it lists "lcg-bdii-3.2.6-1".
Then run "/etc/obj/updaterpms start" on all your service nodes.

Note: neither procedure restarts the BDII processes, so you have to
restart them explicitly, particularly on top-level BDII nodes and CEs.
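
For those who prefer to script it, the LCFGng steps above could be
sketched roughly as follows. This is only an illustration, not a tested
tool: the function names (run, lcfg_server_steps, service_node_step)
and the DRY_RUN switch are made up for this example, and the paths are
the ones quoted above, which may differ at your site. By default it
only prints the commands; set DRY_RUN=0 to actually execute them.

```sh
#!/bin/sh
# Sketch of the LCFGng update steps; names and DRY_RUN are illustrative.
# DRY_RUN defaults to 1, so commands are printed rather than executed.
DRY_RUN="${DRY_RUN:-1}"

RPM_URL="http://grid-deployment.web.cern.ch/grid-deployment/RpmDir/external/lcg-bdii-3.2.6-1.noarch.rpm"
RPM_DIR="/opt/local/linux/7.3/RPMS/external"   # adjust if your repository lives elsewhere

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Steps for the LCFG server: fetch the rpm into the repository directory
# and run "make" there, as described in the procedure.
lcfg_server_steps() {
    run wget -P "$RPM_DIR" "$RPM_URL"
    run make -C "$RPM_DIR"
    # Manual step, not automated here: edit LCG-BDII-rpm.h so that it
    # lists lcg-bdii-3.2.6-1.
}

# Step to run on each service node once the server side is done.
service_node_step() {
    run /etc/obj/updaterpms start
}
```

Call lcfg_server_steps on the LCFG server, edit the LCG-BDII-rpm.h file
by hand, and then call service_node_step on each service node. The BDII
restart is deliberately left out, since it still has to be done
explicitly on each node.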