Markus,
I think this would be a good occasion to test the ScalableOpenPBS
version of PBS,
developed at supercluster.org, which includes several patches and bugfixes
and is still free.
This could be the easiest solution to implement for this problem. However,
I am not 100% sure
this bug is fixed in that release.
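Independently of the PBS version, one possible safeguard would be to
run qstat under a timeout in the script that populates the local MDS,
so that a hanging call cannot block everything above it. The following
is only a minimal sketch: the helper name run_with_timeout and the
30 second default are my assumptions, and I do not know how the real
information provider script invokes qstat.

```python
import subprocess


def run_with_timeout(cmd, timeout_s=30):
    """Run cmd and return its stdout, or None if it fails or hangs.

    run_with_timeout and timeout_s are hypothetical names chosen for this
    sketch; they are not part of PBS or of the MDS scripts.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=timeout_s,  # the child is killed once this expires
        )
    except subprocess.TimeoutExpired:
        # The command blocked (as qstat did here): give up instead of waiting.
        return None
    except OSError:
        # The command could not be started at all (e.g. not installed).
        return None
    if result.returncode != 0:
        return None
    return result.stdout
```

A caller could then do something like run_with_timeout(["qstat"]) and,
on None, publish stale or empty data for that CE instead of blocking.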
Cheers,
Andrea
SCHULZ wrote:
> Hi,
> a big thank you to Andrea and Gonzalo for their fast reaction.
>
>
> After a lot of tracing we finally found out why the IS in the east
> region came to a standstill.
>
> The reason was the following:
>
> A single worker node at CERN that is (was) connected to the CE adc0015
> was in a strange state that caused the PBS call qstat to
> block.
>
> This call is used by the script that populates the local MDS.
> As a result the local MDS blocked, which in turn made the site GIIS block.
>
> Since the site GIIS is registered with both regional MDSes, this
> prevented the population of the two regional GIISes.
>
> As a result we saw only the western sites in the BDII.
>
>
> This is very stupid and we will do something about it very soon.
>
>
> markus