Hi John,
> I believe that each batch queue could be declared a subcluster, so
> that the correct information is also published (including OS etc).
> [Stephen (Burke), is my assumption correct?]. This may require a change
> to the GIP configuration. I have not implemented a subcluster solution
> as the nodes are a similar SPEC to those published by the site.
I've experimented with the SubCluster entity of the GLUE schema and
defined different queues on two sub-clusters with very different specs at
Birmingham. Indeed, I had to change the GIP configuration: I rewrote
lcg-info-generic.conf (that was back in the 2.6 days; the file is now
completely different). The information I published seemed consistent (and
gstat didn't complain about it) and my job submission tests to both
sub-clusters were successful. However, I ran into a problem with Apel:
although it did successfully retrieve the cluster and sub-cluster IDs, it
didn't use them in the join that creates the final LCG accounting
record. I discussed this with Dave Kant, and fixing it would have
required changes to the database schema (which naturally doesn't fit
in a production environment, even more so when it would benefit only one
site). We found an alternative solution, which consists of splitting the
PBS log files and running Apel twice, once over each split log file. The
distinction between the sub-clusters is then made by using different site
names in the Apel config files. The convention we agreed on was to add a
prefix to the site name to distinguish between the sub-clusters.
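
As a rough illustration of the log-splitting step (a sketch only, not the
script actually used at Birmingham): it assumes the usual PBS/Torque
accounting record layout of "date time;type;job_id;key=value ..." with a
queue= field, and the queue names and sub-cluster labels in the mapping
are made up for the example.

```python
from collections import defaultdict

# Hypothetical mapping from PBS queue name to sub-cluster label;
# replace with the queues actually defined at the site.
QUEUE_TO_SUBCLUSTER = {"short": "sc1", "long": "sc1", "bigmem": "sc2"}


def queue_of(record):
    """Return the queue named in a PBS accounting record, or None."""
    # Assumed layout: date time;record_type;job_id;key=value key=value ...
    parts = record.rstrip("\n").split(";", 3)
    if len(parts) < 4:
        return None
    for field in parts[3].split():
        if field.startswith("queue="):
            return field[len("queue="):]
    return None


def split_records(lines, queue_map=QUEUE_TO_SUBCLUSTER):
    """Group accounting records by sub-cluster label.

    Records with no queue field, or with a queue missing from the
    mapping, are left out of the result.
    """
    buckets = defaultdict(list)
    for line in lines:
        label = queue_map.get(queue_of(line))
        if label is not None:
            buckets[label].append(line)
    return dict(buckets)
```

Each group of records would then be written to its own directory, and
Apel run once per directory with the matching prefixed site name in its
config file.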
I hope this helps anyone interested in the sub-cluster approach. If you
decide to go down that route, I strongly advise that you consult Dave first.
There were also some issues with publishing the VO software, and in
the end I opted for the two-CE solution.
Yves