The WMS indeed chooses a queue, which maps to a cluster, which maps to a
single subcluster.
You want to remove this constraint, so the WMS should choose a queue
plus a subcluster, right?
Fine, but then the problem is twofold: submitting the job to that queue
(which is simple) and restricting the execution of that job to nodes
that belong to the chosen subcluster.
This latter part is the tricky one: it is possible, but the concept of
subcluster is not "standard", and there is no standard way to
partition the WNs into subclusters or to submit a job to a specific
subcluster.
This can, however, be implemented in different ways for different
sites and different batch systems.
A possible approach would be to use the forwarding of requirements to
the batch system, described here:
> https://wiki.italiangrid.it/twiki/bin/view/CREAM/UserGuide#3_2_Forward_of_requirements_to_t
The chosen subcluster id would then be available in
xxx_local_submit_attributes.sh, and the site-specific logic that steers
the job to a WN of the chosen subcluster would be implemented in that
xxx_local_submit_attributes.sh script.
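As an illustration only (not a tested site configuration): for Torque/PBS the hook would be pbs_local_submit_attributes.sh, and whatever it prints is inserted into the batch submit file by BLAH. The variable name SubClusterId below is a made-up CERequirements attribute, and the sketch assumes each WN carries a Torque node property equal to its subcluster id.

```shell
#!/bin/sh
# pbs_local_submit_attributes.sh -- sketch of a possible site hook.
# Assumption: the chosen subcluster id is forwarded via CERequirements
# and arrives here as $SubClusterId (hypothetical attribute name), and
# every WN of that subcluster has a matching Torque node property.
emit_subcluster_directive() {
    if [ -n "$SubClusterId" ]; then
        # Print a PBS directive restricting the job to nodes that
        # carry the subcluster's node property.
        echo "#PBS -l nodes=1:${SubClusterId}"
    fi
}

emit_subcluster_directive
```

With a different batch system the same idea applies, only the emitted directive changes (e.g. an SGE queue/complex or an LSF resource requirement string).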
Does this make sense?
Cheers, Massimo
On 02/22/2012 05:27 PM, Stephen Burke wrote:
> LHC Computer Grid - Rollout [mailto:[log in to unmask]] said:
>> 2) It is far from clear to me whether EMI CREAM is ready for glite
>> cluster. What I mean is something like this:
>
> One general point is that glite-cluster is purely about YAIM configuration: it doesn't do anything you couldn't previously do with configuration by hand, and it doesn't change the structure of the information in the BDII or the way that e.g. the WMS interprets it. CREAM as a service (as opposed to a node type) has no interaction at all with glite-cluster.
>
>> c) as far as I know there is no connection -- and here I mean a real,
>> deployed, in production connection, I do not mean a connection like "in
>> theory a site could write its own plugin for BLAH", nor do I mean "you
>> should define a separate queue for each subcluster" -- between CREAM
>> and the batch system that will make sure that torque knows we can only
>> submit to that particular subcluster. so the job most likely goes to a
>> different subcluster and gets killed for exceeding its walltime.
>
> This is a different question, and indeed that is still not supported (Massimo may like to comment on what will happen with glue 2 matching in the WMS). So with or without glite-cluster you still have the restriction of having only one subcluster per cluster, and hence per CE. What glite-cluster does let you do is publish each distinct subcluster only once, instead of once per CE node.
>
>> So unless I want to write BLAH plugins, or multiply the number of CE
>> endpoints we publish (and amount of junk in the BDII) by four (the
>> number of different subclusters we have), we should NOT install and use
>> glite-cluster.
>
> The number of CEs stays the same (unless you choose to change them for other reasons), the change is to reduce the number of Clusters and SubClusters by removing duplicate publication of the same information. And conversely, if you have four subclusters then you should *already* have multiplied the number of CEs by four, i.e. each CE should refer to one and only one subcluster.
>
> Stephen