I agree with the Mancunians. I don't know where you got the idea, Stuart, that GRAM worked with big clusters. Perhaps it does if the big cluster runs a single MPI job with another one or two in the queue, but it was the number of jobs that killed it. There was a process running for each job, and they talked to the LRMS all the time. This is the main thing the LCG-CE addressed, which let us run our clusters. As Andrew said, GRAM couldn't cope with even his 80 node cluster.
For me the bigger problem was that Globus was a toolkit and they never committed to developing it to cope with projects' requirements. When EDG developed things because they needed them, new releases of GT would play catch-up. So, MDS had features added which made it more like BDII. It would have been better if we had seen these things in their development plan and worked with them to produce them only once, but that never seemed possible because of IP and other issues.
I believe that the EMI strategy of not presenting a GRAM (or even a BES) interface is wrong, as there are indeed many use cases for it, if it were scalable. The LCG-CE (and other components) are now the Globus Toolkit 4 Pre-Web Services version, i.e. the latest version compatible with GT2.
John
> -----Original Message-----
> From: Testbed Support for GridPP member institutes [mailto:TB-
> [log in to unmask]] On Behalf Of Alessandra Forti
> Sent: 29 September 2011 00:28
> To: [log in to unmask]
> Subject: Re: Early Adopters/Staged Rollout in the UK
>
> In the US they like to use it because it's a US product and they get
> funding for that.
>
> In HEP it is not liked because it needed a lot of hacks to make it work
> on a big scale. If it hadn't been for CERN and the work they put in to keep
> the Globus-based lcg-CE going, it wouldn't have lasted that long.
>
> cheers
> alessandra
>
>
>
>
> On 28/09/2011 19:11, Stuart Purdie wrote:
> > Two, rather separate points:
> >
> > On 28 Sep 2011, at 14:43, Daniela Bauer wrote:
> >
> >> So far I had one volunteer (Raul) admitting to installing EMI software
> >> and being willing to submit a report about it who wasn't on the list
> >> before (to save you from clicking, here it is)
> >>
> >> UKI-SCOTGRID-GLASGOW CREAM EMI 1.0
> > Also run a Developer Special EMI DPM headnode. I suspect that this is
> the prototype EMI Preview case too. Most important detail: we're unlikely
> to ever be official SR people for DPM up here.
> >
> > ... oh yeah, EMI Preview - think the Early Early Adopters. Eventually
> we'll get to the point where the developers actually run the software
> themselves, which might speed up the process of bug catching, but for the
> moment the talk is of one stage closer to the developers...
> >
> > Separately:
> >
> >> Also EMI has come back to me, asking about IGE (globus) - apparently
> >> the UK are avid globus users, but I have to admit, this is the first I
> >> hear of it (IGE that is, somehow I was under the impression that
> >> globus is integrated in various bits of middleware and that was that).
> > Let me take a wander through the past, to give some background that (I
> hope) will be useful in placing these things in context.
> >
> > Globus is 'officially' a Toolkit, which can be used to build production
> infrastructure. It does, however, contain a number of components that can
> be deployed as-is, although it makes a few assumptions about things if you
> do that (e.g. no VOMS server support). Most of the current production grid
> infrastructure is based on Globus Toolkit 2 (GT2), and the lcg-CE is mostly
> a hacked-up GT2 Gatekeeper (think CE), modified to work with VOMS and other things.
> GT3 and GT4 were the Web Service wilderness years, and the problem was that
> few people wanted webservice stuff at the time, and those that did got
> burned when GT4 decided to use a totally incompatible state model, and also
> a totally incompatible wire format from GT3; and did it right around the
> point in time when GT3 stuff was just about mature enough to seriously
> deploy (i.e. just before the documentation was usably complete). Thus very
> little of GT3 or GT4 ever reached a production Grid. GT3 did introduce
> OGSA, the Open Grid Services Architecture, through which much time has been
> wasted waffling about Service Orientated Architecture [0].
> >
> > With GT5 they went back to the GT2 stuff, and backported all the various
> improvements over the years, and worked from there. It includes an updated
> Gatekeeper.
> >
> > People like the Globus Gatekeeper for, as far as I can see, three key
> reasons.
> >
> > 1. It is simple to deploy - one provider, one install, done.
> > 2. It is simple to use - You gridFTP files from A to B, and you submit
> your job to a cluster, and it runs there. That's it.
> > 3. It has a lot of mind share - for better or worse Ian Foster invented
> the term Grid [1], and as he runs the Globus project, there is a strong
> connection that Globus == Grid [2]
> >
> > The biggest reason (as far as I can tell) that the Globus Gatekeeper
> isn't used so much for HEP over here is most strongly related to point 2, and
> weakly to 3 (i.e. the old Not Invented Here syndrome). The key reason,
> however, is that the Gatekeeper is designed for larger clusters - the
> typical cluster size in the USA is 5000 to 80 000 cores. With an
> expectation of larger, and hence fewer, clusters, there are some things that
> are different. Firstly - metascheduling - not in the normal tools. You
> pick a cluster, and send it there, manually. Although not relevant now
> (because of the growth of pilot jobs, moving this concern into experiment
> frameworks), data-dependent metascheduling was a big thrust of CE
> development work in EDG/EGEE.
> >
> > So: IGE
> >
> > IGE is taking the GT5 components, and modifying them to fit within a
> European Grid Context - i.e. VOMS, Argus, LCMAPS/LCAS on the security
> side, and I think the intent is to get the gLite WMS to work with an IGE
> Gatekeeper [3], and so on. It will also act as n-th level support (for
> some value of n) for people using it in Europe.
> >
> > The main intent of this is to have a set of services to kill off the
> older VDT/raw Globus stuff, that will co-exist with other Grid systems, for
> those that want to use pretty much basic Globus.
> >
> >
> > As has been mentioned, the NGS (specifically: Dave Wallom, as Technical
> director) is a fan of Globus. With the retirement of the National Four
> Clusters, we did see a noticeable increase in use of 'the' NGS WMS, we did
> see an increase in use of the gLite stack, but the general recommendation
> from NGS has (thus far) always been toward Globus.
> >
> > I do not know what the NGS's plans are, re IGE. It would be a natural
> technology for them to adopt, on paper; but there may well be other factors
> I'm not aware of.
> >
> >
> > Frankly, my opinion (with a GridPP hat on)? IGE is irrelevant to us
> unless and until WLCG / HEP experiments ask for it. It may have value for
> NGS type users, but I'm of the opinion that they can ask us, rather than
> the other way round (new technology should be driven from the User to the
> operations team, and not the other way round).
> >
> >
> > [0] And even more effort wasted. All the promise that people attribute
> to SOA, other than a simple "A Service is a Library, with multi-language
> bindings", requires most of the stuff normally called Semantic Web - and
> _that_ currently can only work within the contexts of well defined
> Ontologies.
> > [1] Even although Miron Livny had been doing it for years by then, with
> Condor.
> > [2] Despite the fact that Grid is a stupid name, and Globus fits less of
> the key aspects than many other things.
> > [3] From memory, may not be accurate.