Hi,

the problem of what is available on a site, or what RHXX or SLXX really is, has been with us from the first days of EDG. The initial approach there was to ask the few VOs that were active what they wanted. The result was a minimalist node: almost all dependencies satisfied by the OS were driven by the middleware, and the experiment software mandated that all the systems needed permanent updates to satisfy the requirements of the VOs as they became active or changed their software. After some very emotional meetings (WE WANT THE WHOLE REDHAT!!!!!) EDG, and later the early LCG, took a more pragmatic approach: a reference system was defined that was used during integration and testing and that was forced on all the sites. This was not without conflicts with the experiments, but the situation was somewhat predictable for them. The sites, of course, could hardly accept a world with all systems running identical versions. This collided with local users' requirements and will collide with the reality of multiple (read: MANY) VOs.

During the 2004 data challenges the experiments (at least some of them) became very pragmatic: they started to ship almost everything they need with their software. This is not always efficient, but it certainly gives predictable results. If you want, they did a kind of poor man's user-mode virtualization of the resources.

If one looks a bit harder at this, it becomes clearer why at least this kind of control is needed even for a single VO in a real production environment. A typical use case is that inside a collaboration (VO for non-HEPs) the researchers can't all switch to new versions of their analysis code at the same time. Until a paper/thesis is finished it can be very confusing to switch. This means that at the same time the VO will require several lists of versions of libraries to be on the WNs.
Multiply this by the number of VOs that sites already allow to use their resources and it becomes clear that even publishing the list of versions (ignoring for a while the security implications) is a nightmare; managing a site like this is just far too time consuming.

In my view the sites should only have to install a minimal set of software. (In an ideal world gLite would have only trivial dependencies.) The VOs then distribute, independently from the application software releases, their preferred environment. These environments should be tagged, and the tag name can be published. On sites with a strong affiliation with a VO these environments can be added to the WNs directly, but they can also be installed like application software. The VO publishes a compatibility matrix between their different environment versions and the different releases of their software.

What is important for this lightweight virtualization to work:

- We have to improve the ease with which the VOs' software managers can distribute their software to the sites. This includes the packaging of "environments".
- These packaged environments have to be made available to interested sites to allow them to install the software locally.
- We will need sufficient space to install all the versions in the shared space/locally.
- A mechanism has to be put in place to select the wanted versions of environments and software in the JDL. This has to be used not only in the matchmaking process; in addition it has to control the correct setup of the environment variables when the job starts.
- A system with quite similar functionality, but for the selection of different middleware releases/flavors, will be part of LCG-2-4-0. It should be possible to extend this for use by the VOs.
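As a sketch of what the JDL selection could look like (the tag names below are made up for illustration; only the Member() function and the GlueHostApplicationSoftwareRunTimeEnvironment attribute are existing JDL/Glue machinery):

```
// Hypothetical JDL fragment: match only sites that publish both the
// (invented) environment tag and a compatible application release in
// the Glue RuntimeEnvironment list.
Requirements =
    Member("VO-myvo-env-1.2",
           other.GlueHostApplicationSoftwareRunTimeEnvironment)
 && Member("VO-myvo-app-3.0",
           other.GlueHostApplicationSoftwareRunTimeEnvironment);
```

The same pair of tags would then have to be handed to the job wrapper at start-up so that the matching environment variables get set before the application runs.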
I am convinced that in the end we have to go even one step further and provide some virtual machine concept, or the computing culture on the grid has to go back to where it was in the golden days of F77, when applications had almost no individual dependencies... But this is not a practical solution in the near future (despite the fact that there are already some systems available, CHOS to name one).

As for running on completely different platforms (I mean different as in IRIX, Windows, Mac OS X, etc.): certainly possible to handle, but not as burning an issue as the handling of different Linux distributions.

markus

On Mar 1, 2005, at 4:22 PM, Laurence wrote:
> Hi,
>
> We touched upon this issue during a recent Glue Schema discussion.
>
> The common consensus is that the tag published by the information system
> should be defined to be the output of a command such as what Steve T
> suggested: "/usr/bin/lsb_release -d".
> The value published in the information system should not be used to try
> and work out if your application can run at a site. With most VOs, the
> software manager will install the VO software on the site and publish a
> tag in the RuntimeEnvironment so that the VO's jobs can then be steered
> to that site. The information could be used to help the VO manager
> install the software, but they should run some kind of probe job at the
> site to check if what they require is on the worker node.
>
> Laurence
>
> *******************************************************************************
> Markus Schulz CERN IT