On Wed, 1 Aug 2007, Oliver Keeble wrote:
> Hi Andreas,
>
> First a couple of specifics - I know of nothing on the WN which requires
> you to install Castor, nor to use DAG. Please give details.
>
> Second, on the topic of using third party repositories, we've received a
> lot of feedback that the 'externals' repositories we have had up to now
> are not acceptable as they are unmaintained. Thus we are moving to the
> only alternative available given our resources - relying on repositories
> maintained by 3rd parties (currently only jpackage, but possibly in
> future including DAG too). CERN maintains mirrors of both of these which
> you are free to use.
Hi Oliver,
That is a lot of software (jpackage and, worse yet, DAG) to pull in for a
handful of RPMs that could be kept in the gLite externals area at CERN.
As a test this morning while eating my breakfast I created a minimal
repository from the jpackage repo to see what extras would be needed to
install a glite 3.1 32-bit worker node on a fresh SL 4 Update 5 (not SLC4)
installation:
# yum install glite-WN glite-TORQUE_client
Except for jdk 1.5, the packages not provided by SL4 and the
glite 3.1 repo at CERN are:
classpathx-jaxp-1.0-0.1.beta1.10jpp.noarch.rpm
log4j-1.2.14-3jpp.noarch.rpm
perl-SOAP-Lite-0.65.6-1.noarch.rpm
xml-commons-1.3.03-10jpp.noarch.rpm
xml-commons-jaxp-1.3-apis-1.3.03-10jpp.noarch.rpm
and perl-SOAP-Lite I had to scrounge from CERN SLC4 since it is not in the
main body of SL4 RPMs.
This isn't a big set of RPMs, IMO. Just my 2 cents' worth.
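For anyone who wants to script against the list above, here is a minimal sketch that turns those file names into bare package names. It assumes the conventional name-version-release.arch.rpm layout, and the helper name rpm_name is made up for this example. The names could then be used, for instance, to check what is actually taken from jpackage, or to build an includepkgs list for a repository section if your yum version supports that option.

```shell
#!/bin/sh
# rpm_name FILE: print the package name encoded in an RPM file name.
# Assumes the conventional layout name-version-release.arch.rpm;
# rpm_name is a hypothetical helper, not part of rpm or yum.
rpm_name() {
    f=${1##*/}   # drop any leading directory
    f=${f%.rpm}  # drop the .rpm suffix
    f=${f%.*}    # drop the architecture (e.g. .noarch)
    f=${f%-*}    # drop the release (e.g. -10jpp)
    f=${f%-*}    # drop the version (e.g. -1.3.03)
    printf '%s\n' "$f"
}

# Print the package names for the five extra RPMs listed above.
for rpm in \
    classpathx-jaxp-1.0-0.1.beta1.10jpp.noarch.rpm \
    log4j-1.2.14-3jpp.noarch.rpm \
    perl-SOAP-Lite-0.65.6-1.noarch.rpm \
    xml-commons-1.3.03-10jpp.noarch.rpm \
    xml-commons-jaxp-1.3-apis-1.3.03-10jpp.noarch.rpm
do
    rpm_name "$rpm"
done
```

Stripping from the right matters here: a name such as xml-commons-jaxp-1.3-apis itself contains dashes and digits, so splitting from the left would cut it short.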
Note that the jpackage repo offered 2 RPMs that satisfy the
dependency jaxp_parser_impl:
classpathx-jaxp
crimson
Which is the correct one?
cheers,
denice
> This topic is on the agenda at the next EGEE Ops meeting. A decision
> must be made soon about whether DAG is acceptable as the services we are
> preparing for SL4 frequently have deps which could be satisfied there.
>
> Oliver.
>
> Oliver Keeble Information Technology Department
> CERN
> +41 22 76 72360 CH-1211 Geneva 23
>
>
> Andreas Haupt wrote:
>> Hi Gordon,
>>
>> On Tue, 2007-07-31 at 23:46 +0100, Gordon, JC (John) wrote:
>>> Andreas, I would like to hear your ideas for how this process can be
>>> improved. Let me explain what has happened so far.
>>>
>>> Experiment readiness: At the June and July GDBs and more than one MB the
>>> LHC VOs were asked if they were ready for SL4. They all replied that
>>> they were. CMS wanted it right away and the others could use either SL3
>>> or SL4 but wanted to stop verifying their software on SL3. We saw no
>>> reason not to believe them. What are these major problems that the
>>> others now see?
>>
>> Their software does not really seem to be ready. Some of it doesn't
>> compile without hooks (e.g. Atlas' user analyses, and Atlas is not really
>> 64-bit ready either), whereas other things work. Another comment from
>> LHCb was that their software still has problems with SL4 - sorry, I don't
>> know the exact reasons. I don't understand why the experiments tell the
>> MB everything is fine while we site admins get a different answer when
>> we ask them.
>>
>>> Or are you talking about non-LHC experiments? If so then
>>> it is up to a site to weigh the balance of pressure from experiments and
>>> decide which opsys to support. At my site we will run SL3 and SL4 and
>>> move the resources between the two in response to experiment
>>> requirements (in one direction only).
>>
>> That's what I want to do as well. Normally I'd just install
>> SL4/64 as it should give the best performance. But I don't want to set up
>> lots of expensive nodes that most of the users cannot use. We are
>> providing the service for the experiments, not in order to pass the SFT,
>> which is how it sounded in Steve's answer to the question of why
>> lcg-infosites is missing.
>>
>>> Site Readiness: after the July GDB a number of sites tried installing
>>> the WN middleware on top of SL4. Experiments tested their code at these
>>> sites and no-one reported any problems. This was reported at the weekly
>>> operations meeting on 9 and 16 July. Again, no-one raised any
>>> objections.
>>
>> Installing the gLite middleware on SL4 is not the problem. This does
>> work. But as I mentioned in my previous mail, I think it's not good
>> *how* it is done. Why do I need a castor client at my site? What will the
>> external repo servers (DAG and JPackage) say about the rush of thousands
>> of WNs doing their daily update? Or shall we mirror them locally?
>> What happens if one of the external packages in those repositories gets
>> updated and breaks dependencies with gLite? We won't be able to install
>> or update nodes any more until some new gLite version has been released
>> that can cope with it.
>>
>> Sorry, we are also caught in daily work and cannot raise all issues we
>> find at once. I don't understand why nobody in the deployment group
>> shares my concerns.
>>
>>> LHC experiments will be running large scale tests this autumn. This will
>>> not be a good time for widespread changes, hence the push for SL4 now.
>>> How would you have ensured that things go smoothly? Test at every site?
>>> Test for ever? Test at your site? It would be good if there were
>>> absolutely certain comprehensive criteria against which to test
>>> successive releases but these never appear. Realistically, the
>>> experiments don't have the effort to write their main codes, never mind
>>> keep tests up to date.
>>
>> You're right in what you say. But the main problem is not testing in
>> this case. It is how the middleware is provided. It was ok for gLite3
>> under SL3. One repository to mirror and then simply install and update
>> from it (although the quality assurance still has problems). But there
>> isn't any page I know about that states which rpms *must* be installed
>> on a worker node - a standard configuration. This must of course be
>> provided for any supported platform: SL3/32, SL4/32, SL4/64. Experiments
>> might then request additional software taking the standard configuration
>> as a basis. This standard configuration must (IMHO) be installable by just
>> using the SL4 (not SLC!) repo and the gLite repo - no third-party ones
>> that we have no influence on. That's essential for providing a stable
>> service, isn't it?
>>
>> Cheers,
>> Andreas
>>
>
--
deatrich @ triumf.ca, Science/Atlas PH: +1 604-222-7665
<*> This moment's fortune cookie:
Never put off till tomorrow what you can avoid all together.