Hi Yves,

Sorry for bringing your name into this discussion, but thank you for responding with all the details. I think this has provoked a unified response from the UK and something may get done about it!

Cheers
Pete

----------------------------------------------------------------------
Peter Gronbech
Unix Systems Manager and                Tel No. : 01865 273389
SouthGrid Technical Co-ordinator        Fax No. : 01865 273418

Department of Particle Physics, University of Oxford,
Keble Road, Oxford OX1 3RH, UK
----------------------------------------------------------------------

-----Original Message-----
From: Testbed Support for GridPP member institutes On Behalf Of Yves Coppens
Sent: 19 May 2007 11:02
Subject: Update 24

Hello,

From my experience of updates 23 and 24, I do not recommend that anyone upgrade their site until all the current problems have been resolved. In particular, do not upgrade your DPM! However, do install the latest lcg-voms.cern.ch!

If you do wish to upgrade some components, below is my experience of what works and what does not, and some of the changes to be aware of.

The new VO style support in yaim seems to work fine. Yaim also provides support for software manager and production pool accounts (e.g. atlassgm01, atlassgm02, ...). Unfortunately, this new feature either does not work or is fiddly, so you should keep your old users.conf and group.conf files.

Depending on which version of gLite you upgrade from, you'll need to apply some of the following changes to your site-info.def:

Add and set the SITE_SUPPORT_EMAIL variable.

There is a new way to define queues, so you will need to add something along these lines:

    ALICE_GROUP_ENABLE="alice"
    ATLAS_GROUP_ENABLE="atlas"
    ...
    SHORT_GROUP_ENABLE="atlas alice babar biomed cms dteam hone ilc lhcb ngs ops zeus calice"

if you have a queue assigned to each VO and a short queue for all VOs.
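For sites with many VOs, the per-VO lines above could be generated rather than typed by hand. The following is only a rough sketch, not something from the release notes: the function name gen_group_enable is made up for illustration, and it assumes every queue is named after its VO and that the short queue accepts all VOs.

```shell
#!/bin/sh
# Hypothetical helper (not part of yaim): print one <VO>_GROUP_ENABLE line
# per VO given on the command line, then a SHORT_GROUP_ENABLE line that
# lists every VO. Assumes one queue per VO, named after the VO.
gen_group_enable() {
    for vo in "$@"; do
        # The variable name uses the upper-cased queue (here: VO) name.
        upper=$(printf '%s' "$vo" | tr '[:lower:]' '[:upper:]')
        printf '%s_GROUP_ENABLE="%s"\n' "$upper" "$vo"
    done
    # "$*" joins all arguments with single spaces (default IFS).
    printf 'SHORT_GROUP_ENABLE="%s"\n' "$*"
}

# Example call with two VOs; review the output before pasting it
# into site-info.def.
gen_group_enable alice atlas
```

The output is plain text, so it is safer to inspect it and paste it into site-info.def by hand than to append it automatically.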

The variables VO_${VO}_QUEUES are not used any more - keeping them will not crash yaim (as expected).

There is a new, harmless YAIM_LOGGING_LEVEL variable. I found that setting it to NONE was equivalent to setting it to WARNING.

If you read the release notes carefully, you'll find that there is a new way to run yaim. One can now configure a CE as follows:

    /opt/glite/yaim/bin/yaim -c -s /root/yaim-conf/site-info.def -n CE

Once the configuration is over, yaim will wait for a CTRL-C, so you'd better not use that on workers.

Shortly after running yaim on my CE as above, my gatekeeper ended up in a locked state:

    $ service globus-gatekeeper status
    edg-gatekeeper dead but subsys locked
    $

The gatekeeper ended up in the same state after I reran yaim. I have no idea what caused this and I had never encountered this problem before, so it is something to watch out for and to investigate.

There is an ugly bug (or rather design flaw) in config_sw_dir. When yaim runs on workers, it will try to do a recursive chmod on your software area. In my case, this resulted in permission denied errors for every single file in my 50+ GB /software directory :( Now, imagine if big sites start this automatically on all workers! For more on this, see the "NOTICE - VOBOX vs. VO software area ownership" in ROLLOUT. I really think an EGEE broadcast should have been issued and not simply an email to ROLLOUT.

You can upgrade your VOBOX provided you have the 3.0.1-15 version of yaim (from release 24 - release 23 crashed) and you take the precautions mentioned in Maarten's email.

Once you do upgrade DPM, do not forget to change the port in BDII_SE_URL. The new DPM uses a BDII rather than a GRIS.

I'm aware there is a middleware certification process, some testing at CERN and (unfortunately) limited testing on the PPS, but when I see the kind of horrors that reach production sites, I'm led to say that the certification and testing are totally ineffective, highly superficial or even nonexistent. I am partly to blame for the lack of testing on the PPS, but we have very little time to perform any tests at all! There are plans to do more testing in the PPS, but I still wish we had more than one day for this, particularly when major new functionality is introduced, and that more time passed before releases go from the PPS to production.

Yves