Dear All

Many thanks to all the system administrators whose hard work has made
the LCG2 2.4.0 upgrade our most successful. It is certainly reassuring
to see that we are making good progress in deployment and operations
despite a number of outstanding problems. The status map for the last
week has been quite healthy. The number of deployed resources has also
grown well over the last few weeks, so the UKI region is delivering
above expectations for this point in the EGEE project, which is an
excellent achievement.

Moving forward there are a number of areas we need to tackle. For those
of you who attended the HEPSYSMAN meeting at RAL two weeks ago
(http://hepwww.rl.ac.uk/sysman/april2005/agenda.html) you will know that
we are addressing issues around the UKI ticketing system. There will be
a brief update on progress at tomorrow's UKI monthly operations meeting
(TB-SUPPORT) (http://agenda.cern.ch/fullAgenda.php?ida=a052246). We also
need to increase usage of our resources by closer monitoring of
experiment software installations and problems, as well as exploring the
support of wider VOs. With more usage we need to be sure of capturing
it via the accounting system, but as you can see here
http://goc.grid-support.ac.uk/gridsite/accounting/tree/country_view.php?Path=1.27
not all sites are publishing yet (another topic for tomorrow; also see
the note below). It would be appreciated if ALL SITES (where
their batch system is supported by APEL) could be publishing data before
the next LCG Operations Workshop later this month (details here:
http://infnforge.cnaf.infn.it/cdsagenda//fullAgenda.php?ida=a0517). Many
thanks.

Finally, and as the second part of the mail subject suggests, I would
like to bring a recent LCG-ROLLOUT mail to the attention of everyone
concerned. There are still a number of sites that have R-GMA problems.
Please could you be sure to raise any particular problems you need help
with at tomorrow's meeting.

Thanks once again to everyone for working hard to make UKI ROC and
GridPP deployment and operations a success. 
 
Kind regards,
Jeremy


-----Original Message-----
From: Laurence [mailto:[log in to unmask]] 
Sent: 10 May 2005 09:42
To: LHC Computer Grid - Rollout; [log in to unmask]
Subject: Re: [LCG-ROLLOUT] R-GMA Status


Here is today's R-GMA status.
138 sites: 78 have R-GMA working, 29 have R-GMA failures, 16 have job
submission problems and 15 are in scheduled downtime.

On request I have added more information to make the percentages more
understandable. Next to each region is the number of sites in the
region as well as the percentage. Job submission problems are also
shown, as we can't test R-GMA if we can't submit jobs. For scheduled
downtime I have looked for the last time that the tests worked for that
site. If the tests last worked over 1 week ago I also class this as a
failure. It is interesting to note that some sites have never passed
the SFTs.


UKI  20 Sites, 75%
HP-Bristol (JS)
QMUL-eScience (10)
LivHEP-LCG2 (SD) Never worked since 08/12/2004
RHUL-LCG2 (1)
ManHEP-LCG2 (SD) Last worked 28/04/2005

(JS) Job submission problem
(JL) Job list match problem
(SD) Scheduled Down
(1)Can connect to Tomcat
Restart Tomcat and look in the log file.
rm -f /var/tomcat4/logs/catalina.out
/etc/rc.d/init.d/tomcat4 restart
tail -f /var/tomcat4/logs/catalina.out

(2)Consumer error on the test.
(3)No C++ compiler found
(4)Does not have a MON box, Java problem on WNs
(5)Memory problem, try a reboot.
(6)Upgrade problem
http://goc.grid.sinica.edu.tw/gocwiki/Consumer__start_Timeinterval%3aHTML_returned_instead_of_XML
(7)Schema Access Problem
(8)Firewall problem
(9)Site not authorized in Registry, contact Steve Traylen.
(10) RGMA_HOME is not set
(11) R-GMA not installed
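
For anyone wanting to check their region's figures, the counting rule
described above can be sketched roughly as follows. This is only an
illustration, not the script used to produce the report: the function
names classify_sd_site and working_percent are hypothetical, treating
"never worked" as a failure is an assumption, and the 75% for UKI is
reproduced on the assumption that all five listed sites count as not
working.

```shell
#!/bin/sh
# Hypothetical helper: classify a site that is in scheduled downtime (SD).
# $1 = whole days since the site last passed the tests, or "never".
classify_sd_site() {
    case "$1" in
        never)
            # Assumption: a site that has never passed counts as a failure.
            echo "failure"
            ;;
        *)
            # Rule from the mail: last worked over 1 week ago => failure.
            if [ "$1" -gt 7 ]; then
                echo "failure"
            else
                echo "downtime"
            fi
            ;;
    esac
}

# Hypothetical helper: percentage of sites in a region with R-GMA working.
# $1 = total sites in the region, $2 = sites listed with a problem.
# Uses integer arithmetic, so the result is truncated, not rounded.
working_percent() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

# UKI: 20 sites, 5 listed with a problem.
working_percent 20 5    # prints 75
```

As an example, ManHEP-LCG2 last worked on 28/04/2005, which is 12 days
before this report, so classify_sd_site 12 gives "failure".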