Hi,
Last week (after a bit of prompting) I hacked together a stop-gap
monitoring map, which checks accessibility through the IC and EDG RBs, as
well as via plain Globus in the way Gavin's does already. This is just
intended as an interim measure, and gives us time to sort out a better
solution whilst still having something to use in the meantime.
The map itself is at http://www.gridpp.ac.uk/map/ (I'm still adding
the names for each site and doing some tidying up, and not all sites'
details are correct yet - see below.)
The notes at http://www.gridpp.ac.uk/map/notes.html explain how it
works, and I've pasted them on to the end of this email. However, to
get things going I need two things from each site:
* Your current preferred Globus gatekeeper hostname (if you're running
the EDG software, this is your CE).
* Your choice of site "label" (eg RAL-PRO) if it's different to the one
I've got in the table below the map. (Sites that didn't already have
a label currently have their .ac.uk 3rd-level domain name.)
It is also time to mail ukhepgrid with an encouraging announcement about
the Testbed and an invitation to get sites online. I'd like to do that as
soon as 1.4.4 settles down (eg once the CERN RB gets going again -
tomorrow?) and then try and get as many green stars on the new map as we
can in the next 2-3 weeks.
Cheers,
Andrew
GridPP Monitoring notes
The GridPP Monitoring map page is an extension of Gavin's Green Dot Map
system. The new system is a simple way of checking the accessibility of
GridPP sites via Globus, via the GridPP Resource Broker at Imperial and
via the EDG RB at CERN.
Jobs are submitted via the three alternative routes and, if they execute
successfully, call back to the GridPP webserver via HTTP with a job ID
number. A script on the website periodically rebuilds the map and the
table below it according to the time of each site's last successful
callback.
On the map, the colour and shape of each site's marker indicates its
status:
* Black dot - no responses from site.
* Red dot - no responses within timeout period (1 hr).
* Amber dot - Globus responses within timeout, but none via Resource
Brokers.
* Green dot - Response via GridPP RB within timeout, but none via EDG.
* Green star - Response via EDG RB within timeout; GridPP RB and
Globus status ignored.
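
As a rough illustration of the marker rules above, here is a minimal
Python sketch. It is not the script that builds the map: the function
name, its arguments and the idea of keeping one "last successful
callback" time per route are assumptions made for the example. Only the
1 hr timeout and the order of precedence come from the list above.

```python
from datetime import datetime, timedelta
from typing import Optional

# Timeout period taken from the notes (1 hr).
TIMEOUT = timedelta(hours=1)


def site_marker(now: datetime,
                globus: Optional[datetime],
                gridpp_rb: Optional[datetime],
                edg_rb: Optional[datetime]) -> str:
    """Pick a marker from the last successful callback time per route.

    Each argument after `now` is the time of the last callback received
    via that route, or None if the site has never responded that way.
    (Hypothetical interface, for illustration only.)
    """
    def fresh(last: Optional[datetime]) -> bool:
        # A response counts only if it arrived within the timeout.
        return last is not None and now - last <= TIMEOUT

    if fresh(edg_rb):
        return "green star"   # EDG RB wins; other routes are ignored
    if fresh(gridpp_rb):
        return "green dot"    # GridPP RB, but nothing recent via EDG
    if fresh(globus):
        return "amber dot"    # plain Globus only
    if any(t is not None for t in (globus, gridpp_rb, edg_rb)):
        return "red dot"      # has responded before, but not recently
    return "black dot"        # never responded


if __name__ == "__main__":
    now = datetime(2003, 1, 1, 12, 0)
    print(site_marker(now, now - timedelta(minutes=10), None, None))
```

Checking the routes from EDG RB downwards means a fresh EDG response
gives a green star whatever the state of the other two routes.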
Globus job submission is done to a fixed list of Globus Gatekeepers, one
per site. These should be the same machine as the EDG Computing
Element. If you change your CE hostname, please tell us so we can update
the list.
GridPP and EDG RB submission is done using site name Environment labels,
which each site must define for itself in its LCFG site-cfg.h file. If you
are setting up a site, you may use the label in the table, or choose
something else. If you choose your own, please tell us or you won't go green! (If you think
it likely your site may join the Development as well as the Production
testbed - ie that you will run two sets of machines - then you may prefer
to use -PRO as a suffix, eg RAL-PRO.)
The jobs are run using Andrew's UKHEP certificate, so sites need to have
"/O=Grid/O=UKHEP/OU=hep.man.ac.uk/CN=Andrew McNab" in their
grid-mapfile. (I'm in the GridPP testbed, and the EDG iteam and wpsix VOs
so this isn't normally a problem once you have the EDG software
installed.)
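
If you want to check this for your own site, something like the
following Python sketch may help. It assumes the usual grid-mapfile
layout of a double-quoted certificate subject followed by a local
account name on each line; the function name and the account name in
the sample are made up for the example.

```python
import shlex

# Subject of the certificate the monitoring jobs run under (from the notes).
MONITOR_DN = "/O=Grid/O=UKHEP/OU=hep.man.ac.uk/CN=Andrew McNab"


def dn_in_gridmap(gridmap_text: str, dn: str = MONITOR_DN) -> bool:
    """Return True if `dn` appears as the subject on any mapping line.

    Assumes each non-comment line looks like:  "<subject DN>" <account>
    """
    for line in gridmap_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        try:
            fields = shlex.split(line)  # handles the quoted DN with spaces
        except ValueError:
            continue  # unbalanced quotes: ignore the malformed line
        if fields and fields[0] == dn:
            return True
    return False


if __name__ == "__main__":
    # "someaccount" is a placeholder, not a real mapping.
    sample = '"/O=Grid/O=UKHEP/OU=hep.man.ac.uk/CN=Andrew McNab" someaccount\n'
    print(dn_in_gridmap(sample))
```

To run it against a real file, read the file's contents into a string
and pass that in as the first argument.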
To tell us about updates or problems, please mail [log in to unmask]