I was not clear, sorry. What I meant was that, at the time of writing,
the Polish site was not visible. That site has been in LCG since early
last week and has functioned correctly most of the time. Right now
it is visible and accepts jobs. For some reason it is not appearing in
the GridIce monitoring page, though. I'll ask Piera about this when she
is back.
Cheers
Emanuele
"Daniels, T (Trevor)" wrote:
>
> Ian
>
> Emanuele told me of Poland joining only yesterday. New sites have to be added
> manually to the various monitors, with updates to scripts with details like
> the map coordinates if they are to show on the coloured dot maps. There
> should be a better and more reliable way of informing me of anticipated new
> sites than expecting Emanuele to remember to tell me, or for me to
> constantly poll the grid deployment web looking for changes. I can easily
> automate the extraction of info from the LDAP DB, but Poland are not in
> there yet.
>
> The list of sites polled by gppmon is shown in the table below the map.
>
> I'll add Poland today, but Emanuele said it is not visible yet.
>
> Trevor
>
> Dr Trevor Daniels
> c/o CCLRC eSC Department Phone: (+44)|(0) 1235 778093
> Rutherford Appleton Laboratory Fax: (+44)|(0) 1235 446626
> Chilton, DIDCOT, Oxon, OX11 0QX, UK Email: [log in to unmask]
> The contents of this email are sent in confidence for the use of the
> intended recipient only. If you are not one of the intended recipients do
> not take action on it or show it to anyone else, but return this email to
> the sender and delete your copy of it.
>
> > -----Original Message-----
> > From: Ian Bird [mailto:[log in to unmask]]
> > Sent: Tuesday, September 30, 2003 9:19 AM
> > To: [log in to unmask]
> > Subject: Re: [LCG-ROLLOUT] GOC Monitoring
> >
> >
> > Trevor,
> >
> > You are missing Poland (Cracow) from gppmon. I assume that at the
> > moment the list of sites is hard-wired somewhere. Is there a way to
> > get the list from the monitoring system?
> >
> > Ian
> >
> > > -----Original Message-----
> > > From: Daniels, T (Trevor) [mailto:[log in to unmask]]
> > > Sent: 30 September 2003 10:14
> > > To: [log in to unmask]
> > > Subject: [LCG-ROLLOUT] GOC Monitoring
> > >
> > >
> > > The gppmon map showing the results of submitting jobs to
> > > various sites via globus and via the CERN RB was unreliable
> > > at the end of last week and over the weekend, but should now
> > > be working reliably again. The problems were at this end,
> > > due to my having to move the scripts from an EDG UI to an LCG
> > > one, and my decision to re-write some of them at the same
> > > time. Apologies for the misleading information on the
> > > website during this time.
> > >
> > > This morning all monitored sites appear to be working correctly
> > > with two exceptions:
> > >
> > > The long-standing problem at RAL with all jobs still failing with:
> > >
> > > Current Status: Done (Cancelled)
> > > Exit code: 0
> > > Status Reason: Cannot read JobWrapper output, both from Condor and from Maradona.
> > > Destination: lcgce01.gridpp.rl.ac.uk:2119/jobmanager-lcgpbs-short
> > >
> > > A difficulty with submitting jobs to BNL:
> > >
> > > Globus jobs fail with:
> > >
> > > GRAM Job submission failed because the connection to the
> > > server failed (check host and port) (error code 12)
> > >
> > > and jobs via the CERN RB fail with the following, which is
> > > probably due to a job queue mismatch:
> > >
> > > Current Status: Aborted
> > > Status Reason: Cannot plan (a helper failed)
> > > reached on: Tue Sep 30 07:05:17 2003
> > >
> > > Over the last 24 hours most CEs have been responding
> > > reliably to authentication requests with 5 exceptions:
> > >
> > > grid109.kfki.hu had 6 authentication failures
> > > lhc01.sinp.msu.ru had a number of ping failures during the working day
> > > adc0015.cern.ch had 6 authentication failures (at different times to kfki)
> > > adc0018.cern.ch had 1 authentication failure
> > > hotdog46.fnal.gov was not responding until 2003-09-29 16:40 UTC but was fine thereafter
> > >
> > > Over the same time the RBs have been even more reliable with
> > > just grid111.kfki.hu failing twice to respond to a
> > > job-list-match request, and an occasional comms problem to
> > > lxshare0381.cern.ch (1), lhc20.sinp.msu.ru (2), and
> > > lxshare0380.cern.ch (1).
> > >
> > > I can provide details if sites wish to follow any of these up.
> > >
> > > Trevor
> > >
> >
--
/------------------- Emanuele Leonardi -------------------\
| eMail: [log in to unmask] - Tel.: +41-22-7674066 |
| IT division - Bat.31 2-012 - CERN - CH-1211 Geneva 23 |
\---------------------------------------------------------/