Hi,
this rang a bell for me so I looked up a few things on my farm.
UKI-LT2-RHUL has a lot of lhcb jobs running as sgm, but not all of them. I
have seen dozens running at once in the last few weeks/months.
Right now we have 70 queued or running, of which 24 are lhcbsgm.
It always seems to be "/C=ES/O=DATAGRID-ES/O=UB/CN=Ricardo Graciani" who
is mapped to lhcbsgm at the moment. Earlier in the month it was Joel
Closier.
An excerpt from the CE's gridmapfile:
"/VO=lhcb/GROUP=/lhcb/ROLE=lcgadmin" lhcbsgm
"/VO=lhcb/GROUP=/lhcb/ROLE=production" lhcbprd
"/VO=lhcb/GROUP=/lhcb" .lhcb
I hope this helps. It does seem different from other VOs.
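
For illustration only, the gridmapfile excerpt above can be read with a minimal Python sketch like the one below. It assumes exact string matching of the VOMS attribute against each entry, with the first matching line winning; the real mapping software on a CE may use wildcards or other rules. The function names `parse_gridmap` and `map_fqan` are hypothetical and not part of any grid middleware. A leading dot, as in `.lhcb`, conventionally denotes a pool of accounts rather than a single account.

```python
import shlex

# The three lines quoted from the CE's gridmapfile in this message.
GRIDMAP_EXCERPT = '''\
"/VO=lhcb/GROUP=/lhcb/ROLE=lcgadmin" lhcbsgm
"/VO=lhcb/GROUP=/lhcb/ROLE=production" lhcbprd
"/VO=lhcb/GROUP=/lhcb" .lhcb
'''


def parse_gridmap(text):
    """Return a list of (pattern, account) pairs, in file order."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # shlex.split handles the double-quoted first field.
        pattern, account = shlex.split(line)
        entries.append((pattern, account))
    return entries


def map_fqan(fqan, entries):
    """Return the account of the first entry equal to fqan, else None.

    Assumption: exact match, first match wins. Real middleware may differ.
    """
    for pattern, account in entries:
        if fqan == pattern:
            return account
    return None


if __name__ == "__main__":
    entries = parse_gridmap(GRIDMAP_EXCERPT)
    for fqan in (
        "/VO=lhcb/GROUP=/lhcb/ROLE=lcgadmin",
        "/VO=lhcb/GROUP=/lhcb/ROLE=production",
        "/VO=lhcb/GROUP=/lhcb",
    ):
        print(fqan, "->", map_fqan(fqan, entries))
```

Under these assumptions, only a proxy carrying the lcgadmin role would land on lhcbsgm, so production jobs running as lhcbsgm would suggest they are being submitted with that role.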
Cheers,
Simon
---------------------------------------------------------------------------
Simon George, Dept of Physics, Royal Holloway college, University of London
Email [log in to unmask] Tel. +44 1784 41 41 85 Fax. +44 1784 472794
On Tue, 23 May 2006, Gordon, JC (John) wrote:
> Olivier, I am sitting next to Nick Brook and he says that lhcb
> production jobs should not run as sgm. Is this happening at other sites?
>
> Can you tell me the DN of the user being mapped to sgm, if that doesn't
> break your data security policy:-) Nick thinks the gridmapfile
> generation may not be correct.
>
> John
>
> > -----Original Message-----
> > From: Testbed Support for GridPP member institutes
> > [mailto:[log in to unmask]] On Behalf Of Olivier van der Aa
> > Sent: 23 May 2006 15:49
> > To: [log in to unmask]
> > Subject: shared experiment area load
> >
> > Dear All,
> >
> > At QMUL we have a load problem with the experimental shared area.
> > The farm is running around 900 jobs and the nfs server serving the
> > experimental area is overloaded.
> >
> > The result of that is that lhcb jobs sit for a long time on the wn
> > waiting for data (mainly libraries).
> >
> > We would like to know how this is solved at RAL and Manchester, where
> > the size is similar. We were thinking of setting up a set of pbs slots
> > for the sgm to have rw access. The other nodes would just have a copy
> > on the local disk or access through several nfs servers.
> >
> > I think the problem with the small set of wn having rw access is that
> > lhcb is sending a lot of jobs via one user who is sgm. Most of those
> > jobs do not write to the experimental software area but they would
> > stack up waiting for the wn to be freed.
> >
> > We are keen to have your experience on that topic.
> >
> > Cheers, Olivier.
> >
> > --
> > - O. van der Aa - Imperial College London -
> > - LT2 Technical Coordinator -
> > - tel: +442075947810, +442071005426 -
> > - SIP: [log in to unmask] -
> > - fax: +442078238830 -
> > - http://surl.se/agtu -
> >
>