On Tue, May 23, 2006 at 05:02:21PM +0100 or thereabouts, Simon George wrote:
> Hi,
>
> this rang a bell for me so I looked up a few things on my farm.
>
> UKI-LT2-RHUL has a lot of lhcb jobs running as sgm, but not all of them. I
> have seen dozens running at once in the last few weeks/months.
> Right now we have 70 queued or running, of which 24 are lhcbsgm.
This should reduce soon: at least two tier1s have recently started
strictly limiting sgm users to running exactly one job at a time.
This enables them to run at the front of their VO's jobs. It is something
we would support as well.
>
> It always seems to be "/C=ES/O=DATAGRID-ES/O=UB/CN=Ricardo Graciani" who
> is mapped to lhcbsgm at the moment. Earlier in the month it was Joel
> Closier.
>
> An excerpt from the CE's gridmapfile:
> "/VO=lhcb/GROUP=/lhcb/ROLE=lcgadmin" lhcbsgm
> "/VO=lhcb/GROUP=/lhcb/ROLE=production" lhcbprd
> "/VO=lhcb/GROUP=/lhcb" .lhcb
>
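As a rough illustration of how the excerpt above can be checked, here is a
minimal Python sketch that parses those three lines and reports which local
account each role maps to. The helper names (parse_gridmap,
accounts_for_role) are hypothetical, the suffix match on "/ROLE=" is a
simplification, and this is not how the CE itself resolves mappings. A
leading "." on the account is assumed to denote a pool of accounts.

```python
import shlex

# The three lines quoted from the CE's gridmapfile above.
GRIDMAP_EXCERPT = '''\
"/VO=lhcb/GROUP=/lhcb/ROLE=lcgadmin" lhcbsgm
"/VO=lhcb/GROUP=/lhcb/ROLE=production" lhcbprd
"/VO=lhcb/GROUP=/lhcb" .lhcb
'''


def parse_gridmap(text):
    """Return a list of (subject, account) pairs, in file order."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # shlex handles the double-quoted subject, which may contain spaces
        subject, account = shlex.split(line)
        entries.append((subject, account))
    return entries


def accounts_for_role(entries, role):
    """Accounts whose subject ends in /ROLE=<role> (simplified matching)."""
    suffix = "/ROLE=" + role
    return [account for subject, account in entries if subject.endswith(suffix)]


if __name__ == "__main__":
    entries = parse_gridmap(GRIDMAP_EXCERPT)
    for role in ("lcgadmin", "production"):
        print(role, "->", accounts_for_role(entries, role))
```

On this excerpt the production role maps to lhcbprd and only lcgadmin maps
to lhcbsgm, so the mapping rules themselves look as intended. That would
suggest the jobs seen as lhcbsgm were submitted with the lcgadmin role.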
> I hope this helps. It does seem different from other VOs.
>
> Cheers,
> Simon
>
> ---------------------------------------------------------------------------
> Simon George, Dept of Physics, Royal Holloway college, University of London
> Email [log in to unmask] Tel. +44 1784 41 41 85 Fax. +44 1784 472794
>
> On Tue, 23 May 2006, Gordon, JC (John) wrote:
>
> > Olivier, I am sitting next to Nick Brook and he says that lhcb
> > production jobs should not run as sgm. Is this happening at other sites?
> >
> > Can you tell me the DN of the user being mapped to sgm, if that doesn't
> > break your data security policy:-) Nick thinks the gridmapfile
> > generation may not be correct.
> >
> > John
> >
> > > -----Original Message-----
> > > From: Testbed Support for GridPP member institutes
> > > [mailto:[log in to unmask]] On Behalf Of Olivier van der Aa
> > > Sent: 23 May 2006 15:49
> > > To: [log in to unmask]
> > > Subject: shared experiment area load
> > >
> > > Dear All,
> > >
> > > At QMUL we have a load problem with the experimental shared area.
> > > The farm is running around 900 jobs and the nfs server serving the
> > > experimental area is overloaded.
> > >
> > > The result of that is that lhcb jobs sit for a long time on the wn
> > > waiting for data (mainly libraries).
> > >
> > > We would like to know how this is solved at RAL and Manchester, where
> > > the size is similar. We were thinking of setting up a set of pbs slots
> > > for the sgm to have rw access. The other nodes would just have a copy
> > > on the local disk or access through several nfs servers.
> > >
> > > I think the problem with the small set of wn having rw access is that
> > > lhcb is sending a lot of jobs via one user who is sgm. Most of those
> > > jobs do not write to the experimental software area but they would
> > > stack to wait for the wn to be freed.
> > >
> > > We are keen to have your experience on that topic.
> > >
> > > Cheers, Olivier.
> > >
> > > --
> > > - O. van der Aa - Imperial College London -
> > > - LT2 Technical Coordinator -
> > > - tel: +442075947810, +442071005426 -
> > > - SIP: [log in to unmask] -
> > > - fax: +442078238830 -
> > > - http://surl.se/agtu -
> > >
> >
--
Steve Traylen
[log in to unmask]
http://www.gridpp.ac.uk/