On Fri, Aug 11, 2006 at 01:55:35PM +0100 or thereabouts, Steve Thorn wrote:
> > -----Original Message-----
> > From: Testbed Support for GridPP member institutes
> > [mailto:[log in to unmask]] On Behalf Of Steve Traylen
> > Sent: 11 August 2006 13:35
> > To: [log in to unmask]
> > Subject: Re: shared experiment area load
> >
> > On Fri, Aug 11, 2006 at 11:52:05AM +0100 or thereabouts,
> > Steve Thorn wrote:
> > > Duncan
> > >
> > > I've no idea what LHCb are doing. They sent 4000 jobs to Edinburgh and
> > > crashed our CE. They are also sending a number of fork jobs that are
> > > using at least 50% of the CPU.
> >
> > I doubt that "they" are sending fork jobs; the RB itself sends
> > fork jobs as part of normal operation, one per user per RB.
> > >
> > > Steve
> > >
> We were seeing many fork jobs using considerable CPU, which seems a bit
> unusual. It's back to normal now, with CPU down by 50% or so. I thought
> the LHCb 'pilot job' mechanism bypasses the RB anyway.
No, they don't, and nor do the Alice ones for that matter; they both
go through the RB. They are fantastic RB killers though, which is what we are
currently fighting with....
You can always check by either:
1) issuing a qstat -f on a job, where you will see that EDG_JOB_ID is set.
2) With gLite 3.X the globus-gatekeeper logs now contain the EDG_JOB_ID if
there is one. (This is a vast improvement; you used to have to go and ask your
friendly RB admin for the initial EDG_JOB_ID->GATEKEEPER_JM_ID mapping.)
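
For check 1), a rough sketch of pulling the EDG_JOB_ID out of saved
qstat -f output could look like the following. It assumes the variable shows
up as an EDG_JOB_ID=<value> entry in a comma-separated variable list and that
long values are wrapped onto continuation lines starting with a tab; both are
assumptions about the batch system output, and the host names, job ID and
function names are made up for illustration.

```python
import re


def unfold(qstat_text):
    # Assumption: qstat -f wraps long attribute values onto continuation
    # lines that begin with a tab. Join them back into a single line.
    return re.sub(r"\n\t", "", qstat_text)


def find_edg_job_id(qstat_text):
    # Return the value of EDG_JOB_ID if the job carries one, else None.
    # The value is taken to run up to the next comma or whitespace.
    match = re.search(r"EDG_JOB_ID=([^,\s]+)", unfold(qstat_text))
    return match.group(1) if match else None


# Hypothetical sample output; every name and ID here is invented.
sample = (
    "Job Id: 1234.ce.example.org\n"
    "    Job_Owner = lhcbsgm@ce.example.org\n"
    "    Variable_List = PBS_O_HOME=/home/lhcbsgm,PBS_O_LANG=C,\n"
    "\tEDG_JOB_ID=https://rb.example.org:9000/abcDEF123,\n"
    "\tPBS_O_QUEUE=lhcb\n"
)

if __name__ == "__main__":
    print(find_edg_job_id(sample))
```

A job that came through the RB should yield an ID; a job with no
EDG_JOB_ID in its environment yields None.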
Steve
>
> Steve
>
> > > > -----Original Message-----
> > > > From: Testbed Support for GridPP member institutes
> > > > [mailto:[log in to unmask]] On Behalf Of Duncan Rand
> > > > Sent: 11 August 2006 11:28
> > > > To: [log in to unmask]
> > > > Subject: Re: shared experiment area load
> > > >
> > > > We are still getting multiple lhcbsgm jobs - does anyone know what
> > > > the latest is regarding this issue?
> > > >
> > > > thanks
> > > > Duncan
> > > >
> > > > On Thu, 2006-05-25 at 10:32 +0100, Burke, S (Stephen) wrote:
> > > > > Testbed Support for GridPP member institutes
> > > > > > [mailto:[log in to unmask]] On Behalf Of Gordon, JC
> > > > > > (John)
> > > > > said:
> > > > > > Steve, is this a bug? Or just insufficiently recognised as a
> > > > > > feature?
> > > > >
> > > > > It's really a result of the fact that we seem to be making a very slow
> > > > > transition to VOMS, so both site configurations and user practice are
> > > > > inconsistent. What we want to do is move fully to VOMS, i.e. get rid
> > > > > of LDAP servers, stop having DN mappings in the map file except
> > > > > for special cases, and have all users use voms proxies (without
> > > > > the DN mapping a non-VOMS proxy should be rejected). I'm not
> > > > > entirely sure what still needs to be done to get to that point.
> > > > > Maybe it's something to raise at the GDB? Even there you still need
> > > > > the right things in the map file to get the effect you want - the job
> > > > > priorities working group is looking at that and seems to be making
> > > > > some progress, but we need clear instructions for sites on what they
> > > > > should do.
> > > > >
> > > > > > I can raise a ticket against LHCb asking Ricardo to use a voms proxy
> > > > > > but who else should I report it to? Atlas obviously but what about
> > > > > > the deployment team and JRA1?
> > > > >
> > > > > One short-term option for Steve and other sites would be to change the
> > > > > map file generation so the DN only ever gets mapped to a
> > > > > non-privileged pool account, then if an sgm forgets to use VOMS
> > > > > the job will fail - a good way to train them :) Otherwise I think
> > > > > the software is mostly in place to use VOMS, it's mainly a
> > > > > deployment issue. On the middleware side I think the main lack is
> > > > > documentation; VOMS and its associated components (e.g. LCAS and
> > > > > LCMAPS) are easily the worst-documented piece of middleware.
> > > > >
> > > > > Stephen
> > > > --
> > > > Duncan Rand, School of Engineering and Design, Brunel University,
> > > > Uxbridge, UK
> > > > Email: [log in to unmask] Tel. +44 1895 266804
> > > >
> >
> > --
> > Steve Traylen
> > work email: [log in to unmask]
> > personal email: [log in to unmask]
> > jabber: xmpp:[log in to unmask]
> >
--
Steve Traylen
work email: [log in to unmask]
personal email: [log in to unmask]
jabber: xmpp:[log in to unmask]