Hi,
Can someone do a quick permissions check on their Torque tmpdir?
Our old nodes had:
drwxr-xr-x 3 root root 4096 Jun 6 10:48 /pool
but this gives a TMakeTmpDir error on the new Torque. We're a tad puzzled
here.
It works with 777 permissions, but that's not really an option ;-)
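
For anyone checking, something like the following should do. It is only a
sketch: it assumes GNU stat (coreutils) on the WN and that the tmpdir is /pool
as in our mom config. The function name show_dir_perms is made up here; pass
your own path as the first argument if yours differs.

```shell
# Print octal mode, owner:group and path of a directory.
# Assumes GNU coreutils stat (the -c format option).
show_dir_perms() {
    stat -c '%a %U:%G %n' "$1"
}

# /pool is the $tmpdir from our mom config; override with the first argument.
dir="${1:-/pool}"
if [ -d "$dir" ]; then
    ls -ld "$dir"
    show_dir_perms "$dir"
else
    echo "no such directory: $dir" >&2
fi
```

On our old nodes this reports mode 755 and root:root for /pool.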
Peter
Peter Love wrote:
> Thanks for this. We installed these packages but have a problem on the
> WNs with Torque's transient directory.
>
> pbs_mom;Svr;pbs_mom;Permission denied (13) in TMakeTmpDir, Unable to make job transient directory: /pool/322326.fal-pygrid-18.lancs.ac.uk
>
> Permissions on /pool are the same as previous torque version, besides
> pbs_mom is owned by root anyway. mom config looks like this:
>
> [root@node027 root]# cat /var/spool/pbs/mom_priv/config
> $clienthost ce.lancs.pygrid
> $clienthost localhost
> $restricted ce.lancs.pygrid
> $logevent 255
> $ideal_load 1.6
> $max_load 2.1
> $tmpdir /pool
>
> Any ideas?
>
> Peter
>
>
> Alessandra Forti wrote:
> > Hi Peter,
> >
> > how's the upgrade going? Mine went more or less smoothly apart from a few
> > dependency problems.
> >
> > BTW I think I fixed the peaks with the upgrade to the latest torque and
> > maui and adding the parameters suggested on cluster resources. The
> > problems I still got last week were due to some WNs going down and not
> > being marked offline (I think I'll write a cron job to avoid this in the future).
> > I've also compared notes with Steve Traylen who has actually done the
> > same. Anyway for now the cluster is loaded with atlas jobs and there
> > hasn't been a peak in 12 hours and I don't have any caching mechanism. I
> > might consider using one in the future when I increase the number of nodes
> > though.
> >
> > These are the rpms; you can find them in Steve's area:
> >
> > http://hepunx.rl.ac.uk/~traylens/rpms/
> >
> > [root@ce01 root]# rpm -qa |grep torque
> > lcg-CE_torque-3.0.1-0
> > torque-clients-2.0.0p7-1.sl3.st
> > torque-server-2.0.0p7-1.sl3.st
> > torque-devel-2.0.0p7-1.sl3.st
> > torque-resmom-2.0.0p7-1.sl3.st
> > torque-2.0.0p7-1.sl3.st
> >
> > [root@ce01 root]# rpm -qa |grep maui
> > maui-server-3.2.6p14-3_SL30X_ratio01
> > maui-3.2.6p14-3_SL30X_ratio01
> > maui-client-3.2.6p14-3_SL30X_ratio01
> >
> > I've attached my new pbs configuration for reference.
> >
> > cheers
> > alessandra
> >
> > Peter Love wrote:
> > >Thanks for this.
> > >
> > >The motivation to move to tarballs is to avoid the problem of an offline
> > >WN missing some middleware upgrades. APT can handle this but I think
> > >it'll be easier to extract a fresh tarball rather than deal with multiple rpms.
> > >Consider this an experiment...
> > >
> > >Peter
> > >
> > >
> > >BTW, no spikes for a few days.
> > >
> > >
> > >
> > >Alessandra Forti wrote:
> > >>Hi Peter,
> > >>
> > >>I'm upgrading the rpms and rerunning the config scripts. This avoids
> > >>going offline. In the future I'll reinstall with kickstart only if the
> > >>upgrade is big; this time it doesn't look that big despite the
> > >>rebranding.
> > >>
> > >>Below are the packages in Manchester, beyond 'base'. They are not really
> > >>needed (apart from yum and apt). I install them partly for
> > >>administration and partly for the burning tests which are supposed to
> > >>run every time a machine breaks and is fixed. I've never used tarballs
> > >>so I don't know anything about dependencies in that case. You might have
> > >>to add other rpms besides base that are normally handled by yum or apt.
> > >>I'm not sure why you want to use the tarballs anyway.
> > >>
> > >>@ Editors
> > >>@ Text-based Internet
> > >>@ Administration Tools
> > >>@ System Tools
> > >>@ Development Tools
> > >>@ Kernel Development
> > >>@ OpenAFS Client
> > >>yum
> > >>apt
> > >>curl
> > >>
> > >>+ XFree86-libs XFree86-devel for atlas.
> > >>
> > >>cheers
> > >>alessandra
> > >>
> > >>
> > >>Peter Love wrote:
> > >>>We're doing the WN, MON, LCG-CE.
> > >>>
> > >>>BTW, can someone send their kickstart config, which specifies the base
> > >>>OS package list? This has fallen on deaf ears elsewhere.
> > >>>
> > >>>How are you all installing WNs? Are you kickstarting or cloning or what?
> > >>>
> > >>>Peter
> > >>>
> > >>>Alessandra Forti wrote:
> > >>>>Hi Paul,
> > >>>>
> > >>>>there is no plan because last time I tried to set one up everybody did
> > >>>>what they wanted anyway.
> > >>>>
> > >>>>In general no Tier2 is required to install glite-CE yet, only Tier1s.
> > >>>>This will be the case for a few months until glite-CE is proven to be
> > >>>>stable in production. Tier2s are required to upgrade LCG-CE, the WN and
> > >>>>the MON box.
> > >>>>
> > >>>>In the UK, Manchester and 2 other sites are guinea pigs and will upgrade
> > >>>>this week. The rest of the sites will upgrade during June at their
> > >>>>preferred pace. Since the Tier2 workshop falls in the middle, the latest
> > >>>>date to upgrade is the middle of July.
> > >>>>
> > >>>>In NorthGrid only Lancaster and Manchester are required to upgrade ASAP
> > >>>>because of the ATLAS SC4.
> > >>>>
> > >>>>I'm on TPM shift, i.e. I'm receiving an average of a ticket every 5
> > >>>>minutes, so please be patient if I don't reply to all the emails.
> > >>>>
> > >>>>I've CC'd NorthGrid since it might be of interest to anyone who wasn't
> > >>>>at the last UKI ops meeting.
> > >>>>
> > >>>>cheers
> > >>>>alessandra
> > >>>>
> > >>>>Paul Trepka wrote:
> > >>>>>Hi Alessandra,
> > >>>>>
> > >>>>> Is there a common plan by NorthGrid to upgrade to gLite-CE (3.0.0)?
> > >>>>>Is there a recommendation to do this in a production environment?
> > >>>>>
> > >>>>>Thanks
> > >>>>>
> > >>>>>Cheers
> > >>>>> Paul
> > >>>>>
> > >>>>>
> > >>>>--
> > >>>>*******************************************
> > >>>>* Dr Alessandra Forti *
> > >>>>* Technical Coordinator - NorthGrid Tier2 *
> > >>>>* http://www.hep.man.ac.uk/u/aforti *
> > >>>>*******************************************
> >