After comparing this list to my WNs, I find that many rpms are
missing from the WNs. I have tracked it down to the following
error from updaterpms during the boot sequence:
[INFO] updaterpms: Flagging gpt-2.2.9-2 for installation
[WARNING] updaterpms: installing package aliroot-3.09.06-1-1 needs 55Mb on the / filesystem
[WARNING] updaterpms: updaterpms failed
[OK] updaterpms: started
This apparently prevents all the remaining rpms from installing. If I
comment out the
#include Alice-rpm.h
line in rpmlist/WN-rpm
then all of the proper rpms install, with the exception of the Alice ones
of course. Soooooooo, what's the proper way to take care of this? Am I
just playing with an old set of rpm.h files? (I did check out what I
think are the most recent ones...)
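
For reference, a minimal sketch (Python 3) of how the space warning above
could be picked out of an updaterpms log and set against the free space
on the named filesystem. The regular expression is an assumption based
only on the one warning line quoted above, and the names
find_space_warnings and free_megabytes are made up for illustration;
neither is part of LCFGng or updaterpms.

```python
import re
import shutil

# Pattern for the disk-space warning quoted above. The wording is taken from
# that single log line; other updaterpms versions may phrase it differently.
SPACE_WARNING = re.compile(
    r"installing package (?P<package>\S+) needs (?P<size>\d+)Mb "
    r"on the (?P<mount>\S+) filesystem"
)

SAMPLE_LOG = """\
[INFO] updaterpms: Flagging gpt-2.2.9-2 for installation
[WARNING] updaterpms: installing package aliroot-3.09.06-1-1 needs 55Mb
on the / filesystem
[WARNING] updaterpms: updaterpms failed
"""


def find_space_warnings(log_text):
    """Return (package, megabytes, mount point) for each space warning."""
    # The warning can be wrapped over two lines, so collapse all whitespace
    # into single spaces before matching.
    flat = " ".join(log_text.split())
    return [
        (m.group("package"), int(m.group("size")), m.group("mount"))
        for m in SPACE_WARNING.finditer(flat)
    ]


def free_megabytes(mount):
    """Free space on the filesystem containing `mount`, in whole megabytes."""
    return shutil.disk_usage(mount).free // (1024 * 1024)


if __name__ == "__main__":
    for package, needed, mount in find_space_warnings(SAMPLE_LOG):
        print(f"{package}: needs {needed} Mb on {mount}, "
              f"{free_megabytes(mount)} Mb currently free")
```

If the free figure for / comes out smaller than what the package needs,
that would be consistent with the rest of the list never getting
installed, though the log alone does not say whether the 55Mb is the
total requirement or the shortfall.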
Thanks,
Joe
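
P.S. On the list comparison mentioned at the top: a minimal sketch
(Python 3) of one way to do it, assuming both lists are plain text with
one name-version-release per line, for example the attached list and the
output of "rpm -qa" on the WN. The sample contents below are
illustrative only (package names taken from this thread) and the
function names are hypothetical.

```python
def read_package_list(text):
    """Return the set of non-empty, stripped lines in `text`."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def missing_packages(expected_text, installed_text):
    """Packages named in the expected list but absent from the installed one."""
    expected = read_package_list(expected_text)
    installed = read_package_list(installed_text)
    return sorted(expected - installed)


# Illustrative data only: which packages are actually installed on a
# given WN has to come from that node.
EXPECTED = """\
gpt-2.2.9-2
vdt_globus_essentials-VDTALT1.1.8-9
aliroot-3.09.06-1-1
"""

INSTALLED = """\
gpt-2.2.9-2
"""

if __name__ == "__main__":
    for name in missing_packages(EXPECTED, INSTALLED):
        print(name)
```

This compares exact name-version-release strings, so a package present
in a different version shows up as missing.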
On Wed, 2003-09-17 at 11:44, Markus SCHULZ wrote:
> Hi Joe,
> I attached the list of RPMS on the WN to the first mail. Here it is again.
>
> Maybe the rpmcfg files for the WN are corrupted. You could checkout
> the rpmlist directory from CVS and check if this is what you have
> for your WNs.
>
> markus
>
> On Wed, 17 Sep 2003, Joe Kaiser wrote:
>
> > I'm using full LCFGng, not the LITE distribution. I am not sure why this
> > is happening on the worker nodes; the pbs rpms don't get installed
> > either. Please send me the list and maybe I can track down what is
> > going on......
> >
> > Thanks,
> >
> > Joe
> >
> > On Wed, 2003-09-17 at 01:41, Markus SCHULZ wrote:
> > > Hi Joe,
> > > no, the /opt/globus directory is not shared through NFS.
> > > In case you don't see that file you are missing at least the
> > > vdt_globus_essentials-VDTALT1.1.8-9 RPM.
> > >
> > > How did you assemble the list of RPMs that you installed on the WNs?
> > >
> > > I'll attach the list of RPMs that would be there if you used the
> > > rpm lists that have been provided. You can compare it to what you have and
> > > do some educated guesswork about what's missing.
> > >
> > > markus
> > >
> > > p.s. how did you install the WNs?
> > >
> > >
> > >
> > > On Tue, 16 Sep 2003, Joe Kaiser wrote:
> > >
> > > > Oh, actually this turns out to be easy. It really isn't there because
> > > > this is running on a worker node. How do I get an /opt/globus set of
> > > > files on my worker node? Does that directory have to be NFS exported, or am
> > > > I missing an rpm or two?
> > > >
> > > > Thanks,
> > > >
> > > > Joe
> > > >
> > > >
> > > > On Tue, 2003-09-16 at 10:54, Daniels, T (Trevor) wrote:
> > > > > Joe
> > > > >
> > > > > OK, that moves it on a step. My job now executes but terminates with job
> > > > > status
> > > > >
> > > > > Printing status info for the Job :
> > > > > https://lxshare0380.cern.ch:9000/YNVkyqJVF_CxbEaA-V8E4w
> > > > > Current Status: Done (Cancelled)
> > > > > Exit code: 0
> > > > > Status Reason: /opt/globus/etc/globus-user-env.sh not found or unreadable
> > > > > Destination: hotdog46.fnal.gov:2119/jobmanager-pbs-short
> > > > > reached on: Tue Sep 16 15:51:22 2003
> > > > >
> > > > > Trevor
> > > > >
> > > > > Dr Trevor Daniels
> > > > > c/o CCLRC eSC Department Phone: (+44)|(0) 1235 778093
> > > > > Rutherford Appleton Laboratory Fax: (+44)|(0) 1235 446626
> > > > > Chilton, DIDCOT, Oxon, OX11 0QX, UK Email: [log in to unmask]
> > > > > The contents of this email are sent in confidence for the use of the
> > > > > intended recipient only. If you are not one of the intended recipients do
> > > > > not take action on it or show it to anyone else, but return this email to
> > > > > the sender and delete your copy of it.
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Joe Kaiser [mailto:[log in to unmask]]
> > > > > > Sent: Tuesday, September 16, 2003 4:40 PM
> > > > > > To: [log in to unmask]
> > > > > > Subject: Re: [LCG-ROLLOUT] pbs issues
> > > > > >
> > > > > >
> > > > > > try jobmanager-pbs-short
> > > > > >
> > > > > > On Tue, 2003-09-16 at 10:20, Daniels, T (Trevor) wrote:
> > > > > > > Joe
> > > > > > >
> > > > > > > Just recently the authentication problem has cleared. The people here have
> > > > > > > reported the response from Nikhef has been poor today, presumably network
> > > > > > > problems. So I guess that was just a glitch.
> > > > > > >
> > > > > > > I have now successfully submitted a job directly by globus-job-run to
> > > > > > > hotdog46, and am just about to try the same via the CERN RB........
> > > > > > >
> > > > > > > It failed with
> > > > > > >
> > > > > > > Status Reason: Cannot plan (a helper failed)
> > > > > > >
> > > > > > > This usually means I've specified the wrong queue on the CE. The default I
> > > > > > > use for LCG1-1_0_0 is jobmanager-lcgpbs-short - is this wrong?
> > > > > > >
> > > > > > > Trevor
> > > > > > >
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Joe Kaiser [mailto:[log in to unmask]]
> > > > > > > > Sent: Tuesday, September 16, 2003 3:55 PM
> > > > > > > > To: [log in to unmask]
> > > > > > > > Subject: Re: [LCG-ROLLOUT] pbs issues
> > > > > > > >
> > > > > > > >
> > > > > > > > Apparently that is because you aren't in the grid-mapfile. When I run
> > > > > > > > edg-mkgridmap I get the following. Is the NIKHEF VO down?
> > > > > > > >
> > > > > > > > /opt/edg/sbin/edg-mkgridmap --output --safe
> > > > > > > >
> > > > > > > > Interrupt: Hit ENTER or type command to continue
> > > > > > > > ldap search(ldap://grid-vo.nikhef.nl/ou=testbed1,o=alice,dc=eu-datagrid,dc=org): Connection failed
> > > > > > > >
> > > > > > > > Skipping /etc/grid-security/grid-mapfile writing
> > > > > > > >
> > > > > > > > Exit with error(s) (code=64)
> > > > > > > >
> > > > > > > > shell returned 64
> > > > > > > >
> > > > > > > > My /opt/edg/etc/edg-mkgridmap.conf looks like this:
> > > > > > > >
> > > > > > > > #### GROUP: group URI [lcluser]
> > > > > > > > # LCG Standard Virtual Organizations
> > > > > > > > group ldap://grid-vo.nikhef.nl/ou=testbed1,o=alice,dc=eu-datagrid,dc=org .alice
> > > > > > > > group ldap://grid-vo.nikhef.nl/ou=testbed1,o=atlas,dc=eu-datagrid,dc=org .atlas
> > > > > > > > group ldap://grid-vo.nikhef.nl/ou=tb1users,o=cms,dc=eu-datagrid,dc=org .cms
> > > > > > > > group ldap://grid-vo.nikhef.nl/ou=tb1users,o=lhcb,dc=eu-datagrid,dc=org .lhcb
> > > > > > > > group ldap://lcg-vo.cern.ch/ou=lcg1,o=dteam,dc=lcg,dc=org .dteam
> > > > > > > >
> > > > > > > > #### AUTH: authorization URI
> > > > > > > > auth ldap://lcg-registrar.cern.ch/ou=users,o=registrar,dc=lcg,dc=org
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Tue, 2003-09-16 at 03:22, Daniels, T (Trevor) wrote:
> > > > > > > > > Joe
> > > > > > > > >
> > > > > > > > > I find I can't authenticate against hotdog46.fnal.gov using my certificate
> > > > > > > > > which is registered in DTEAM, so I can't try a job submit. Here's the
> > > > > > > > > error:
> > > > > > > > >
> > > > > > > > > GRAM Authentication test failure: authentication with the remote server failed
> > > > > > > > >
> > > > > > > > > Trevor
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > -----Original Message-----
> > > > > > > > > > From: Joe Kaiser [mailto:[log in to unmask]]
> > > > > > > > > > Sent: Monday, September 15, 2003 11:06 PM
> > > > > > > > > > To: [log in to unmask]
> > > > > > > > > > Subject: [LCG-ROLLOUT] pbs issues
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > I have decided to go for the NFS shared directories for PBS because
> > > > > > > > > > getting PBS to work with kerberos is an undertaking I have neither the
> > > > > > > > > > time nor the expertise for.
> > > > > > > > > >
> > > > > > > > > > LCG1 will eventually allow for other batch systems, right?
> > > > > > > > > >
> > > > > > > > > > Anyway, I need to have the home areas from the CE mounted on the WNs,
> > > > > > > > > > which is supposed to happen (as near as I can tell) if you leave
> > > > > > > > > > NO_HOME_SHARED, which I have done. The directories do not get mounted,
> > > > > > > > > > however. Right now they are mounted by hand, but a reboot will wipe that
> > > > > > > > > > out. Can you please give me the magic recipe?
> > > > > > > > > >
> > > > > > > > > > In any event, please test that you can submit jobs to fermilab and let me
> > > > > > > > > > know if there are any problems.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Joe
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > >
> > > > > > > >
> > > > > > ===================================================================
> > > > > > > > > > Joe Kaiser - Systems Administrator
> > > > > > > > > >
> > > > > > > > > > Fermi Lab
> > > > > > > > > > CD/OSS-SCS Never laugh at live dragons.
> > > > > > > > > > 630-840-6444
> > > > > > > > > > [log in to unmask]
> > > > > > > > > >
> > > > > > > >
> > > > > > ===================================================================
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> > >
> > > --
> > > *************************************************************************
> > > * *
> > > * CERN Markus W. Schulz *
> > > * Bat. 31 2-015 *
> > > * CH-1211 Geneva 23 *
> > > * *
> > > * Phone: +41 22 76 77909 *
> > > * www.cern.ch *
> > > * *
> > > *************************************************************************
> >
>
--
===================================================================
Joe Kaiser - Systems Administrator
Fermi Lab
CD/OSS-SCS Never laugh at live dragons.
630-840-6444
[log in to unmask]
===================================================================