Hi,

About NFS and WNs: it is possible to make them work together, not only for the /home folder but for whatever filesystem you want. For better performance, just modify the file /var/spool/pbs/mom_priv/config (on the WN), adding

$usecp MASTER_IP:/home /home

[please note the "$" at the beginning]

and restart pbs_mom.

This line allows the MOM to copy files directly (with a plain cp, not via ssh or scp) from a local path on the WN to a filesystem shared via NFS (e.g. /home). Indeed, this configuration is a little faster, since it avoids the authentication between master and nodes.
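
As a rough sketch of the change described above, a small shell helper could add the line only when it is not already there, so that running it twice on the same WN is harmless. The function name add_usecp is made up for illustration, and MASTER_IP is the same placeholder as above; adapt both to your site.

```shell
#!/bin/sh
# Sketch only: add a "$usecp" mapping to a pbs_mom config file
# unless exactly that line is already present.
# add_usecp is a hypothetical helper name, not part of PBS/Torque.
add_usecp() {
    config_file=$1   # e.g. /var/spool/pbs/mom_priv/config
    mapping=$2       # e.g. 'MASTER_IP:/home /home' (placeholder host)
    line="\$usecp $mapping"
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -Fxq -- "$line" "$config_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$config_file"
    fi
}
```

On a WN one would then call, as root, add_usecp /var/spool/pbs/mom_priv/config 'MASTER_IP:/home /home' and restart pbs_mom afterwards; how the restart is done (init script or otherwise) depends on the installation.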

Cheers


Vega Forneris

+-----------------------------------------------+
ESA-ESRIN
Unix Systems Administrator
Via Galileo Galilei
00044 Frascati (Rm) - Italy
Phone +39 06 94180581
+-----------------------------------------------+
Vitrociset S.p.A.
Unix System Administrator
Via Tiburtina 1020
00100 Roma - Italy
Phone +39 06 8820 4297    
+-----------------------------------------------+



"Maarten Litmaath, CERN" <[log in to unmask]>
Sent by: LHC Computer Grid - Rollout <[log in to unmask]>

21/03/2005 15:16
Please respond to LHC Computer Grid - Rollout

       
        To:        [log in to unmask]
        cc:        
        Subject:        Re: [LCG-ROLLOUT] Globus-gatekeeper - GSS authentication failure



On Mon, 21 Mar 2005, Piotr Siwczak wrote:

> Hi,
>
> Thank You for hints.
>
> I found a few issues that could be sources of my troubles.
>
> I have my /home filesystem exported to the WNs by NFS. I encountered a
> few posts regarding the "lcgpbs" jobmanager and NFS - some say that it's
> impossible to make them work together. BTW, the "fork" jobmanager works
> fine.
>
> Can this be the reason for my trouble?

AFAIK, the "lcgpbs" jobmanager should work fine with NFS.
It does not _need_ NFS (it was invented to allow WNs to use their own FS).

> On Fri, 18 Mar 2005, Maarten Litmaath wrote:
>
> > Piotr Siwczak wrote:
> >
> >> Hi,
> >>
> >> I can see jobs are submitted to my site, but each one quits with "aborted"
> >> state. In globus-gatekeeper logs I find:
> >>
> >>
> >> Failed reading length 0
> >> GSS authentication failure
> >>     globus_gss_assist token :3: read failure: Connection closed
> >> Failure: GSS failed Major:01090000 Minor:00000000 Token:00000003
> >>
> >> Failure: GSS failed Major:01090000 Minor:00000000 Token:00000003
> >>
> >> Can You help?
> >
> > For the ID of an aborted job run this:
> >
> >    edg-job-get-logging-info -v 1 $job_ID
> >
> > The final reason will be "retrycount hit", but look at the earlier errors.
> >
> > Next the Wiki FAQ for job submission problems may tell you what to check:
> >
> >    http://goc.grid.sinica.edu.tw/gocwiki/SiteProblemsFollowUpFaq
> >
> > BTW, I noticed your site GIIS is down:
> >
> > $ ldapsearch -x -H ldap://ce.egee.man.poznan.pl:2135 -b mds-vo-name=egee.man.poznan.pl,o=grid
> > ldap_bind: Can't contact LDAP server
> >
>