OK, I corrected the Globus port range environment variable, and
now it works: the job is correctly submitted and running.
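
For anyone who hits the same error, here is a minimal sketch of a sanity check on that variable. It assumes the variable in question is GLOBUS_TCP_PORT_RANGE, which Globus documents as "min,max"; the space-separated form is also accepted here, since this message does not say which form was in use. The function name check_port_range is hypothetical and not part of Globus or yaim.

```shell
#!/bin/sh
# Hypothetical helper (not part of Globus or yaim): succeeds only if the
# argument looks like "MIN,MAX" or "MIN MAX" with 1 <= MIN <= MAX <= 65535.
check_port_range() {
    range=$1
    # Turn a comma separator into a space, then split into positional params.
    set -- $(printf '%s' "$range" | tr ',' ' ')
    # Exactly two fields are required.
    [ $# -eq 2 ] || return 1
    min=$1
    max=$2
    # Reject anything that is not made of digits only.
    case $min$max in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$min" -ge 1 ] && [ "$max" -le 65535 ] && [ "$min" -le "$max" ]
}

# Assumption: the variable meant above is GLOBUS_TCP_PORT_RANGE.
if check_port_range "${GLOBUS_TCP_PORT_RANGE:-}"; then
    echo "port range looks sane: $GLOBUS_TCP_PORT_RANGE"
else
    echo "port range missing or malformed: '${GLOBUS_TCP_PORT_RANGE:-}'"
fi
```

This only checks the shape of the value; it says nothing about whether the firewall actually has those ports open.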

Thanks all for your help :)

By the way, Christine, I installed glite3 on the CE (lcg-CE).

Cheers,
Jean

On 22 Nov 07 at 15:36, LEROY Christine wrote:

> Are you installing glite3.1?
>
> If so, can you check that you have these files:
>
> /opt/globus/libexec/globus-script-initializer
> /opt/globus/libexec/globus-sh-tools-vars.sh
> /opt/globus/lib/perl/Globus/Core/Paths.pm
>
>
>
>
>
>
> From: LHC Computer Grid - Rollout [mailto:LCG-[log in to unmask]]
> On behalf of Jean Salzemann
> Sent: Thursday 22 November 2007 15:25
> To: [log in to unmask]
> Subject: Re: [LCG-ROLLOUT] Job failing on CE with unspecified
> gridmanager error
>
>
>
> Hi Christine,
>
>
>
> We have the correct jobmanager types in /opt/globus/etc/grid-services,
> and we installed the CE with yaim using the lcg-CE_torque package
> (lcgpbs jobmanager and Torque batch system) on sl309. SSH works
> properly between the WNs and the CE: I can submit a qsub job from the
> CE and get the output (scp works without a password prompt). From what
> I can see, the jobs never enter the PBS queue; they seem to be stuck
> between the gatekeeper and PBS, so to speak. So it does not look like
> a stage-in/stage-out problem.
>
>
>
>
>
> On 22 Nov 07 at 14:55, LEROY Christine wrote:
>
>
>
>
> Hello Jean,
>
>
>
> What do you have in this directory:
>
> /opt/globus/etc/grid-services/
>
> (we have, for example:
>
> # ls /opt/globus/etc/grid-services/
>
> jobmanager  jobmanager-fork  jobmanager-lcgpbs )
>
>
>
>
>
>
> Do you use pbs or lcgpbs?
>
> Did you use yaim? Maybe you need to be careful with the variables?
>
> If you use lcgpbs, is ssh working properly between the WNs and the CE?
>
> From: LHC Computer Grid - Rollout [mailto:LCG-[log in to unmask]]
> On behalf of Jean Salzemann
> Sent: Thursday 22 November 2007 14:35
> To: [log in to unmask]
> Subject: [LCG-ROLLOUT] Job failing on CE with unspecified
> gridmanager error
>
>
>
> Dear all,
>
>
>
> We've set up a site in Vietnam, and I'm seeing behaviour I have
> never seen before when submitting jobs. The jobs fail on the CE with
> a dreadful "Got a job held event, reason: Unspecified gridmanager
> error", but I can't figure out why.
>
>
>
> qsub submissions work, globus-job-run (/bin/hostname) seems to work
> with fork (I'm not sure about lcgpbs, because the call returns to the
> prompt without any output), and the PBS ACLs seem correct. However,
> in /var/log/messages I get this whenever the user is mapped to a
> local account and the job is supposed to be sent to PBS:
>
>
>
> Nov 22 19:43:03 ce gridinfo: [10770-10924] Job 1195735288:lcgpbs:internal_2961450261:10714.1195735287 FAILED during submission to batch system lcgpbs
>
>
>
> But I have absolutely no idea what could be causing this. Any
> ideas? :)
>
>
>
> Thanks in advance,
>
> Jean
>
>
>
>
>
>
>