Hi,
I've attached the log files for the two jobs (the failed DESY submission and the successful gridpp one):
###########
Failed job
###########
[]globus-job-run hepbf4.ph.qmul.ac.uk:2119/jobmanager-pbs /bin/hostname
GRAM Job submission failed because the job manager failed to open stdout
(error code 73)
[root@hepbf4 desy001]# cat gram_job_mgr_19394.log
7/7 13:04:06 -----------------------------------------
7/7 13:04:06 JM: Entering gram_job_manager main().
7/7 13:04:06 JM: HOME = /home/edg/desy001
7/7 13:04:06 JM: LOGNAME = desy001
7/7 13:04:06 JM: GLOBUS_ID = /O=GermanGrid/OU=DESY/CN=Jacek Nowak
7/7 13:04:06 JM: X509_CERT_DIR = /etc/grid-security/certificates
7/7 13:04:06 JM: unable to get KRB5CCNAME from the environment.
7/7 13:04:06 JM: unable to get NLSPATH from the environment.
7/7 13:04:06 JM: unable to get TZ from the environment.
7/7 13:04:06 JM: GLOBUS_DEPLOY_PATH = /opt/globus
7/7 13:04:06 JM: jobmanager_libexecdir = /opt/globus/libexec
7/7 13:04:06 JM: context loaded
7/7 13:04:06 JM: client contact = https://hepbf3.ph.qmul.ac.uk:4316/
7/7 13:04:06 JM: rsl_specification = &("rsl_substitution" =
("GLOBUSRUN_GASS_URL" "https://hepbf3.ph.qmul.ac.uk:4315" ) )("stderr" =
$("GLOBUSRUN_GASS_URL") # "/dev/stderr" )("stdout" =
$("GLOBUSRUN_GASS_URL") # "/dev/stdout" )("executable" = "/bin/hostname" )
7/7 13:04:06 JM: job status mask = 1048575
7/7 13:04:06 JM: final rsl specification >>>>
7/7 13:04:06 &("rsl_substitution" = ("GLOBUSRUN_GASS_URL"
"https://hepbf3.ph.qmul.ac.uk:4315" ) )("stderr" =
"https://hepbf3.ph.qmul.ac.uk:4315/dev/stderr" )("stdout" =
"https://hepbf3.ph.qmul.ac.uk:4315/dev/stdout" )("executable" =
"/bin/hostname" )
7/7 13:04:06 JM: <<<< final rsl specification
7/7 13:04:06 JM: staging file = /bin/hostname
7/7 13:04:06 JM: new name = /bin/hostname
7/7 13:04:06 JM: staging file = /dev/null
7/7 13:04:06 JM: new name = /dev/null
7/7 13:04:06 JM: opening stdout fd
7/7 13:04:06 JM: error opening outfile
7/7 13:04:06 JM: request failed at startup removed user proxy -->
/tmp/x509up_p19393.fileIMQYOq.0
7/7 13:04:06 JM: set JM env X509_USER_PROXY to point to
/tmp/x509up_p19393.fileIMQYOq.0
7/7 13:04:06 JM: problem reading user proxy
7/7 13:04:06 JM: request failed with error 73 (the job manager failed to
open stdout), sending message to client
7/7 13:04:06 JM: before sending to client: rc=0 (Success)
7/7 13:04:06 JM: sending to client:
HTTP/1.1 200 OK
Content-Type: application/x-globus-gram
Content-Length: 34
protocol-version: 2
status: 73
7/7 13:04:06 -------------
7/7 13:04:06 JM: major=0 minor=0
7/7 13:04:06 JM: we're done. doing cleanup
7/7 13:04:06 JM: Cleaning GASS cache
7/7 13:04:06 JM: freeing RSL.
7/7 13:04:06 JM: starting deactivate routines.
7/7 13:04:06 JM: exiting globus_gram_job_manager.
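Looking at where the failed run stops: it dies right after "JM: opening stdout fd" with "error opening outfile", then reports "problem reading user proxy", and never relocates the proxy into the GASS cache or calls the PBS submit script. My guess (only a guess) is that the job manager cannot create files under the desy001 home directory (/home/edg/desy001), i.e. the .globus/.gass_cache tree is missing, unwritable or owned by the wrong uid. A minimal sketch of the checks I'd run on the CE as root, assuming that is the cause (paths taken from the log above):

  # run on hepbf4 as root; checks whether the mapped account can actually
  # write where the job manager wants to put its GASS cache entries
  su - desy001 -c 'id; echo $HOME; ls -ld $HOME $HOME/.globus $HOME/.globus/.gass_cache'
  su - desy001 -c 'mkdir -p $HOME/.globus/.gass_cache && touch $HOME/.globus/.gass_cache/.write_test && echo cache dir WRITABLE || echo cache dir NOT writable'
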
####################
The successful job:
####################
[]globus-job-run hepbf4.ph.qmul.ac.uk:2119/jobmanager-pbs /bin/hostname
hepbf5.ph.qmul.ac.uk
7/7 13:12:41 -----------------------------------------
7/7 13:12:41 JM: Entering gram_job_manager main().
7/7 13:12:41 JM: HOME = /home/edg/gridpp001
7/7 13:12:41 JM: LOGNAME = gridpp001
7/7 13:12:41 JM: GLOBUS_ID = /O=Grid/O=UKHEP/OU=ph.qmul.ac.uk/CN=D.Kant
7/7 13:12:41 JM: X509_CERT_DIR = /etc/grid-security/certificates
7/7 13:12:41 JM: unable to get KRB5CCNAME from the environment.
7/7 13:12:41 JM: unable to get NLSPATH from the environment.
7/7 13:12:41 JM: unable to get TZ from the environment.
7/7 13:12:41 JM: GLOBUS_DEPLOY_PATH = /opt/globus
7/7 13:12:41 JM: jobmanager_libexecdir = /opt/globus/libexec
7/7 13:12:41 JM: context loaded
7/7 13:12:41 JM: client contact = https://hepbf3.ph.qmul.ac.uk:4328/
7/7 13:12:41 JM: rsl_specification = &("rsl_substitution" =
("GLOBUSRUN_GASS_URL" "https://hepbf3.ph.qmul.ac.uk:4327" ) )("stderr" =
$("GLOBUSRUN_GASS_URL") # "/dev/stderr" )("stdout" =
$("GLOBUSRUN_GASS_URL") # "/dev/stdout" )("executable" = "/bin/hostname" )
7/7 13:12:41 JM: job status mask = 1048575
7/7 13:12:41 JM: final rsl specification >>>>
7/7 13:12:41 &("rsl_substitution" = ("GLOBUSRUN_GASS_URL"
"https://hepbf3.ph.qmul.ac.uk:4327" ) )("stderr" =
"https://hepbf3.ph.qmul.ac.uk:4327/dev/stderr" )("stdout" =
"https://hepbf3.ph.qmul.ac.uk:4327/dev/stdout" )("executable" =
"/bin/hostname" )
7/7 13:12:41 JM: <<<< final rsl specification
7/7 13:12:41 JM: staging file = /bin/hostname
7/7 13:12:41 JM: new name = /bin/hostname
7/7 13:12:41 JM: staging file = /dev/null
7/7 13:12:41 JM: new name = /dev/null
7/7 13:12:41 JM: opening stdout fd
7/7 13:12:41 JM: opening stderr fd
7/7 13:12:41 JM: user proxy relocation
7/7 13:12:41 JM: Relocating user proxy file to the gass cache
7/7 13:12:41 JM: Copying user proxy file from -->
/tmp/x509up_p19948.fileICamk2.0
7/7 13:12:41 JM: to -->
/home/edg/gridpp001/.globus/.gass_cache/local/md5/eb/e7/38/d5a247db8eb8dac2b314643e2c/md5/91/e3/5e/42ad26c25e265bba5d8f7e6209/data
7/7 13:12:41 JM: GSSAPI type is GSI
7/7 13:12:41 JM: set JM env X509_USER_PROXY to point to
/home/edg/gridpp001/.globus/.gass_cache/local/md5/eb/e7/38/d5a247db8eb8dac2b314643e2c/md5/91/e3/5e/42ad26c25e265bba5d8f7e6209/data
7/7 13:12:41 JMI: testing job manager scripts for type pbs exist and
permissions are ok.
7/7 13:12:41 JMI: job manager type is pbs.
7/7 13:12:41 JMI: in globus_l_gram_request_shell()
7/7 13:12:42 JMI: local stdout filename =
/home/edg/gridpp001/.globus/.gass_cache/local/md5/eb/e7/38/d5a247db8eb8dac2b314643e2c/md5/60/b9/40/a901238bc7fb878341d64a7094/data.
7/7 13:12:42 JMI: local stderr filename =
/home/edg/gridpp001/.globus/.gass_cache/local/md5/eb/e7/38/d5a247db8eb8dac2b314643e2c/md5/e5/ed/89/83f35ec8ed453a4ba8ee65195f/data.
7/7 13:12:42 JMI: cmd = /opt/globus/libexec/globus-script-pbs-submit
/tmp/grami0KK9kj
in gram_script_pbs_submit
============================================
JM_SCRIPT: ====argument file contents====
grami_logfile='/home/edg/gridpp001/gram_job_mgr_19949.log'
grami_directory='/home/edg/gridpp001'
grami_program='/bin/hostname'
grami_args=''
grami_env='"GLOBUS_GRAM_MYJOB_CONTACT"
"URLx-nexus://hepbf4.ph.qmul.ac.uk:1549/" "X509_CERT_DIR"
"/etc/grid-security/certificates" "GLOBUS_GRAM_JOB_CONTACT"
"https://hepbf4.ph.qmul.ac.uk:1548/19949/1057579961/" "GLOBUS_LOCATION"
"/opt/globus" "X509_USER_PROXY"
"/home/edg/gridpp001/.globus/.gass_cache/local/md5/eb/e7/38/d5a247db8eb8dac2b314643e2c/md5/91/e3/5e/42ad26c25e265bba5d8f7e6209/data"'
grami_count='1'
grami_stdin='/dev/null'
grami_stdout='/home/edg/gridpp001/.globus/.gass_cache/local/md5/eb/e7/38/d5a247db8eb8dac2b314643e2c/md5/60/b9/40/a901238bc7fb878341d64a7094/data'
grami_stderr='/home/edg/gridpp001/.globus/.gass_cache/local/md5/eb/e7/38/d5a247db8eb8dac2b314643e2c/md5/e5/ed/89/83f35ec8ed453a4ba8ee65195f/data'
grami_max_wall_time='0'
grami_max_cpu_time='0'
grami_max_time='0'
grami_start_time='none'
grami_min_memory='0'
grami_max_memory='0'
grami_host_count='0'
grami_job_type='2'
grami_queue=''
grami_project=''
grami_reservation_handle=''
grami_uniq_id='19949.1057579961'
JM_SCRIPT: ====argument file contents====
7/7 13:12:42 JMI: while return_buf =
testing for unsupported parameters
testing for queue attribute specification
no queue attribute specified
JM_SCRIPT: testing jobtype
testing for per process cpu time limit
No per process cpu time specified, using [queue default] per process cpu
time
testing for process wall time limit
No process wall time specified, using [queue default] process wall time
starting to build PBS job script
PBS job script successfully built
submitting PBS job script
7/7 13:12:42 JMI: while return_buf = GRAM_SCRIPT_JOB_ID:619.hepbf4
7/7 13:12:42 JMI: job id = 619.hepbf4
job submitted successfully!
returning job state: 1
7/7 13:12:42 JMI: while return_buf = GRAM_SCRIPT_JOB_ID:619.hepbf4
7/7 13:12:42 JMI: job id = 619.hepbf4
7/7 13:12:42 JMI: while return_buf = GRAM_SCRIPT_SUCCESS:1
exiting gram_script_pbs_submit\n\n
7/7 13:12:42 JMI: return_buf = GRAM_SCRIPT_SUCCESS:1
7/7 13:12:42 JMI: ret value = 1
7/7 13:12:42 JMI: returning with success
7/7 13:12:42 JM: request was successful, sending message to client
7/7 13:12:42 JM: before sending to client: rc=0 (Success)
7/7 13:12:42 JM: sending to client:
HTTP/1.1 200 OK
Content-Type: application/x-globus-gram
Content-Length: 103
protocol-version: 2
status: 0
job-manager-url: https://hepbf4.ph.qmul.ac.uk:1548/19949/1057579961/
7/7 13:12:42 -------------
7/7 13:12:42 JM: major=0 minor=0
7/7 13:12:42 JM: NOT empty client callback list.
7/7 13:12:42 JM: sending callback of status 1 (failure code 0) to
https://hepbf3.ph.qmul.ac.uk:4328/.
7/7 13:12:42 JM: poll frequency = 30
7/7 13:12:42 JM: status directory not specified, cleanup cannot proceed.
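For comparison, the successful run gets past "opening stdout fd", relocates my proxy into /home/edg/gridpp001/.globus/.gass_cache/... and goes on to call globus-script-pbs-submit, so the GRAM/PBS machinery itself looks fine; only the desy001 account behaves differently. A quick side-by-side of the two pool accounts might show the difference (a sketch, assuming the home directories should look identical apart from ownership):

  # compare the working (gridpp001) and failing (desy001) accounts on the CE:
  # uid/gid mapping, home directory ownership/permissions, and any disk quota
  for u in gridpp001 desy001; do
    echo "== $u =="
    getent passwd "$u"
    ls -ld "/home/edg/$u" "/home/edg/$u/.globus" 2>&1
    quota -u "$u" 2>/dev/null || true
  done
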
>
>
> Hi Everyone,
>
> I've been playing with the DESY VO (formerly the H1 VO) recently and,
> after creating appropriate user accounts in the same way as was done for
> gridpp, have encountered some errors which have not been easy to tie down.
>
> Have you ever come across this error message before?
>
> "GRAM Job submission failed because the job manager failed to open stdout
> (error code 73)"
>
> This is what the UI reports when I run globus-job-run against hepbf4 using a
> certificate from someone who is a member of the DESY VO.
>
> However, submitting the same job using my own certificate from
> the same machine to the same CE works with no error messages.
>
> The only obvious difference is the VO??
>
> The gatekeeper appears to authenticate the DESY user properly but
> never manages to call /opt/globus/libexec/globus-script-pbs-submit.
>
>
> Notice: 5: Authenticated globus user: /O=GermanGrid/OU=DESY/CN=Jacek Nowak
> Notice: 0: GRID_SECURITY_HTTP_BODY_FD=6
> Notice: 0: lcasmod_name = /opt/edg/lib/lcas/lcas.mod
> edg_wp4_lcas: user is /O=GermanGrid/OU=DESY/CN=Jacek Nowak
> edg_wp4_lcas: LCAS home is /opt/edg/etc/lcas
> lcas_userallow: checking allowed users in gridmapfile
> lcas_userallow: user /O=GermanGrid/OU=DESY/CN=Jacek Nowak
> edg_wp4_lcas: allowed user check passed
> lcas_userban: checking banned users in
> /opt/edg/etc/lcas/ban_users.db
> edg_wp4_lcas: ban-user check passed
> lcas_clockcheck: checking timeslots in
> /opt/edg/etc/lcas/timeslots.db
> lcas_clockcheck: Checking slot 1 out 1
> edg_wp4_lcas: wall clock check passed
> Notice: 5: Authenticated globus user is still:
> /O=GermanGrid/OU=DESY/CN=Jacek Nowak
> Notice: 5: Requested service: jobmanager-pbs
> Notice: 5: Authorized as local user: desy001
> Notice: 5: Authorized as local uid: 2501
> Notice: 5: and local gid: 2500
> Notice: 0: executing /opt/globus/libexec/globus-job-manager
> Notice: 0: GRID_SECURITY_CONTEXT_FD=11
> Notice: 0: Child 25133 started
>
>
> Cheers, Dave.
>
>
> On Thu, 3 Jul 2003, Steve Traylen wrote:
>
> > On Thu, 3 Jul 2003, Dave Kant wrote:
> >
> > > Hello,
> > >
> > > Following up on some recent work that I've been doing with h1.
> > >
> > > I noticed that the SE has configuration files under the /opt/edg/etc
> > > tree for each of the VOs e.g /opt/edg/etc/gridpp
> > >
> > > I've attempted to create such a directory for h1 by adding a few
> > > entries to the site-cfg.h.qmul file such as:
> > >
> > > #define SE_VO_H1
> > > #define SE_GDMP_REP_CAT_H1_PWD h1h1h1
> > > #define CE_IP_RUNTIMEENV10 H1
> > > #define SE_GDMP_VOS h1,gridpp,tutor,iteam,wpsix
> > > #define SE_VO_ h1:SE_GDMP_AREA/h1,...
> > >
> > > After "touch *", "mkxprof -v -A hepbf2" and a re-boot of hepbf2,
> > > the /opt/edg/etc/h1 directory did not exist.
> > >
> > >
> > > Is there something seriously wrong with what I'm doing here?
> >
> >
> > I would not say it was seriously wrong, but unfortunately life is not that
> > simple.
> >
> > You will find lots of
> >
> > #ifdef SE_VO_ALICE
> > lots of alice extras.
> >
> > #endif
> >
> > scattered throughout the profile.
> >
> > Probably best if you use the gridpp VO add-on bits and replace
> > gridpp with hone.
> >
> > I would recommend "hone" as a prefix rather than h1, since pool accounts
> > and many other things may do something weird with an integer in the
> > prefix.
> >
> > Steve
> >
> >
> >
> >
> >
> > >
> > > Dave.
> > >
> >
> > --
> > Steve Traylen
> > [log in to unmask]
> > http://www.gridpp.ac.uk/
> >
>
>
--
--------------------------------------------------------------
Department of Physics | Dr Dave Kant
Queen Mary College | TEL/FaX: +44 (0)20 7882 5054
Mile End Road London E1 4NS | e-mail : [log in to unmask]
--------------------------------------------------------------