Hi,
we are testing the site gridce.pg.infn.it, and jobs submitted to it
remain in the Ready state for a long time before aborting.
A simple direct command, however, works fine:
$ globus-job-run gridce.pg.infn.it/jobmanager-lcgpbs -queue cert /bin/pwd
/home/infngrid007/globus-tmp.node235.8182.0
After the job submission, the perl process correctly appears on the CE:
2457 8230 0.1 0.3 4548 2816 ? S 09:54 0:00 perl /home/infngrid007/.globus/.gass_cache/local/md5/87/bc7978431a697555a6502b0d97e0f6/md5/4d/528aa9c06455c127e34d86161aa2b4/data --dest-url=https://gridit-cert-rb.cnaf.infn.it:20001/tmp/condor_g_scratch.0x8c36900.14730/grid-monitor.gridce.pg.infn.it:2119.1/grid-monitor-job-status
2457 8233 2.4 0.9 10244 8372 ? S 09:54 0:00 \_ perl /tmp/grid_manager_monitor_agent.infngrid007.8230.1000 --delete-self --maxtime=3600s
but nothing else happens.
On the WMS, the Condor-G logs contain the following lines:
...
000 (053.000.000) 09/16 09:53:44 Job submitted from host:
<131.154.99.40:20335>
(https://lb009.cnaf.infn.it:9000/hWcl6dGVN7Z3JgXHbUKTkg)
(UI=000000:NS=0000000004:WM=000004:BH=0000000000:JSS=000003:LM=000000:LRMS=000000:APP=000000:LBS=000000) (0)
...
020 (053.000.000) 09/16 10:09:53 Detected Down Globus Resource
RM-Contact: gridce.pg.infn.it:2119/jobmanager-lcgpbs
...
026 (053.000.000) 09/16 10:09:53 Detected Down Grid Resource
GridResource: gt2 gridce.pg.infn.it:2119/jobmanager-lcgpbs
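To see how long Condor-G waited before declaring the resource down, the event headers in the excerpt above can be parsed mechanically. The following is only a minimal sketch: the function names `parse_events` and `delay_until_down` are hypothetical, the regular expression is inferred from the three event lines quoted here, and the `year` argument is a placeholder because the log lines carry no year.

```python
import re
from datetime import datetime

# Event header as it appears in the excerpt, e.g.
# "020 (053.000.000) 09/16 10:09:53 Detected Down Globus Resource"
EVENT_RE = re.compile(
    r"^(?P<code>\d{3}) \((?P<job>\d+\.\d+\.\d+)\) "
    r"(?P<stamp>\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (?P<text>.*)$"
)


def parse_events(log_text, year=2000):
    """Return (code, job, timestamp, text) for every event header line.

    The year is an assumption: the log itself only records month/day.
    """
    events = []
    for line in log_text.splitlines():
        m = EVENT_RE.match(line.strip())
        if m:
            when = datetime.strptime(f"{year}/{m['stamp']}", "%Y/%m/%d %H:%M:%S")
            events.append((m["code"], m["job"], when, m["text"]))
    return events


def delay_until_down(events):
    """Time between the first submit event (000) and the first 'Detected Down' event."""
    submitted = next((e[2] for e in events if e[0] == "000"), None)
    down = next((e[2] for e in events if "Detected Down" in e[3]), None)
    if submitted is None or down is None:
        return None
    return down - submitted


# The lines quoted in this message, used as sample input.
SAMPLE = """\
000 (053.000.000) 09/16 09:53:44 Job submitted from host:
<131.154.99.40:20335>
...
020 (053.000.000) 09/16 10:09:53 Detected Down Globus Resource
RM-Contact: gridce.pg.infn.it:2119/jobmanager-lcgpbs
...
026 (053.000.000) 09/16 10:09:53 Detected Down Grid Resource
GridResource: gt2 gridce.pg.infn.it:2119/jobmanager-lcgpbs
"""

events = parse_events(SAMPLE)
print(delay_until_down(events))  # 0:16:09
```

For the lines quoted here, the gap between submission and the first "Detected Down" event is 16 minutes and 9 seconds.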
The site admins are sure that no firewall is blocking the necessary ports, so
I have no idea what may be causing this behaviour (clock skew, perhaps?).
Have you ever seen a similar problem?
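As a way to double-check the firewall assumption independently, a plain TCP connect test can be run in both directions. This is a minimal sketch, not a full diagnosis: the helper names `port_open` and `check_all` are hypothetical, and the hosts and ports are simply the ones visible above (2119 from the RM-Contact line, 20001 from the --dest-url of the grid monitor process). A successful connect only shows that the port is reachable, not that the Globus services behind it work.

```python
import socket


def port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_all(targets, timeout=5.0):
    """Map each (host, port) pair to the result of port_open."""
    return {(host, port): port_open(host, port, timeout) for host, port in targets}
```

Run from the WMS, `check_all([("gridce.pg.infn.it", 2119)])` would test the gatekeeper port; run from the CE, `check_all([("gridit-cert-rb.cnaf.infn.it", 20001)])` would test the callback port that the grid monitor reports to.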
Cheers,
Alessandro
P.S. If needed, you can submit jobs to this site using
glite-rb-01.cnaf.infn.it
--
Dr. Alessandro Paolini
INFN - CNAF
Viale Berti Pichat 6/2
40127 Bologna
Italy
tel: +39 051 6092723
fax: +39 051 6092916
ICQ: 192172027
skype: alex.paolini
**********************
"I believe in the power of laughter and tears"
"as an antidote to hatred and terror"
"a day without a smile"
"is a day lost" >>> Charlie Chaplin