Ciao Alessandro,
> just to inform you that the problem was caused by the following
> environment variables missing [on the CE]:
>
> GLOBUS_LOCATION=/opt/globus
> GLOBUS_GMA=true
> GLOBUS_TCP_PORT_RANGE=20000,25000
>
> [...]
> >> 020 (053.000.000) 09/16 10:09:53 Detected Down Globus Resource
> >> RM-Contact: gridce.pg.infn.it:2119/jobmanager-lcgpbs
> >> ...
> >> 026 (053.000.000) 09/16 10:09:53 Detected Down Grid Resource
> >> GridResource: gt2 gridce.pg.infn.it:2119/jobmanager-lcgpbs
> >>
> >> the site-admins are sure that no firewall blocks the necessary ports,
> >> so I haven't any idea what may cause this behaviour (clocks?)
> > after a bit, in the condor log appeared also the following lines:
> >
> > ...
> > 019 (053.000.000) 09/16 10:31:14 Globus Resource Back Up
> > RM-Contact: gridce.pg.infn.it:2119/jobmanager-lcgpbs
> > ...
> > 025 (053.000.000) 09/16 10:31:15 Grid Resource Back Up
> > GridResource: gt2 gridce.pg.infn.it:2119/jobmanager-lcgpbs
> > ...
> [...]
> > ---
> > Event: Transfer
> > - Arrived = Wed Sep 16 10:47:08 2009 CEST
> > - Dest host = unavailable
> > - Dest instance = /var/glite/logmonitor/CondorG.log/CondorG.1229530674.log
> > - Dest jobid = unavailable
> > - Destination = LRMS
> > - Host = gridit-cert-rb.cnaf.infn.it
> > - Reason = 8 the user cancelled the job
> > - Result = FAIL
> > - Source = LogMonitor
I have updated the GOC Wiki:
http://goc.grid.sinica.edu.tw/gocwiki/SiteProblemsFollowUpFaq
In particular:
http://goc.grid.sinica.edu.tw/gocwiki/Jobs_sent_to_some_CE_stay_in_Ready_state_forever
http://goc.grid.sinica.edu.tw/gocwiki/8_the_user_cancelled_the_job
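For anyone hitting the same symptom, here is a minimal sketch of how one might check for the three variables quoted above on the CE. The function name check_globus_env is only an illustration (it is not part of Globus or gLite), and the sketch only tests that the variables are set and non-empty in the current environment, not that their values are correct for a given site:

```shell
#!/bin/sh
# Hypothetical helper: report which of the variables named in the mail
# are unset or empty in the current environment.
# Returns 0 only if all three are set.
check_globus_env() {
    missing=0
    for name in GLOBUS_LOCATION GLOBUS_GMA GLOBUS_TCP_PORT_RANGE; do
        # Indirect lookup of the variable whose name is in $name (POSIX sh).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}
```

Calling check_globus_env from an interactive login shell only tells you about that shell. To be meaningful, it should presumably run in the same environment that the services on the CE actually see.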