Dear Maarten,
Thanks for your reply.
We are cross-checking our network setup for anything that might be
blocking traffic.
-- Best Regards --
Adeel-ur-Rehman
-----Original Message-----
From: [log in to unmask] [mailto:[log in to unmask]]
Sent: Saturday, November 10, 2007 9:07 PM
To: Adeel-ur-Rehman
Cc: [log in to unmask]
Subject: RE: [LCG-ROLLOUT] Jobs hanging in Running state
On Sat, 10 Nov 2007, Adeel-ur-Rehman wrote:
> We have re-installed the pbs and torque rpms on our batch server and
> configured the node, this time leaving the queues in their default
> configuration. But the job execution behaviour seems to be the same.
>
> An important note: the same job (same .jdl and .sh files) sometimes gets
> stuck in the Running state and sometimes executes successfully.
>
> Also, sometimes a job gets stuck right after entering the Running state,
> and sometimes it gets stuck after spending some time in the Running
> state.
>
> A screenshot of two jobs stuck at the start of the Running state is
> attached. Even hours later, these jobs remain in the same state. Other
> jobs may enter and execute successfully, or they may get stuck in the
> same way.
Could you have a network hardware problem, e.g. in a department router?
Or too strict firewall settings? Note that Torque uses TCP _and_ UDP.
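
One rough way to test the firewall theory is to probe the Torque ports between the batch server and a worker node. The sketch below is only an illustration: the port numbers (15001 for pbs_server, 15002 for pbs_mom, 15003 for the pbs_mom resource manager) are assumed defaults and should be checked against /etc/services on your hosts, and the helper names are hypothetical. The TCP check reports whether a connection can be opened; the UDP helper only sends a datagram and cannot confirm delivery, so a packet capture on the receiving side is still needed for UDP.

```python
import socket

# Assumed default Torque ports; verify against /etc/services on your nodes.
ASSUMED_TORQUE_PORTS = {
    "pbs_server": 15001,
    "pbs_mom": 15002,
    "pbs_resmom": 15003,
}


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def send_udp_probe(host, port, payload=b"probe"):
    """Send one UDP datagram and return the number of bytes sent.

    No reply is expected, so this does not prove the packet arrived.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))


def check_host(host, ports=ASSUMED_TORQUE_PORTS):
    """Return a dict mapping each service name to its TCP reachability."""
    return {name: tcp_port_open(host, port) for name, port in ports.items()}
```

Running check_host("your-worker-node") from the batch server, and the reverse from a worker node, would show which TCP ports are reachable in each direction; "your-worker-node" is a placeholder hostname.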