Hi Chris,

> I'm getting intermittent job aborts with this error:
>
> Got a job held event, reason: Globus error 94: the jobmanager does not
> accept any new requests (shutting down)
>
> The GOC Wiki suggests that the most likely cause of this is a problem in
> the batch system, either the CE cannot submit the job or fails to track
> it properly. Since it is only intermittent I am guessing it is not a
> general configuration problem.
>
> Looking at the batch system accounting logs I can see the jobs being
> submitted fine but then something on the CE is deleting them before
> they get a chance to run:

The lcgpbs job manager will delete jobs reported with 'W' status. Torque
will put a job into that state when the stagein fails, e.g. because there
were too many concurrent ssh sessions on the CE.
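
If it helps to catch this while it is happening, here is a rough sketch of a helper that picks out jobs currently in 'W' state from plain qstat output. It assumes the default six-column Torque listing (Job id, Name, User, Time Use, S, Queue); the function name is made up for illustration, and the column layout may differ on your installation.

```python
def waiting_jobs(qstat_text):
    """Return the IDs of jobs whose state column is 'W'.

    Assumes the default Torque qstat layout, one job per row:
    Job id, Name, User, Time Use, S, Queue
    """
    jobs = []
    for line in qstat_text.splitlines():
        fields = line.split()
        # Job rows have exactly six whitespace-separated fields;
        # the header row splits into more and is skipped.
        # The dashed separator row has '-' in the state position.
        if len(fields) == 6 and fields[4] == "W":
            jobs.append(fields[0])
    return jobs
```

Feeding it the output of qstat on the CE every minute or so, and comparing the timestamps with the ssh connection counts, should show whether the deletions line up with stagein failures. Note that a job submitted with a deferred start time also shows as 'W', so a hit is only a hint.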