Hi,
When running qsub multiple times manually (or when qsub is run by CEs), occasionally I get:
qsub: Invalid credential
and the log on the batch server shows this:
02/08/2012 06:36:38;0080;PBS_Server;Req;req_reject;Reject reply code=15012(PBS_Server System error: Interrupted system call MSG=error reading unmunge data), aux=0, type=AlternateUserAuthentication, from [log in to unmask]
Similarly, worker nodes also randomly have the same problem:
02/08/2012 06:35:32;0080;PBS_Server;Req;req_reject;Reject reply code=15012(PBS_Server System error: Interrupted system call MSG=error reading unmunge data), aux=0, type=AlternateUserAuthentication, from [log in to unmask]
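To see whether these rejections cluster in bursts or are spread evenly, the server log could be tallied per minute. The following is only a rough sketch: the function name `rejects_per_minute` is made up for illustration, and the semicolon-separated field layout is assumed from the two log lines above rather than taken from the Torque documentation.

```python
from collections import Counter


def rejects_per_minute(lines, needle="error reading unmunge data"):
    """Count req_reject entries containing `needle`, grouped by minute.

    Assumes each log line looks like:
    date time;code;daemon;class;event;message
    """
    counts = Counter()
    for line in lines:
        # Split into at most 6 fields so semicolons in the message survive.
        fields = line.rstrip("\n").split(";", 5)
        if len(fields) < 6:
            continue  # not a log line in the assumed layout
        timestamp, event, message = fields[0], fields[4], fields[5]
        if event != "req_reject" or needle not in message:
            continue
        # "02/08/2012 06:36:38" -> "02/08/2012 06:36"
        counts[timestamp[:16]] += 1
    return counts
```

Feeding it the lines of the server log for a given day (for example via `open(path)`, where `path` is wherever the server logs live on the installation) gives a `Counter` keyed by minute, which can then be compared against the times at which jobs were submitted.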
Is this a known or expected problem with Torque 2.5.7-7? The setup is a UMD Torque server that currently serves 112 gLite 3.2 worker nodes, all running the same version of Torque and munge 0.5.8-8.el5.
I'm just using the default munge configuration. Should I try increasing the number of munge threads on the Torque server, or is that unlikely to be the cause of the problem?
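One way to check whether munged itself struggles under concurrent load, independently of Torque, would be to run many encode/decode round trips in parallel on the server and count the failures. This is a minimal sketch, not a definitive test: the helper names `round_trip` and `count_failures` and the default figures (200 attempts, 32 workers) are arbitrary choices for illustration. It relies only on the standard `munge -n` and `unmunge` command-line tools.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def round_trip(encode_cmd=("munge", "-n"), decode_cmd=("unmunge",)):
    """Create a credential and decode it again; return True on success."""
    try:
        # `munge -n` encodes a credential with no payload and prints it.
        cred = subprocess.run(
            encode_cmd, capture_output=True, check=True, timeout=30
        ).stdout
        # `unmunge` reads the credential on stdin and exits non-zero on error.
        subprocess.run(
            decode_cmd, input=cred, capture_output=True, check=True, timeout=30
        )
        return True
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired, OSError):
        return False


def count_failures(attempts=200, workers=32, **cmds):
    """Run `attempts` round trips across `workers` threads; return failures."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda _: round_trip(**cmds), range(attempts)))
    return results.count(False)
```

If `count_failures()` stays at zero even with a high worker count, the munge daemon is probably keeping up and the thread count is less likely to be the cause; if failures appear as the worker count rises, raising the number of munged threads would be worth trying.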
Regards,
Andrew.