Hello Maarten,
Yes, at that time I was making a last effort to fix the problem.
I applied the latest updates with yum and reconfigured the node, but the
same error appeared in the gatekeeper log. Some time before that the error
was the same, but there was nothing to update, and even a machine restart
did not change anything (nor did restarting the firewall or other
services).
I removed the torque packages, installed them again and reconfigured: the
same. I added nothing to the firewall rules, only restarted the firewall,
and again I did not see any improvement.
On the WNs there was nothing to update, and nothing obvious was added to
the firewall rules (some more general rules were already there).
My own job just hung in Running status. Then I logged out of the machines.
So I don't know what the main reason was.
Nice to hear that it is fine now. You are my lucky herald again ;)
Do I need to define the VO_OPS_DEFAULT_SE variable to point to my DPM node
(it is the only SE at my site)? Is it needed only on the WNs, not on the CE?
Do I also need to set VO_OPS_STORAGE_DIR? If yes, how?
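To check whether a WN defines the variable, a minimal Python sketch like
the following could be run in a job environment. The function name
missing_vo_vars is hypothetical, and it only tests that a variable is set
and non-empty; it says nothing about what the value should be.

```python
import os


def missing_vo_vars(env, names=("VO_OPS_DEFAULT_SE",)):
    """Return the names from `names` that are unset or empty in `env`.

    `env` is any mapping of variable names to values, e.g. os.environ.
    The default list only contains the variable the SAM tests complained
    about; other names can be passed in if they turn out to be needed.
    """
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    # Check the environment of the current process (e.g. a test job on a WN).
    missing = missing_vo_vars(os.environ)
    print("missing:", ", ".join(missing) if missing else "none")
```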
Thank you!
--
ML> Hi Alexander,
>> [...]
>> Got a job held event, reason: Globus error 12: the connection
>> to the server failed (check host and port)
ML> Since around 07:23 CET the job submission works again:
ML> did you restart the gatekeeper, change firewall rules, ...?
ML> Now SAM tests are failing because your WNs do not define
ML> the VO_OPS_DEFAULT_SE variable.
--
Best regards,
Alexander