Adrian Sevcenco wrote:
> offtopic of subject : what does globus-job-manager-marshal?
> at https://twiki.cern.ch/twiki/bin/view/EGEE/LcgCE is not much of help
> and in /opt/globus/var/log/globus-job-manager-marshal.log all i can see
> is a lot of "Timeout exceeded" and "WARN: Killing hung process nnnn"
The globus-job-manager-marshal, globus-gass-cache-marshal and globus-gma
daemons serve to limit the number of concurrent activities by
globus-job-manager, globus-gass-cache and "grid_monitor" processes.
Instead of doing the work itself, each such process contacts its
service, which does the work on behalf of each client. This allows for
a controlled amount of parallelism and avoids short-lived helper
processes that each have to load various Perl libraries, which costs
memory and CPU.
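
To illustrate the idea only (this is not the actual marshal code, which
is not shown here; the sketch is in Python purely for illustration), a
minimal example of a service that does work on behalf of many clients
with a bounded amount of parallelism might look like the following. The
names Marshal, MAX_PARALLEL and fake_job_manager_request are
hypothetical and do not correspond to real settings or functions.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL = 2  # hypothetical limit, not a real marshal setting


class Marshal:
    """Toy stand-in for a marshal daemon: one long-lived service that
    runs client requests with at most max_parallel in flight."""

    def __init__(self, max_parallel):
        # The pool size is what bounds the parallelism.
        self._pool = ThreadPoolExecutor(max_workers=max_parallel)
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0  # highest number of requests seen running at once

    def _run(self, work, arg):
        with self._lock:
            self._active += 1
            self.peak = max(self.peak, self._active)
        try:
            return work(arg)
        finally:
            with self._lock:
                self._active -= 1

    def submit(self, work, arg):
        # Clients hand over the work instead of doing it themselves.
        return self._pool.submit(self._run, work, arg)

    def shutdown(self):
        self._pool.shutdown(wait=True)


def fake_job_manager_request(job_id):
    # Placeholder for whatever a client would otherwise do itself.
    time.sleep(0.05)
    return f"job {job_id} done"


marshal = Marshal(MAX_PARALLEL)
futures = [marshal.submit(fake_job_manager_request, i) for i in range(6)]
results = [f.result() for f in futures]
marshal.shutdown()
```

Six requests are submitted, but no more than MAX_PARALLEL of them run
at any moment; the rest wait in the queue of the service.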
There are a few open issues for which a patch is in certification:
https://savannah.cern.ch/patch/index.php?2749
Later versions can be found here:
http://cern.ch/eticssoft/repository/org.glite/globus-gma/
http://cern.ch/eticssoft/repository/org.glite/globus-job-manager-marshal/
When there are timeouts, it usually means there is a configuration or
performance problem on the CE or the batch server.
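
As a rough sketch of what such log lines can correspond to (an
assumption about the general technique, not the marshal's actual
implementation), a supervisor can wait for a helper process with a
deadline and kill it when the deadline passes. The function
run_with_timeout and the 0.5 second limit are hypothetical; the message
texts are copied from the log lines quoted above.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd; if it does not finish within timeout_s seconds, kill it.

    Returns (exit_status, log_lines); exit_status is None when the
    process had to be killed."""
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout_s), []
    except subprocess.TimeoutExpired:
        log = ["Timeout exceeded",
               f"WARN: Killing hung process {proc.pid}"]
        proc.kill()
        proc.wait()  # reap the killed child
        return None, log


# A child that sleeps far longer than the limit stands in for a request
# that hangs, e.g. because the batch server does not answer.
hung = [sys.executable, "-c", "import time; time.sleep(30)"]
status, log = run_with_timeout(hung, 0.5)
```

Seen this way, frequent timeouts point at whatever the helper is
waiting for rather than at the supervisor itself.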