I have a strange problem. The FNAL site is being thrashed by
hundreds of copies of /tmp/grid_manager_monitor_agent running on the
gateway, spawned by the fork queue. Each instance takes 14M of
memory, and before long all the system memory is used up. They all
belong to the same user, who submitted a lot of jobs a few days ago but
then killed them with edg-job-cancel. What is particularly strange is
that I killed 700 of them this afternoon, and after 6 hours there were
more than 200 running again.

At the moment I have to monitor this manually. Any thoughts on the cause
or the solution would be appreciated.

Thanks, Ian