On Fri, 17 Nov 2006, Alastair Duncan wrote:
> On Fri, 2006-11-17 at 11:50, Antonio Delgado Peris wrote:
> > It seems that the guesses below are correct. We have seen that with a
> > low number of jobs the number of tomcat threads grows at a much slower
> > rate (or maybe even remains stable). For now, we have deactivated the
> > new jobwrapper tests (by emptying the jobwrapper-start.d/end.d
> > directories on all the WNs). We expect tomcat to stay alive (and
> > responsive) for longer (about a week) now, but clearly this is not an
> > optimal solution.
> >
> > I will submit a bug on the jobwrapper tests and a bug on tomcat memory
> > problems (although this might be redundant).
>
> We are currently looking at both of these situations. The knock-on
> effect of tomcat running out of memory is that the Java bug of not
> releasing connections is also triggered, which can have a detrimental
> effect on the registry, as we have witnessed. [...]
I suppose it would explain these complaints from the gridftp publishers:
2006-11-17 07:41:58,118: [ERROR] Error creating primary producer.
2006-11-17 07:41:58,119: [ERROR] Unable to locate an available Registry Service
[...]
2006-11-17 21:30:58,717: [ERROR] Failed to insert tuple.
2006-11-17 21:30:58,718: [ERROR] Could not contact R-GMA server at
monb001.cern.ch:8443 - HTTP error 400 (No Host matches server name
monb001.cern.ch)