Hi Cal
A replacement script can be found at http://hepunx.rl.ac.uk/egee/jra1-uk/LCG/edg-rgma-restart-all
We have passed a new set of RPMs to LCG, which we hope will be released at the end of the month.
Regards
Antony
> -----Original Message-----
> From: LHC Computer Grid - Rollout
> [mailto:[log in to unmask]] On Behalf Of Charles Loomis
> Sent: 14 March 2005 06:41
> To: [log in to unmask]
> Subject: Re: [LCG-ROLLOUT] rgma going mad on 2.3.1
>
>
> Hi Jeff,
>
> This is indirectly caused by the rgma-servlet-monitor cron
> entry which tries to restart the rgma daemons (and hence
> tomcat) every so often. Unfortunately, the tomcat4 shutdown
> script tries to exit gracefully and never succeeds in
> stopping the processes. This eventually exhausts the system
> memory, ....
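
A note for anyone reading this in the archive: the pile-up described above happens because each cron run starts a new restart while the previous one is still stuck. A minimal sketch of one way to guard against that is below. It assumes nothing about the R-GMA scripts themselves; run_exclusive and the lock directory are hypothetical names, not part of any LCG package.

```shell
#!/bin/sh
# Sketch only: run a command under a lock directory so that overlapping
# cron invocations skip their turn instead of piling up.
# run_exclusive and the lock path are hypothetical, not from the R-GMA rpms.

run_exclusive() {
    lockdir=$1
    shift
    # mkdir is atomic: it fails if another run already holds the lock.
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "previous run still active, skipping" >&2
        return 75
    fi
    "$@"
    status=$?
    rmdir "$lockdir"
    return $status
}
```

A cron entry would then call something like "run_exclusive /some/lock/dir /path/to/restart-script" (both paths are placeholders). One caveat: if the wrapped command is killed hard, the lock directory is left behind and has to be removed by hand.
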
>
> Eric Fede found this some time back and, I believe,
> submitted a bug report for it. (He can confirm this.) In
> the meantime, in /etc/init.d/tomcat4 you can add the hack
> below to the stop method to ensure that the shutdown actually happens:
>
> stop() {
>     echo -n "Stopping $TOMCAT_PROG: "
>
>     if [ -f /var/lock/subsys/tomcat4 ] ; then
>         if [ -x /etc/rc.d/init.d/functions ]; then
>             daemon --user $TOMCAT_USER $TOMCAT_SCRIPT stop
>         else
>             su - $TOMCAT_USER -c "$TOMCAT_SCRIPT stop"
>         fi
>         RETVAL=$?
>
>         # Hack to ensure that processes really die.
>         sleep 15
>         killall -u tomcat4 java
>         sleep 5
>
>         # Wait until no catalina processes owned by $TOMCAT_USER remain.
>         tc4run=1
>         until [ "$tc4run" = '0' ]
>         do
>             tc4run=`ps aux | grep catalina | grep -v grep | grep -c $TOMCAT_USER`
>             sleep 1
>         done
>         rm -f /var/lock/subsys/tomcat4 /var/run/tomcat4.pid
>     fi
>
>     echo
>
>     [ $RETVAL = 0 ]
> }
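
One thing to watch with the hack above: the until loop has no upper bound, so if a process never exits, the stop call itself hangs. A minimal sketch of a bounded wait is below. It works on a single PID rather than on the ps/grep count, and wait_or_kill is a hypothetical helper name; whether /var/run/tomcat4.pid actually holds the JVM's PID on a given install is an assumption to check first.

```shell
#!/bin/sh
# Sketch only: wait up to $2 seconds for process $1 to exit, then SIGKILL it.
# Returns 0 if the process exited on its own, 1 if it had to be killed.
# wait_or_kill is a hypothetical helper, not part of the tomcat4 init script.

wait_or_kill() {
    pid=$1
    timeout=$2
    waited=0
    # kill -0 sends no signal; it only tests whether the process exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            kill -9 "$pid" 2>/dev/null
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

Inside stop() this might be called as wait_or_kill "$(cat /var/run/tomcat4.pid)" 20 after the graceful shutdown attempt, in place of the fixed sleeps.
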
>
> Perhaps this has been fixed in some official way, but I've
> failed to notice. If so, I'd appreciate a pointer to the
> official fix.
>
> Cheers.
>
> Cal
>
>
>
> Jeff Templon wrote:
> > Hi,
> >
> > we've seen our R-GMA service go nuts a few times since upgrading to
> > 2.3.1. I just now caught it in the act: load on the machine was about
> > 150. Checking, there were an awful lot of 'ps' processes floating
> > around. Here is a snippet of the output of pstree:
> >
> > java
> > crond-+-86*[crond---sh---sh---edg-rgm+
> >       |-4*[crond---sh---sh---edg-rgma+
> >       |-2*[crond---sh---sh---edg-rgma+
> >       |-5*[crond---sh---sh---edg-rgma+
> >       |-6*[crond---sh---sh---edg-rgma+
> >       |-2*[crond---sh---sh---edg-rgma+
> >       |-4*[crond---sh---sh---edg-rgma+
> >       |-crond---sh---sh---edg-rgma-se+
> >       |-crond---sh---sh---edg-rgma-se+
> >       |-5*[crond---sh---sh---edg-rgma+
> >       |-crond---sh---sh---edg-rgma-se+
> >       |-2*[crond---sh---sh---edg-rgma+
> >       |-crond---sh---sh---edg-rgma-se+
> >       |-crond---sh---sh---edg-rgma-se+
> >       |-crond---sh---sh---edg-rgma-se+
> >       |-crond---sh---sh---edg-rgma-se+
> >
> >
> > looks like the culprits are:
> >
> > - edg-rgma-restart
> > - edg-rgma-service-status
> > - edg-rgma-servlet-status
> > - edg-rgma-servlet-monitor
> >
> > and there are hundreds of processes running that are all trying to
> > stop tomcat4 ...
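
For anyone wanting to check quickly whether a node is in this state, a rough sketch for counting running processes per command name is below. top_commands is a hypothetical helper name; it only tallies whatever names it is fed on stdin.

```shell
#!/bin/sh
# Sketch only: read one command name per line on stdin and print
# "count name" lines, most frequent first. The optional first argument
# limits how many lines are shown (default 10).
# top_commands is a hypothetical helper name.

top_commands() {
    sort | uniq -c | sort -rn | head -n "${1:-10}"
}
```

Typical use would be "ps -eo comm= | top_commands 5". A node showing hundreds of sh, ps or edg-rgma-* entries is probably in the runaway state described here.
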
> >
> > anybody else seeing this??
> >
> > JT
> >
>