On Tue, 1 Feb 2005, Sotomayor, Maniel wrote:

> > You do not need to keep the statistics.  If you simply want to know how many
> > jobs were handled by the RB, you can run this command:
> >
> > /opt/condor/bin/condor_history | awk 'n < 0 + $1 { n = 0 + $1 } END { print n }'
> >
> >> backup the stats for later inserting them into the new installation?
> >
> > Depends on what you want to keep exactly.  Probably the only useful data
> > would sit in the MySQL database, so you would stop the edg-wl-* services
> > and mysql, tar up /var/lib/mysql and save it somewhere; on the reinstalled RB
> > you would unpack the tar ball instead of doing the usual MySQL initialization
> > (assuming the databases on RH73 and SL3 are binary compatible).
>
> After backing up the statistics, how can I do a graceful shutdown of
> the RB?
> Is it safe to stop the WL services without affecting the execution of
> current user jobs?

Stopping the RB services does not affect jobs in steady state;
only jobs in transit (e.g. just finishing) will be lost.

If you want to allow running jobs to finish OK after the upgrade,
you must keep a lot more than just the MySQL database; the job state
information is scattered over these places:

    /etc/grid-security/gridmapdir
    /var/tmp
    /var/lib/mysql
    /var/edgwl
    /opt/edg/var/spool/edg-wl-renewd
    /opt/edg/var/log
    /opt/condor/var/condor/spool
    /opt/condor/var/condor/log

After shutting down the edg-wl-* services and the mysql daemon,
one would tar up all those directories and save the tar ball on
another node.  The RB can then be reinstalled from scratch,
and the tar ball can be unpacked on it afterwards.
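
A rough sketch of the tar step, for illustration only.  The function name
backup_rb_state and the RB_STATE_DIRS variable are made up here; only the
directory list comes from this message.  It assumes the edg-wl-* services
and the mysql daemon have already been stopped by whatever means your
installation uses, and it skips any directory that does not exist:

```shell
#!/bin/sh
# Sketch: archive the RB state directories listed above.
# Assumption: the edg-wl-* services and mysql are already stopped.
# backup_rb_state and RB_STATE_DIRS are hypothetical names.

# Paths are kept relative so the archive can be unpacked under any root.
RB_STATE_DIRS="
etc/grid-security/gridmapdir
var/tmp
var/lib/mysql
var/edgwl
opt/edg/var/spool/edg-wl-renewd
opt/edg/var/log
opt/condor/var/condor/spool
opt/condor/var/condor/log
"

backup_rb_state() {
    root=$1      # filesystem root to read from, normally /
    tarball=$2   # absolute path of the archive to write
    present=""
    for d in $RB_STATE_DIRS; do
        if [ -d "$root/$d" ]; then
            present="$present $d"
        else
            echo "skipping missing directory: $root/$d" >&2
        fi
    done
    if [ -z "$present" ]; then
        echo "nothing to back up under $root" >&2
        return 1
    fi
    # -C makes tar store the paths relative to $root.
    # $present is left unquoted on purpose: it is a space-separated list.
    tar -C "$root" -czf "$tarball" $present
}

# Example (as root, on the old RB):
#   backup_rb_state / /root/rb-state.tar.gz
# On the reinstalled RB, with the services still stopped:
#   tar -C / -xzpf /root/rb-state.tar.gz
```

Note that /var/tmp may contain plenty of unrelated files, so check the size
of the tar ball before copying it to the other node.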

Of course, if there is no need to scratch the node, there is no need
to make the tar ball either: you would do the upgrade and simply
restart the services.

For your case I suggest running the condor_history command as shown above,
to find out how popular your RB has been: if it was only used for tests,
do not bother, just scratch it.
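
In case the awk one-liner looks cryptic: it converts the first field of
each line to a number and remembers the largest value seen.  Assuming the
first column of the condor_history output is the job ID in cluster.proc
form, the result is the highest cluster number, which approximates the
number of jobs handled.  A small sketch with made-up sample lines (the
function name max_cluster_id is mine, not part of Condor):

```shell
#!/bin/sh
# Sketch: print the largest numeric value found in the first column of stdin.
# max_cluster_id is a hypothetical helper name.
max_cluster_id() {
    # "0 + $1" forces a numeric conversion; a header word such as "ID"
    # becomes 0 and is therefore never larger than n.
    awk 'n < 0 + $1 { n = 0 + $1 } END { print n }'
}

# Made-up sample input, not real condor_history output:
printf ' ID      OWNER\n   1.0   alice\n  17.0   bob\n   5.0   carol\n' | max_cluster_id
# prints 17

# Real use on the RB:
#   /opt/condor/bin/condor_history | max_cluster_id
```

If the history is empty, the command prints an empty line.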