Wonderful work Greig,
I hope we can share this with a wider audience soon: LCG, OSG and then the world!
Regards
Owen
On Fri, 13 Oct 2006 14:09:07 +0100
Greig A Cowan wrote:
> Hi everyone,
>
> Following on from this week's meeting, I've added new monitoring targets to
> the MonAMI instance that is running on my dCache head node and pool/door
> nodes.
>
> On the head node I am still monitoring the number of srmGet, Put and Copy
> requests, and am now also checking that the SRM, httpd, dcap and admin
> processes are listening on the relevant ports. Have a look at the plots in
> our Ganglia here (look for titles beginning dcache-*):
>
> http://mon.epcc.ed.ac.uk/ganglia/?r=hour&c=ScotGrid-Edinburgh&h=srm.epcc.ed.ac.uk
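
The "is the process listening on its port" check described above can be approximated with a plain TCP connect. This is only a rough sketch, not how MonAMI does it: the function names are made up for illustration, and the service names and port numbers in the usage comment are placeholders that have to be replaced with whatever the local dCache configuration uses.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host, services):
    """Map each service name to True/False depending on whether its port answers.

    `services` is a dict of {name: port}; names and ports are supplied by the
    caller and are assumptions, not values taken from MonAMI or dCache.
    """
    return {name: is_listening(host, port) for name, port in services.items()}


# Example call (placeholder host and ports, adjust to the local setup):
# check_services("srm.example.org", {"srm": 8443, "dcap": 22125})
```

A connect test only shows that something accepts connections on the port; it says nothing about whether the service behind it is healthy.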
>
>
> On the pool/door nodes I am looking at the number of TCP connections in
> each of the CLOSE_WAIT, CONNECTING, DISCONNECTING and ESTABLISHED states,
> as well as checking whether the gridftp and gsidcap processes are listening
> on the relevant ports. If you want to see what it looks like, have a look
> at the plots here:
>
> http://mon.epcc.ed.ac.uk/ganglia/?r=hour&c=ScotGrid-Edinburgh&h=pool1.epcc.ed.ac.uk
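
For anyone wanting to reproduce the per-state connection counts outside MonAMI, here is a minimal sketch that tallies TCP states from text in the Linux /proc/net/tcp format. It is an illustration under assumptions, not the plugin's code: the function name and the sample data are invented, and it reports the raw kernel state names. How those map onto the CONNECTING and DISCONNECTING groups mentioned above is not stated in the message, so no grouping is attempted.

```python
from collections import Counter

# Kernel TCP state codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count connections per TCP state from /proc/net/tcp-formatted text."""
    counts = Counter()
    # The first line is the column header, so skip it.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated lines
        # fields: slot, local address, remote address, state, ...
        counts[TCP_STATES.get(fields[3].upper(), "UNKNOWN")] += 1
    return counts


# Invented sample data: one listener, one established, two CLOSE_WAIT.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:0AFB 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 1001
   1: 0100007F:0AFB 0100007F:C350 01 00000000:00000000 00:00000000 00000000     0        0 1002
   2: 0100007F:0AFB 0100007F:C351 08 00000000:00000000 00:00000000 00000000     0        0 1003
   3: 0100007F:0AFB 0100007F:C352 08 00000000:00000000 00:00000000 00000000     0        0 1004
"""

sample_counts = count_tcp_states(SAMPLE)
```

On a live Linux node the same function can be fed the contents of /proc/net/tcp (and /proc/net/tcp6 for IPv6 sockets).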
>
> More work still has to be done to extend the monitoring out to more
> targets, but this is a start. I would like to use MonAMI to report the
> status of the existing targets to Nagios, so that alerts can be raised
> when certain thresholds are crossed (e.g. when the number of CLOSE_WAIT
> connections gets too high, due to the bug in dCache).
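
The threshold idea can be sketched as a stand-alone check that follows the usual Nagios plugin convention of exit status 0 for OK, 1 for WARNING and 2 for CRITICAL. This is not the MonAMI-to-Nagios reporting path, which is not described in the message. The function name and the default thresholds of 50 and 200 are hypothetical and would need tuning for a real pool node.

```python
# Nagios plugin exit statuses.
OK, WARNING, CRITICAL = 0, 1, 2


def check_close_wait(count, warn=50, crit=200):
    """Turn a CLOSE_WAIT connection count into a (status, message) pair.

    `warn` and `crit` are illustrative defaults, not recommended values.
    """
    if count >= crit:
        return CRITICAL, f"CRITICAL - {count} connections in CLOSE_WAIT"
    if count >= warn:
        return WARNING, f"WARNING - {count} connections in CLOSE_WAIT"
    return OK, f"OK - {count} connections in CLOSE_WAIT"


# A wrapper script would print the message and exit with the status, e.g.:
#   status, message = check_close_wait(current_count)
#   print(message)
#   raise SystemExit(status)
```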
>
> You can contact Paul Millar or me if you want more information. I will
> update the wiki entry with this new information.
>
> https://www.gridpp.ac.uk/wiki/MonAMI_dCache_plugin
>
> Cheers,
> Greig
>
>
>
> --
> ========================================================================
> Dr Greig A Cowan http://www.ph.ed.ac.uk/~gcowan1
> School of Physics, University of Edinburgh, James Clerk Maxwell Building
>
> TIER-2 STORAGE SUPPORT PAGES: http://wiki.gridpp.ac.uk/wiki/Grid_Storage
> ========================================================================