On Wed, 8 Dec 2004 16:54:03 +0000, Adrian Midgley <[log in to unmask]>
wrote:
>There is a simple point here, with an odd effort to submerge it coming in
>again from Laurie. I'll state it here and then go into some detail below
>
>A machine that is restarted every two days is a machine whose owner
>doesn't believe it can run for 4 days.
>
>Some of us think that well-engineered systems run longer than 4 days.
Yes Midge, we got your message (and your opinion) the first time. But
repeatedly stating it does not make it right, nor make us any more likely
to accept it. Terry and I have explained why your view is not shared by
everyone. I would prefer to go along with the opinion of an IT
professional who is paid to do his job and therefore paid to get it right.
You have every right to disagree, but no right to expect everyone else to
agree.
Laurie Miles
(with another reply buried below in the message)
>
>On Wednesday 08 December 2004 15:46, Laurie Miles wrote:
>
>> Memory leaks are a fact of life in *every* operating system - the
>> evidence is out there if you look for it.
>Didn't need to, learned that years ago.
>That would be why every operating system I have has a way of observing
>memory use...
>Going a bit up market, most have a heap-walker, a way of looking for
>fragmentation in the "heap" of memory that is caused by allocated
>(malloc() or whatever) memory that an application does not release. But
>that is a bit technical.
>
>> Accumulated memory leaks need the PC rebooting.
>Not just PCs. Although mainframes (operating systems from the 1970s and
>onward) seem to have different approaches to managing such things.
>
>> So all PCs running whatever operating system will require
>> rebooting every so often to stop problems occurring.
>
>Actually, they will _require_ rebooting when the unreleased memory
>reaches a level that risks impairing their working.
They require rebooting well before this if you don't want to have an
unstable system (unless you want to boast about how long your system has
been "up" to other geeks :-))
>So a good approach to it is to observe, measure, and reboot when
>necessary.
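"Observe, measure, and reboot when necessary" amounts to a threshold
check. A hedged sketch, assuming a Unix host; the 512 MiB limit and the
function name are invented for illustration, a real figure would come from
observing the server over time:

```python
import resource

# illustrative threshold, not a recommendation
RSS_LIMIT_MIB = 512

def needs_restart(limit_mib=RSS_LIMIT_MIB):
    # ru_maxrss is reported in KiB on Linux (bytes on macOS); Linux assumed
    rss_kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return rss_kib / 1024.0 > limit_mib
```

A cron job calling something like this could trigger a reboot only when
measurement says one is needed, rather than on an arbitrary clock.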
>
>An arbitrary rebooting per time period is another approach, but the
>question of how to decide the time period then gets interesting.
>
>One of the benefits of COTS, and more widely of standard setups such as
>LAMP or presumably Exchange on NT, is that there will be a manufacturer
>running an instance to predict problems that are generic - in the same
>way that commercial aircraft have an instance that has always had more
>bending and banging than any that are actually flying with passengers in,
>and if its tail falls off all the others get checked/grounded until the
>problem is regarded as contained.
>
>For a server which had been observed to run for 327 days satisfactorily,
>it would seem perverse to kick it over every second day.
>Running it to 326 days would argue a great faith in precision of
>measurement.
>
>Do I hear an objection to 6 months?
Why is this arbitrary figure any better than 2 days? You have given no
indication of why we should accept this figure.
At the end of the day we have an IT professional charged with maintaining
an important server suggesting that he prefers to reboot the dual server
alternately every 2 days. And we have a GP with an interest in IT
maintaining that servers don't need to be rebooted. Who would you believe?
This discussion has convinced me that a scheduled reboot may be a good
idea for our practice server, probably on a monthly interval after a full
backup.
>There are problems with isolated singular instances, and no two big
>databases are alike, but if there is a big difference between
>appraisals.nhs.uk (or the separate machine running MS SQL Server, which I
>think also gets kicked over regularly by the clock and is the actual
>database) and anyone else's web-serving database then it is going to be
>due to application code in whatever is used to automate SQL Server and
>IIS to produce web pages.
>
>ASP perhaps, or VBA.
>
>
>> Midge - prophylactic
>> action to prevent problems is surely better than boasting "my server
>> has more up time than yours" and waiting for it to fail. Your beloved
>> UNIX has OS memory leaks, and your server will definitely run better if
>> rebooted every so often. How often I cannot say - that would require an
>> expert opinion or software monitoring it (the first hit in Google I
>> came up with was a software firm selling a program to monitor memory
>> leaks in UNIX).
>>
>> Laurie Miles
>>
>> On Wed, 8 Dec 2004 15:26:02 +0000, Adrian Midgley <[log in to unmask]>
>>
>> wrote:
>> >On Wednesday 08 December 2004 14:42, Terry Brown wrote:
>> >
>> > Why not? :)
>> >
>> >The assumption most of us would make is that a device that is routinely
>> >restarted after 2 days is incapable of reliable operation beyond 4
>> >days.
>> >
>> >Consider aeroplanes for instance.
>> >
>> >> I'm sure the behemoths of the world of open source will tout uptime
>> >> as a fundamental measure of greatness, and it's certainly a nice
>> >> statistic to look at, but we have a maintenance schedule that doesn't
>> >> deliver any downtime to the user and ensures that any software on
>> >> there that in any way degrades the performance of the machine (and
>> >> let's face it, even non-MS software companies can release software
>> >> with bugs/memory leaks in it) is cleared out.
>> >
>> >So, having rebooted frequently, you'd be able to say that _in the
>> >range of operating conditions so far encountered_, nothing seems to
>> >leak so much as to impair the system in two days.
>> >
>> >A memory leak (I don't know a lot about them, but I have met them)
>> >would be either from one application, or from the operating system,
>> >and would tend to stack up unreleased memory as a result of
>> >_particular operations_.
>> >
>> >Those operations may occur infrequently, or frequently, and may be
>> >variable according to external events which the operator of the
>> >machine is not in control of (nothing sinister, just when a particular
>> >thing is asked for a megabyte of memory goes AWOL, but that only
>> >happens when someone gets interested in, I don't know, controlled drug
>> >registers - which happens perhaps shortly before the end of the year).
>> >
>> >If the leak is in an application, restarting the application should
>> >release all memory and restore it to (I love that word) "freshness".
>> >Quickly.
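The application-restart point above is worth making concrete: when the
leaky code runs in its own process, stopping and restarting that process
hands every leaked byte back to the OS with no reboot involved. A toy
Python illustration (the 50 MiB hoard and the helper names are invented):

```python
import subprocess
import sys

# stand-in for a leaky application: hoard ~50 MiB and exit without freeing it
LEAKY_SNIPPET = "hoard = [bytearray(1024 * 1024) for _ in range(50)]"

def run_fresh():
    # run the leaky work in a short-lived child interpreter; when the child
    # exits, the OS reclaims all of its memory, leaked or not
    result = subprocess.run([sys.executable, "-c", LEAKY_SNIPPET])
    return result.returncode

if __name__ == "__main__":
    print(run_fresh())
```

Each call starts "fresh" regardless of what the previous run leaked, which
is exactly why an application-level leak needs an application restart, not
an OS reboot.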
>> >
>> >So we are looking at leaks in the (bundle of stuff (IE, admin etc
>> >etc) presented as being the) OS here, if we are serious.
>> >
>> >So the statement being made seems to me to be that the operator of a
>> >significant bit of kit does not trust the operating system to run
>> >continuously for 4 days.
>> >
>> >What bothers me (not a lot personally, just in theory) is that if
>> >memory leaks are large but rare then this policy does absolutely
>> >nothing to prevent them overwhelming the machine, while IMHO adding to
>> >a mindset that reckons as long as the aeroplane's engine will run for
>> >long enough to use all the fuel you can get in the tanks, there is no
>> >need to try to make it more precise, tough, reliable, powerful,
>> >efficient or otherwise better.
>> >
>> >I suspect that Debian Woody does not leak memory.
>> >I also suspect that Emacs doesn't.
>> >And in neither case is it because the maintainers were happy to turn
>> >it off every 48 hours.
>> >
>> >I suspect that Opera _does_ leak memory under some conditions - I
>> >notice that after a couple of weeks of use it seems to be using more
>> >memory than I expect, and I restart it. (On Linux, Opera is currently
>> >my favourite browser.) I expect that successive versions will have old
>> >leaks fixed.
>> >
>> >> The precautionary measure isn't in place because of Microsoft, it's
>> >
>> >Oh...
>> >
>> >> there to ensure that the machine is as fresh as possible as often as
>> >> possible, and we'd do the same if the system was running under
>> >> Solaris or a flavour of Linux.
>> >
>> >_You_ would...
>> >
>> >> Service delivery isn't just about uptime, we
>> >> like our users to be able to actually use the software when they're
>> >> there, and if the software is running sluggishly for whatever reason,
>> >> then they're not getting what we claim to be delivering.
>> >
>> >I suppose the other approach would be to pick up when software runs
>> >sluggishly, and fix it/have it fixed.
>> >After a while one would be moving on to weekly and then fortnightly
>> >reboots, then monthly, seasonal and annual.
>> >
>> >And I think it would not be hard to represent that as being progress,
>> >improvement in reliability and that elusive "quality", and all brought
>> >about through the (moderate) efforts of the operators.
>> >
>> >> I'm sure the retaliation could be that this is a Microsoft problem,
>> >> and that no other software has memory leaks / bugs / errors on
>> >> launch, but it's simply not the case. Regular rebooting of a machine
>> >> is something that I've always felt was a useful thing to do
>> >> (providing it doesn't detract from service delivery), and have done
>> >> it since my days administering Solaris 2.4 machines.
>> >
>> >--
>> >Dr Adrian Midgley GP Exeter www.defoam.net
>> >Open Source is a necessary but not of itself sufficient condition.
>
>--
>Dr Adrian Midgley GP Exeter www.defoam.net
>Open Source is a necessary but not of itself sufficient condition.