Allan Reese (Cefas) wrote (07 February 2008) to allstat:
I've been looking at a graph in Nature, where values are "expressed as percentage departures from norm values for 1950-2000."
My question is whether you read anything into a "norm" that is different from an "average".
The dictionary to hand (Collins) gives two usages, but my own reading is that a norm has implications of being a standard, an "expectation" in the sense of desire or requirement rather than just a measure of location.
----
The reference is Nature 451, 557-560 (31 January 2008)
Large contribution of sea surface warming to recent increase in Atlantic hurricane activity
Mark A. Saunders & Adam S. Lea
Figure in question viewable at http://www.nature.com/nature/journal/v451/n7178/fig_tab/nature06422_F1.html#figure-title
----
Summary
Respondents all agreed the usage was unclear or unhelpful. A clearer caption would have been "percentage differences from 1950-2000 averages", and I suspect the authors tripped themselves up because they had "normalized" each series to percentages to fit one scale. Regarding the message, I read a subtext into the title "recent increase in hurricane activity" and the use of "norm": together they imply a system going out of control. The text notes that offshore tropical storms may have been missed in the 1950s (pre-satellite surveillance), which corresponds to the lower early deviations in (a) compared with (b).
Another odd feature of this graph is the time-axis labelling. It's formally correct but extremely misleading! Labels of the form 1950-1959 usually imply data plotted by decades, but these are annual points that are running means. I would prefer an axis labelled to show the extent of data (1950-2005) with the plotted line spanning the years allocated to each running mean. This is usually the mid-point, but here I'd plot against the last year in each group as only lag effects have physical meaning. Same graph, but I wouldn't have to *deduce* from the text that they have data from 1950 not 1945. They could also clarify that these are annual data and the line doesn't have any interpolated meaning, by plotting dots and making the lines fainter. Simplifying the graph raises a question of why the authors used 1950-2000 as their "norm" rather than the mean of the entire series.
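For anyone who wants to try the relabelling, here is a minimal sketch of the two calculations discussed: percentage departures from a 1950-2000 average, and running means plotted against the last year of each window. The counts are made up for illustration and are not the paper's data. The 10-year window is an assumption taken from the form of the axis labels, and applying the running mean after converting to percentages is also an assumption about the order of operations. The function names are mine.

```python
def percent_departure(values, years, base_start=1950, base_end=2000):
    """Express each value as a percentage departure from the mean
    of the values whose year falls in base_start..base_end."""
    base = [v for y, v in zip(years, values) if base_start <= y <= base_end]
    norm = sum(base) / len(base)  # the "norm" read as a plain arithmetic mean
    return [100.0 * (v - norm) / norm for v in values]


def trailing_running_mean(values, years, window=10):
    """Running mean over `window` consecutive years, labelled by the
    LAST year of each window rather than the mid-point."""
    out = []
    for i in range(window - 1, len(values)):
        chunk = values[i - window + 1 : i + 1]
        out.append((years[i], sum(chunk) / window))
    return out


# Hypothetical annual counts for 1950-2005 (illustration only).
years = list(range(1950, 2006))
counts = [8 + (y % 4) for y in years]

pct = percent_departure(counts, years)
smoothed = trailing_running_mean(pct, years, window=10)

# The first plotted point sits at the last year of the first window,
# so the axis can run over the actual extent of the data.
first_year, last_year = smoothed[0][0], smoothed[-1][0]
```

Plotting `smoothed` as dots against an axis running from 1950 to 2005 makes it plain that the data start in 1950 and that each point summarises the window ending in that year.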
Some of you will by now have linked my interest in this graph to correspondence in RSS News about forecasting the annual level of hurricanes.
Individual responses were ...
From: [log in to unmask]:
Personally, I would associate the norm with typical or most common
values which immediately suggests the mode is the nearest measure.
From: Wells, Julian [[log in to unmask]]:
Perhaps this is an outcome of being a non-statistician by training, but
one of the things that infuriates me is the usage you draw attention to here.
If people mean "arithmetical mean" why not just say so? Doing otherwise
is either pretentious or tendentious, or both.
The prime offender is (I believe) Galton, who persuaded the world that
the Gaussian distribution was "normal", with the result that into the
20th century discussion of skewed distributions could be regarded as a
purely intellectual curiosity.
From: [log in to unmask]:
With only the information in your email, and with a sceptical view of the level
of numeracy of many scientists, I strongly suspect that you're right, and that
the author just means some sort of "average".
In principle, I would be prepared to allow "norm" to be interpreted as "normal
range of values". So, in a medical context, one might find it useful to know
how much (as a percentage) somebody's value was above the upper end of the
normal range (thought of as an upper 95% tolerance limit, say,
though that's not a term they would use).
But if the author were using that interpretation, then I would guess that several
values would be zero, lying inside the normal range - so you would have noticed.
Hence my first paragraph!
From: John Whittington [[log in to unmask]]:
For the reasons you give, I think it is clear that the term should not be used in that
context, because of its ambiguity. As you say, at least one definition of the word has
some connotations which go beyond a simple measure of location.
In practice, I would imagine that (in such contexts) it is generally meant to simply
indicate 'average' - probably most commonly the mode, rather than mean, median or any
other measure of central location.
From: [log in to unmask]:
I sympathise with your puzzlement. And I can't offer any elucidations:
I'm as perplexed as you are!
My reaction is that if you can't make out from the article what
the authors intend "norm" to precisely mean (especially when
referring to a baseline for data), then there have been defects
in authoring, refereeing and editing.
Without reading the article I couldn't comment further (and
possibly not even then).