 Email discussion lists for the UK Education and Research communities  ## allstat@JISCMAIL.AC.UK


Subject: SUMMARY: A-level statistics question

From:  Date: Tue, 22 Jan 2008 15:45:34 +0000

```
As promised, I have collated the responses to my email of last week on the Statistics A-level exam question. Further down is a summary, followed by all the responses (so it is rather a long email). Over 20 people responded on allstat, and I had verbal responses from a further 5 or so.

I do not have the "model solutions", but I am sure that fewer than half of the respondents gave the right answer. Perversely, I think the "experts" who "knew" what "frequency density" means were less likely to get the answer correct than those who started from first principles. However, there was nearly unanimous agreement that this was a poorly set question (though for a variety of reasons).

I am sure the term "frequency density" did not exist when I was a student.... We had "histograms", which either had equal-width classes (in which case it was acceptable to use "frequency" on the y axis) or could have unequal classes (in which case the y axis had to be labelled "density", always understood by reference to the units of measurement given on the x axis, so that the total area was one). Incoming A-level students educated me in the use of "frequency density", which was defined so that the total area under the histogram was equal to the sample size - again I understood this area in terms of the units given on the x axis. My belief is that many A-level students were taught this way (and it seems to be prevalent in textbooks), and so could be "thrown" by this question. Note that there are no units of measurement on the x axis, but if times are given to the nearest minute, does that imply that the unit of measurement is 30 seconds?

However, if I accept this as a valid question then I am left with the following questions:

- are "frequency density" or "density" (with no units specified) acceptable labels on a graph?
- if no units are given, is there any information in the scale, or do we ALWAYS have to check that the histogram integrates to the right thing?

It seems to me that if this question is valid, then frequency density is equivalent to "scaled density", as the units can be completely arbitrary. In this case, it is merely a convenient measure for integer arithmetic.

I realize that I may have started another discussion here for which I may get into trouble with Allstat. I am happy to receive answers, but make no promise to summarize again!

Charles

Contents of the rest
------------------------
0. Original post
1. A summary of the answers
2. A summary of the reasons given
3. A summary of the comments
Appendix A: (Almost) verbatim responses
Appendix B: Previous Discussion of Frequency Density on Allstat (Oct 2007)

###################################################################
0. Original post
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

To understand this email requires reading the following link

www.maths.leeds.ac.uk/~charles/S1Jan08Q3.pdf

which is a scanned page from yesterday's maths A-level paper (Statistics 1). (allstat does not allow attachments). I am sending this as a follow-up to the discussion on "frequency density" which appeared a few months ago on allstat, but anyone who teaches first year students should find it of interest. If you have any comments, or would like to submit your answer, please send it to me and I will collate and circulate.

However, for starters I offer the following comment from an allstat colleague:

     "Placing trick questions in an exam is inexcusable, and the
      examiner and scrutineer (person who checked the paper) should
      be sacked."

###################################################################
1. A summary of the initial answers (freq in brackets)
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

No answer given (6)
One (1)
6 (7)
12 (8)
21.8 (1)
24 (2)

###################################################################
2. A summary of the reasons given
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

(i) Some people "knew" (or looked up on the web) the definition of frequency density as given by frequency = frequency density * width (which gives 6).

(ii) Some found the area under the graph to be 70 and so doubled the above answer (to get 12). Many of these people were (I think) unfamiliar with the term "frequency density", and so worked out the answer from first principles.

###################################################################
3. A summary of the comments
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Nearly all respondents thought the question was a poor one - for reasons which included:

- Labelling of BOTH axes
- The jagged x-axis (which some thought could mean missing histogram bars)
- Terminology of "to the nearest minute" - why state this?
- Why use unequal class widths for these data?
- Given that frequencies are used to create a histogram, this would be the natural way to retrieve them
- A histogram which does not integrate to 1 has a "strange measure"
- This style of question will put people off statistics
- Perverse to use 30 seconds for a UNIT of time, rather than a minute
- The term "frequency density" was unfamiliar to many

###################################################################
Appendix A: (Almost) verbatim responses
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

From [log in to unmask] Thu Jan 17 14:01:54 2008

Yes, I agree that "frequency density" is an unusual term - should be frequency only (or probability density for a curve). Using the standard formula frequency = frequency/width would indeed give 6. But I would still say that one should check/work it out to see whether, given the total N of 140, the answer makes sense. Clearly the frequency presented is ½ frequency (maybe this is why the term frequency density - i.e. not actual frequency just a representation of it).
Silly and can't see the point of presenting the data this way but one should always check. It is a 5 mark question though, so perhaps those who answered 6 got ½ marks!!!

>>> Charles Taylor 17/01/2008 13:50 >>>
Unfortunately, almost any textbook will say that a "frequency density" scale is such that the area in each bar is the frequency. That is, the frequency density = frequency/width. Using this standard formula gives the answer 6. I have to say that most of my colleagues (all statisticians by profession) have never heard of the term "frequency density" - one even called it an oxymoron. Those that look it up in a book give the answer 6 (and conclude that there is missing data cut off on the left of the plot, or that the 140 runners is a mistake), those that "work it out" using the information given, give the answer as 12. But the real question is: "what is a frequency density" - we are told that times were recorded to the nearest minute, so surely it is frequency per minute on the scale rather than frequency per 30 seconds? An alternative suggestion is that it was a three-legged race!

On Thu, 17 Jan 2008, Helen Doll wrote:

> I have recently been helping my daughter with her maths A level S1 module (taken this week).
>
> I actually don't think this is a trick question at all. The answer is clearly 12. If the student answers 6 then they have not used all the information they are given and they have not shown that they understand the fundamental nature of a histogram. They are told that the total number of individuals is 140. If the frequencies are totted up (taking the smallest width as 1) then you get an answer of 70. Clearly then you need to double the answer from 6 to 12. This took me a couple of minutes to do.
>
> I would say this is an entirely fair question! Sorry!
>

###################################################################

From [log in to unmask] Thu Jan 17 13:20:22 2008

Charles,

I recall the discussion - mainly that it was voluminous and so do not recall if it reached a conclusion! Part of it I do recall was about the ability of the statistics 'profession' to communicate informatively with the general public. In part this question reflects its inability to do that and thereby generates employment for statistics teachers and an argument as to why dealing with that subject's conventions needs to be included in a general curriculum so that all will understand!

I eventually sussed the answer to be 24 reinforced by the evidence that the area under the 'curve' adds up to 70, a simple relationship to the total of 140 runners!

The weak communication lies principally in the labelling of the vertical axis, which I find is a fundamental conceptual weakness of the output of our current education system! Frequency density is found in textbooks because it is a generic concept and textbooks try to cover many situations. For example, in a different context, were the axis labelled 'Speed' this would be uninformative since the reader would not know whether the scale read kilometres per second or miles per hour.

Given the application of this question, the vertical axis should read something like 'Arrival rate per quarter-minute'. I suggest that the level of communication is thereby so improved as to make the question almost a matter of common sense i.e. that over a particular period of 12 minutes people finished the race at the equivalent rate of 0.5 people per quarter-minute. The diagram records the differing arrival RATES across race 'duration', itself a better word in this application than 'time'.

The question as written merely panders to bogus obscurity.

Regards,

###################################################################

From: Robert Newcombe <[log in to unmask]>

Pretty appalling really. Not a trick question, simply a bad one. A good exam question should either demonstrate good practice or expect the candidate to critique bad practice. Here the candidate is expected to draw an inference from a badly-constructed diagram. Well, I suppose that's a real-life skill we all need, and which would be hard to examine in any other way - but surely this isn't what an A level exam is about!

###################################################################

From [log in to unmask] Thu Jan 17 14:00:05 2008

if 12*0.5=6 runners is a wrong answer, then i blame my teacher, thats what we were taught. height * width.

###################################################################

From [log in to unmask] Thu Jan 17 14:05:34 2008

Hi Charles and Graham,

I'm not meant to respond to mailing lists, newsgroups etc. from work, so I hope you'll excuse this personal e-mail, rather than to the group (which IIRC is not intended for discussion anyway).

I couldn't see where you got an answer 6. I reasoned thus:

In a frequency density histogram the (frequency) count represented by a bar is the area of the bar, not its height [i.e. area of that bar = count of runners taking {78.5:90.5}]

Frequency density = count/range
  bar height = count/ width
  I measure the height as approximately 0.5
  width = 90.5-78.5 = 2 (minutes)
  area = 2 x 0.5 = 1
  (only) correct answer = 1

The units of frequency density cannot be other than count/x-axis-units; the x-axis-units are clearly (albeit implicitly) minutes; the frequency density units are count/minute - I see no need to specify them.

I agree that a histogram with bars of unequal width is unusual (and potentially misleading) but I don't see this question as ambiguous.
It would be unfair if such unusual histograms had not been covered in the course, but (assuming they had been) I see this as a penetrating question testing the interpretation of histograms and their distinction from bar charts.

Or have I got hold of totally the wrong end of the stick (again!)?

Best regards,

Keith Jewell
mailto:[log in to unmask]
telephone (direct) +44 (0)1386 842055

------------------------------------------------------------
From: "Graham Upton" <[log in to unmask]>
To: <[log in to unmask]>
Subject: Re: A-level statistics question
Date: 17 January 2008 11:35

I agree that this is an unfair question. It would be acceptable (I suppose) if frequency density was spelt out as individuals 0.5 minute range --- but since the times were measured to the nearest minute, this would be a very curious choice. I hope that students who reply 6 will get full marks!

I was mortified to find that in my own A level books I too have written frequency density without any units --- so I cannot be too critical!

g

###################################################################

From [log in to unmask] Thu Jan 17 11:59:29 2008

Hi Charles,

Ok was a wee bit tricky. I remembered insisting on the density analogy of frequency charts when ones wants to get them with varying category ranges when teaching (to undergraduate not at school). The only important thing is that pupils/students are aware that "area under the curve" has to reflect frequency.
And understand what it means! In courses I taught (in France), we would never have pictured frequency but instead a density (in order to make the full area under the curve equal to 1 and not 70 like here if you assume no unit on the Y-axis).

I am curious: any other way to get "the" result (12; Hal 9000 of a space odyssey wouldn't have liked it) than summing over the area to have the relationship with the total population?

Don't worry, I am sure that examiners will have to accept many different answers given the reasoning is (nearly) consistent! Even a disrespectful answer like "I cannot say, unit is missing and I am too lazy to do YOUR job" should do. But don't say it to the young person who brought you the exercise!!

Cheers.

Matthieu

###################################################################

From [log in to unmask] Thu Jan 17 11:16:59 2008

I think sacking might not be a bad idea, actually.

Nick Longford

###################################################################

From [log in to unmask] Thu Jan 17 09:49:04 2008

My answer is 12 runners!! It is a bit of a shocking question that would trick most students.

###################################################################

From [log in to unmask] Thu Jan 17 10:07:01 2008

Hello Charles

At first glance I said the answer was 6 - the width of the bar from 78.5 to 90.5 is 12, the height is 0.5, and 12x0.5=6.

I felt that five marks was a little generous for such a small amount of work, and on closer inspection realised that the area of all the bars was 70.

Therefore, my answer would be '12 runners took between 78.5 and 90.5 minutes to complete the fun run'.

Personally I think this is a bit of a cheat. I had always thought that where frequency density is measured on the vertical axis the area of the bar is equal to the frequency.

I hope this is the sort of feedback you were after.

Regards,

Paul Newell
Research Assistant - Applied Statistics
University of Plymouth

###################################################################

From [log in to unmask] Thu Jan 17 09:13:03 2008

Charles, it seems to me whether this is a fair question depends on what they've been taught. Provided they have been taught this way of presenting frequency data, then the question is very straightforward. (I'm assuming that the number is 0.5 times 12, or 6 runners.) Our software for distance sampling (http://www.ruwpa.st-and.ac.uk/distance/) routinely plots histograms in this way, because we can then plot on the fitted curve, to provide a visual assessment of fit.

Steve Buckland

### thread continues ###

From [log in to unmask] Thu Jan 17 10:14:08 2008

Oops - yes, I don't see how this can come to 140 ... I get 71 too. We don't label our y-axis 'frequency density', but your textbook defn doesn't make sense to me. For the histogram shown, this would be 6, which I would take to be the frequency, not the frequency density. But suppose you plot the frequencies - assuming this is 6 for the last bar, and width 12, isn't frequency density 6 DIVIDED BY 12 = 0.5? And isn't this what they have plotted?

Steve

> On a quick sample of colleagues, most have never heard of the term
> "frequency density" (there was a brief discussion on allstat in
> October), but _nearly_ all textbooks uniformly define it such that
> the frequency ###density Woops! ## is simply the area (that is width x height)
> in a histogram bar. I would have no problem with the question if
> the word "frequency" was changed to "scaled", or if the y-axis scale
> was multiplied by 2. One of the respondents said that the answer was
> 6 and noticed that the total area was 71, so they concluded that the 140
> was a simple mistake...
>
> Best wishes,
> Charles
>
>
> On Thu, 17 Jan 2008, Steve Buckland wrote:
>
>> Charles, it seems to me whether this is a fair question depends on
>> what they've been taught. Provided they have been taught this way of
>> presenting frequency data, then the question is very
>> straightforward. (I'm assuming that the number is 0.5 times 12, or 6
>> runners.) Our software for distance sampling
>> (http://www.ruwpa.st-and.ac.uk/distance/) routinely plots histograms
>> in this way, because we can then plot on the fitted curve, to provide
>> a visual assessment of fit.
>>
>> Steve Buckland

ps perhaps we're intended to assume that 69 of the obsns have been truncated at the left end of the distribution - this could explain the lack of a left tail.

###################################################################

From [log in to unmask] Thu Jan 17 10:02:00 2008

Dear Charles,

How are you?

I don't have measuring implements about my person, but my guess is that the intended answer is '12'. (Total area of the histogram blocks looks to be 70 units, so all areas need to be doubled to arrive at frequencies).

The sadness is that in the context of the surreal world which is the statistics part of A-level Maths this really is not a 'trick' question for the candidates at all. When my son took statistics at A-level, I abandoned any attempt to help him with his work. Much of it was very odd when it was not wrong.

Cheers,
Kevin

##### thread continues ###

From [log in to unmask] Thu Jan 17 10:18:24 2008

I've seen worse. I once came across an A-level question which specified a form of probability density function, involving 2 or 3 (cannot remember the detail) unknown constants. These constants had to be determined by forcing continuity (!!!) onto the pdf as well as a requirement that it integrate to one.

The joke was that the resulting 'pdf' went negative !!!! Beat that !!

Best,
Kevin

-----Original Message-----
From: Charles Taylor [mailto:[log in to unmask]]
S

Dear Kevin,

I am fine thanks. I agree that your answer is probably what was intended. However, the responses I get (for the answer) are either 6 or 12. Those people who (think!) they know that "frequency density" means that the area of a bar gives the frequency (as it is defined in nearly all textbooks that use this dreadful term) have immediately responded 6 (as I did when I first did the question). Another person who then bothered to compute the total area (using this definition) got 71 and concluded that there was a simple error in that 140 should have been 71. I have wondered if this "fun run" was actually a three-legged race, but over an hour sounds a bit more like agony than fun!

Best wishes,
Charles

###################################################################

From [log in to unmask] Thu Jan 17 10:35:08 2008

Which exam board? Would this make a TES story? Points to add to what I wrote yesterday could be:

* dividing the time axis into irregular intervals may be a distinguishing feature of histograms, but here it appears to be done only to make the question more difficult

* histograms and other graphs are used to display patterns. If the expected use is to look up actual values, a table should have been used.

On both counts, the example suggests that graph use is being taught and examined inappropriately.

Let me know if you would like to approach the TES yourself, jointly, or not at all. Any chance of getting the marking scheme from the exam board? I could approach them as someone not involved in teaching.

Allan

###################################################################

From [log in to unmask] Thu Jan 17 10:56:25 2008

I did not see the discussion, but the question seems straightforward to me.
Frequency density is the number of observations per unit of the variable, the frequency density = 0.5, the interval is width 12, there are 6 runners. It is an awful histogram, which no sane person would draw, and which standard software such as SPSS or Stata would not do, but I don't see a trick. Please enlighten me.

Martin

#### thread continues ######

From [log in to unmask] Thu Jan 17 14:03:27 2008

I didn't look at it carefully enough, clearly. However, what other definition could there be for frequency density? So what other answer could there be to the question?

I entirely agree that it is a thoroughly bad piece of educational material. I do not think we should ever use invented data. There is so much of the real thing around.

Martin

Charles Taylor wrote:
> The "trick" is that if you use your definition, then there are only 71
> observations (assuming that each bar must give an integer answer, so the
> frequencies are 6, 7, 8, 12, 17, 10, 5, 6) and not the stated 140.
> I note also that you did not (need to) use the information of 140.
>
> On Thu, 17 Jan 2008, Bland, M. wrote:
>
>> I did not see the discussion, but the question seems straightforward
>> to me. Frequency density is the number of observations per unit of
>> the variable, the frequency density = 0.5, the interval is width 12,
>> there are 6 runners. It is an awful histogram, which no sane person
>> would draw, and which standard software such as SPSS or Stata would
>> not do, but I don't see a trick. Please enlighten me.
>>
>> Martin
>>

###################################################################

From [log in to unmask] Wed Jan 16 21:28:05 2008

24?

Good graphics are supposed to convey information with a minimum amount of decoding and effort by the viewer. This is not a good graphic.

It probably doesn't help that I was one of those who was mystified by the term "frequency density".
I've been a student and practitioner of statistics for some 20 years now, but I guess I won't pass my A-levels.

Thanks,
Scott

###################################################################

From [log in to unmask] Wed Jan 16 22:07:07 2008

As I remember it, on a density scale, the area under the histogram over an interval equals the number of cases (frequency) in that interval. The total then should be 140 but I get 71 by my estimation so the graph is incorrectly drawn. Is this a trick question? Depends on whether one feels a statistician should be able to recognize an incorrectly drawn graph.

Paul

Paul R. Swank, Ph.D.
Professor and Director of Research
Children's Learning Institute
University of Texas Health Science Center - Houston

###################################################################

From [log in to unmask] Thu Jan 17 08:42:19 2008

Dear Charles,

I bet you get a lot of responses to your allstat post. Anyway, I think the answer is 12. It's a horrible question: not because it's difficult, just because it's so inane. That's the kind of question that will turn a student off statistics, perhaps forever. I agree with your colleague that those responsible for the question should never be allowed to do it again.

Alex

--
###################################################################

From [log in to unmask] Fri Jan 18 09:29:25 2008

Hi Charles

The width of the right hand bar is 12 and its height appears to be around 0.5 so presumably 12*0.5 = 6 is what the examiners were looking for. It seems to me unfair to use the word "calculate" when one can only estimate the height of the bar.

Best Wishes
Robin

###################################################################

From [log in to unmask] Fri Jan 18 08:36:41 2008

Dr Taylor,

I do not subscribe to ALLSTAT, but came across the question in a cross-posted reply.

I think the answer is ... 21.8

I was expecting an integer.

###################################################################

From: "Allan Reese (Cefas)" <[log in to unmask]>

Bits of my body as well as my mind are boggling. I'm glad you pointed out the columns total to 70ish. In fact, if you assume the column heights are integer or half, they total to 70 but the group 67.5-70.5 total to 16.5, so the frequency density scale is pairs of runners and the correct answer must be 12.

The example appears therefore to show that "frequency density" is not an ignorant oxymoron, but is a deliberate linguistic trap used by those who would lie with statistics (being economical with the truth).

I would strongly mark down that graph for the x-axis labelling as well, since the data are times to the nearest minute. Labelling the half-minutes, especially only alternates, is just perverse. The broken axis is an affectation (as are the arrowheads), as the histogram demonstrates the shape of the distribution, not the relative sizes of the x values. The y axis should be accurately titled, the units given, and scaled better to avoid 30% wasted plotting space.

Placing trick questions in an exam is inexcusable, and the examiner and scrutineer (person who checked the paper) should be sacked.

Will you forward this example to allstat, asking for solutions, and send the results to the exam board concerned? Add my comment above if you wish - I'll stand by it.

Many thanks
Allan

###################################################################

From [log in to unmask] Fri Jan 18 11:36:01 2008

Hi. I think the answer is 12.

I'm not familiar with what the syllabus would expect from the term frequency density - but a student presumably would. It is a bit odd to have to add all the areas up to discover that it only comes to 70, so you need to double the area in question (but there are clues such as some sub areas totalling 4.5 and 16.5 which could alert the student that a factor of 2 is needed).
Also from a pedantic point of view, the question says times were taken to the nearest minute, so the X axis scale is a bit odd.

###################################################################

From [log in to unmask] Sat Jan 19 14:35:05 2008

That's an A level question?

Good grief, it looks to this aged lady like a gcse intermediate question, at highest
Or key stage 3.

I find the level implied by this question far more horrifying than the poor specification of frequency density

Still, they won't need to take their shoes off & use their toes as a counting aid

Best
Diana

###################################################################

From [log in to unmask] Sat Jan 19 20:12:44 2008

In addition, I find the variable width of the bars to be rather disturbing as well. I hope people don't get the idea that this is "OK."

Jay

###################################################################
###################################################################
Appendix B: Previous Discussion of Frequency Density on Allstat (Oct 2007)
###################################################################

From: Sandy MacRae

A few weeks ago I posted the following enquiry to Allstat:

> I have been told that GCSE and A-level statistics examinations require the Y
> axis of a histogram to be labelled "Frequency density", with the appropriate
> units mentioned..
>
> When I asked for examples of this usage in published histograms displaying real
> data I was referred to textbooks for these examinations. Can anyone point me to
> examples from professional practice in any field?

I had a few replies, of which some indicated familiarity with this labelling for histograms. However, none yielded any reference to a published example.

If examinations demand this type of labelling, textbooks will obviously use it and it is appropriate for histograms with variable bin size or vanishingly narrow bin size. But does anyone else use it when reporting data?
I don't need a full reference (though it would be useful) because I am willing to search on the basis of even a vague indication of where to look.

###################################################################

From: "Allan Reese (Cefas)"

Sandy MacRae suggests:

If examinations demand this type of labelling, textbooks will obviously use it and it is appropriate for histograms with variable bin size or vanishingly narrow bin size. But does anyone else use it when reporting data?

However, I had commented to him off-list that I would regard the label "frequency density" as an oxymoron. In terms of usage, Stata histogram command offers options:

"density, fraction, frequency, and percent specify whether you want the histogram scaled to density units, fractional units, frequencies, or percentages. density is the default."

The distinction I would draw is that density units imply the sum of column areas is normalized as 1, while frequency units imply the sum is the number of observations.

Who examines the examiners?

###################################################################

From: "R.Thomas"

Surely the label 'frequency density' is formally correct? The fact that it is not instantly understandable, even to statisticians, is a problem that belongs to the statistics profession as much as to teachers of statistics. Statisticians do not take serious interest in descriptive statistics and have not developed any widely understood vocabulary in the area. If statisticians don't make themselves clear how can they expect others to do so?

What is the user of official statistics, for example, to make of explanations that include phrases like "vanishingly narrow bin size"? Do they use bins in the Office for National Statistics?

What is your average sixth-former supposed to make of "density units imply the sum of column areas is normalized as 1"? For most of the world density does not imply the use of 1 as a denominator. Why should statisticians think differently?
Is it because the statistics profession does not recognise the concept of denominator? And "normalised"???? Is this some mysterious process? Wouldn't saying 'the vertical scale gives percentage/proportion' be more widely understood?

The width of the columns is not relevant to the vertical scale and it would be appropriate that the horizontal scale specifies that column widths are proportional to numbers.

###################################################################

From: "Hooper, Richard"

I don't think that "frequency density" is an oxymoron, since "density" just means "divided by the width of the interval". The kind of density with which statisticians are most familiar is a PROBABILITY DENSITY. A histogram can be viewed as an estimate of the probability density function - in this case the vertical axis should show the RELATIVE FREQUENCY DENSITY (relative frequency is an estimate of probability). When Stata includes the "density" option in its histogram command, this is a short-hand for relative frequency density. FREQUENCY DENSITY is another alternative that can be plotted on the vertical axis of a histogram.

Frequency and relative frequency (not in density form) can only be shown unambiguously on the vertical axis if all the bins have equal width. Of course, since this is overwhelmingly the most common situation we come across, these are what we most commonly see.

###################################################################

From: "R.Thomas"

The problems highlighted in this discussion seem to stem from the use of the word 'bin' in place of column. The subject matter is charts. Use of the word bin seems to stem from the computerisation of charts. It is a bit of computer jargon inappropriately imported into statistics.

As far as charts are concerned 'bin' has no meaning that cannot be better covered by the purely descriptive word column. It makes sense to talk and write about column-width. Bin-width just illustrates that bin is not the right word.
Would it make sense to say 'density means divided by the width of the bin'?

Statisticians should keep in mind that they need to communicate with the public. Yes frequency density can be plotted on the vertical axis. But 'number of occurrences per ...' is much more intelligible.

###################################################################

From: Thomas Chu

Please correct me if I am wrong. I can remember from my GCSE days that the widths of those columns in a histogram are known as 'class intervals'. Nowadays, software calls them 'bins'. I still prefer calling them 'class intervals'.

From "A Concise Course in A level Statistics" by J Crawshaw & J Chambers:

1. In a histogram, rectangles are drawn so that the area of each rectangle is proportional to the frequency.
2. When all the 'class intervals' are of equal width, the frequency can be used for the height of each rectangle.
3. frequency density = frequency / class width

###################################################################
```
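
The three integer answers tallied in the summary (6, 12 and 24) seem to differ only in the time unit each respondent read into the unlabelled "frequency density" scale. The sketch below is an illustration added after the archived message, not part of any response. It uses only figures quoted in the thread (bar height about 0.5, class 78.5 to 90.5, total bar area about 70, 140 runners); the function and variable names are hypothetical, and the heights are respondents' readings of the graph rather than values taken from the exam paper.

```python
from fractions import Fraction

# Figures quoted by respondents in the thread (estimates read off the graph).
BAR_HEIGHT = Fraction(1, 2)   # "frequency density" of the 78.5-90.5 bar
BAR_WIDTH = Fraction(12)      # 90.5 - 78.5, in minutes
TOTAL_AREA = Fraction(70)     # sum of height * width over all bars (some got 71)
TOTAL_RUNNERS = 140           # sample size stated in the question


def runners_in_bar(unit_minutes):
    """Frequency of the last bar if the scale means 'runners per unit_minutes'."""
    return BAR_HEIGHT * BAR_WIDTH / unit_minutes


def implied_total(unit_minutes):
    """Sample size the whole histogram would imply under the same reading."""
    return TOTAL_AREA / unit_minutes


readings = [
    ("per minute", Fraction(1)),
    ("per half-minute", Fraction(1, 2)),
    ("per quarter-minute", Fraction(1, 4)),
]
for label, unit in readings:
    print(f"{label}: bar = {runners_in_bar(unit)}, total = {implied_total(unit)}")
# per minute: bar = 6, total = 70
# per half-minute: bar = 12, total = 140
# per quarter-minute: bar = 24, total = 280

# The unit for which the histogram integrates to the stated sample size.
consistent_unit = TOTAL_AREA / TOTAL_RUNNERS
print(consistent_unit)  # 1/2 (i.e. half a minute)
```

Under these assumptions only the half-minute reading reproduces the stated 140 runners, which matches the reasoning of those who doubled 6 to get 12. The textbook definition (frequency = frequency density x class width, with the x-axis unit taken as one minute) corresponds to the first row.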