JISCMail - PHONET Archives

I concur with Mark's opinion.  In terms of intonation, a male shift of
100 Hz to 150 Hz is much greater than a female shift of 200 Hz to 250
Hz, so it is better to show them using a logarithmic scale, or
semitones, or if you prefer in terms of percentages -- all of these are
exactly equivalent.  If you do that, a shift of 100 Hz to 150 Hz is the
same as a shift of 200 Hz to 300 Hz, and this achieves a kind of
normalization.
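
As a minimal sketch of the arithmetic behind this, assuming the usual definition of 12 semitones per octave (the function names are illustrative and do not come from any particular toolkit):

```python
import math

def semitones_between(f_start_hz, f_end_hz):
    """Size of a pitch shift in semitones (12 semitones = one octave)."""
    return 12.0 * math.log2(f_end_hz / f_start_hz)

def percent_change(f_start_hz, f_end_hz):
    """The same shift expressed as a percentage of the starting frequency."""
    return 100.0 * (f_end_hz - f_start_hz) / f_start_hz

# The three shifts mentioned above.
male = semitones_between(100.0, 150.0)               # about 7.0 semitones
female_small = semitones_between(200.0, 250.0)       # about 3.9 semitones
female_same_ratio = semitones_between(200.0, 300.0)  # same ratio as the male shift

print(f"100 -> 150 Hz: {male:.2f} st, {percent_change(100.0, 150.0):.0f}%")
print(f"200 -> 250 Hz: {female_small:.2f} st, {percent_change(200.0, 250.0):.0f}%")
print(f"200 -> 300 Hz: {female_same_ratio:.2f} st, {percent_change(200.0, 300.0):.0f}%")
```

Both measures depend only on the ratio of the two frequencies, so equal ratios give equal values whatever the starting pitch.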

If, however, you are looking at simultaneous frequencies, such as
formants that combine to create a single vowel, then you should probably
use an auditory scale, such as the Bark scale or ERB-rate scale.  These
are based on factors such as masking and auditory separation of
simultaneous components of the speech signal.
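
For illustration only, here is a sketch of two commonly cited closed-form approximations: Traunmueller's (1990) formula for the Bark scale and Glasberg and Moore's (1990) formula for the ERB-rate scale. Other published approximations exist, and the choice of these two, the function names, and the formant values are assumptions made for the example.

```python
import math

def hz_to_bark(f_hz):
    """Bark value, using Traunmueller's (1990) approximation."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

def hz_to_erb_rate(f_hz):
    """ERB-rate value, using Glasberg and Moore's (1990) approximation."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)

# Hypothetical formant values for one vowel, in Hz (illustrative only).
formants_hz = {"F1": 500.0, "F2": 1500.0, "F3": 2500.0}

for name, f_hz in formants_hz.items():
    print(f"{name}: {f_hz:.0f} Hz = {hz_to_bark(f_hz):.2f} Bark"
          f" = {hz_to_erb_rate(f_hz):.2f} ERB-rate units")
```

On either scale, a fixed step in Hz counts for less as frequency rises, so the F2 to F3 gap comes out smaller than the F1 to F2 gap even though both are 1000 Hz here.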

So: for intonation, use a log scale; but for representing vowels, use a
Bark scale.

And when reporting values, you can probably continue to use Hertz, as
that is what everyone is familiar with.  So: in reporting values, use
Hz; but in representing values, or when analysing them, use one of the
others.

David.

(PS.  Of course, singing is on a logarithmic scale.  And harmony
probably is as well, even though it involves simultaneous sounds.  But
Jane, I'm not going to attempt to teach you about singing!  It's an
interesting thought though, if sounds that fuse into the perception of a
single vowel are represented on a Bark scale, but sounds that fuse into
a single harmony are logarithmic.  But in fact the physics of that
conclusion is fairly straightforward.)

-----Original Message-----
From: Teaching of phonetics mailing list [mailto:[log in to unmask]]
On Behalf Of Mark Huckvale
Sent: Wednesday, May 03, 2006 8:48 PM
To: [log in to unmask]
Subject: Re: Hz and Semitones

Jane

For laryngographic analysis, we always plot our F0 distributions against
log(F0) so that it is easier to compare the ranges of speakers that have
markedly different modal F0.  So, for example, a man with a range from 
100-200Hz has the same range as a woman from 150-300Hz.  I guess you 
could call this 'normalisation of range', but it does not of course 
normalise the modal value of F0.  Semitones seem a pretty reasonable 
unit to use for measuring modal F0 and range since they are units that 
are well defined, have some kind of relationship to perception, and 
certainly in my experience lead to distributions which have a reasonably
normal shape (at least for regular voice).
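
A small sketch of the example above, assuming semitones are measured relative to an arbitrary 100 Hz reference (the reference value and the names are illustrative, not part of any standard procedure):

```python
import math

REFERENCE_HZ = 100.0  # arbitrary reference chosen for this illustration

def hz_to_semitones(f0_hz, reference_hz=REFERENCE_HZ):
    """F0 in semitones relative to reference_hz."""
    return 12.0 * math.log2(f0_hz / reference_hz)

# Range limits from the example: a man at 100-200 Hz, a woman at 150-300 Hz.
man_low, man_high = hz_to_semitones(100.0), hz_to_semitones(200.0)
woman_low, woman_high = hz_to_semitones(150.0), hz_to_semitones(300.0)

man_range = man_high - man_low        # one octave, i.e. 12 semitones
woman_range = woman_high - woman_low  # also one octave
offset = woman_low - man_low          # difference in register, about 7 semitones

print(f"man:   {man_low:.2f} to {man_high:.2f} st (range {man_range:.2f} st)")
print(f"woman: {woman_low:.2f} to {woman_high:.2f} st (range {woman_range:.2f} st)")
print(f"offset between the two speakers: {offset:.2f} st")
```

The two ranges come out equal, while the offset between the speakers remains, which is the sense in which the range is normalised but the modal value is not.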

Mark

Jane Setter wrote:
> Dear All
>
> I feel on relatively safe ground when it comes to looking at features 
> of speech in terms of Hz, but recently it has been recommended to us 
> that we convert the Hz into semitones, or look specifically at 
> semitones, in order to carry out normalisation of data.
>
> Can anyone advise us what is to be achieved by using semitones rather 
> than Hz, and / or suggest a means of normalising data? We are looking 
> at F0 in various populations of children (with all the problems that 
> brings), and comparing mean F0 and pitch range.
>
> Many thanks
>
> Jane Setter
> [log in to unmask]

-- 
Mark Huckvale, Director MSc Speech and Hearing Science
Phonetics and Linguistics, University College London
www.phon.ucl.ac.uk