That was really helpful information. It is the University of Delaware research
project that I had heard about. We would be really interested in looking into
voice banking, as we have been asked about this by a number of clients.
Deborah Jans
Coordinator
KEYCOMM
Lothian Communication Technology Service
1c Pennywell Road
Edinburgh EH4 4PH
0131-311-7130 telephone
0131-332-6871 fax
Please note this is our new address and phone number as of 1 January 2007
> ----------
> From: A discussion list for Assistive Technology professionals. on
> behalf of S Creer
> Reply To: A discussion list for Assistive Technology professionals.
> Sent: Friday, November 9, 2007 10:12 am
> To: [log in to unmask]
> Subject: Re: Recording speech
>
> Hello,
> I am a PhD student at the University of Sheffield, currently researching ways
> for people to bank their voices and to personalise speech synthesisers once
> their voice has started to deteriorate.
>
>
> As suggested in a previous reply, I am using a technique, similar to that
> used in speech recognition, in which a synthetic voice can be adapted towards
> a person's own voice using a limited amount of data. It is still early in its
> development, however, and has not yet been made publicly available. I am
> starting to use it with standard speech, to see whether it could also be used
> with dysarthric speech and still produce acceptable output. It remains a
> research tool and is not yet suitable for use with any communication aids,
> particularly given its speed of synthesis, but it could hopefully become an
> option for voice banking in the future.
>
> A group at the duPont Hospital and the University of Delaware have been
> looking into this and have software that you can download to try to build
> your own synthetic voice:
> http://www.asel.udel.edu/speech/Users-participation.html
> I haven't tried it myself, and have only found this information on the
> internet, so I don't know exactly what quality you could end up with. It uses
> concatenative synthesis, in which a large database of recordings is chopped
> up into smaller units, such as individual sounds, which are then recombined
> to make new utterances. The tool aims to minimise the amount of recording
> needed to build a voice: it ensures that the recordings cover all the sounds
> in the language, and that they are consistent enough with each other that,
> when concatenated, there is less distortion between the units. However, the
> recordings they collect are likely to be usable ONLY with their own speech
> synthesiser, so this may not be the best way to store a voice long term.
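To make the idea of concatenative synthesis above concrete, here is a toy
sketch (not the Delaware tool, and the unit database and sample values are
invented for illustration): recordings are chopped into diphone units, and a
new utterance is built by joining stored units back to back.

```python
# Toy illustration of concatenative synthesis. A real system stores audio
# waveforms; here each "recording" is just a short list of sample values.
# Hypothetical unit database: diphone label -> stored samples.
unit_db = {
    ("h", "e"): [0.1, 0.3, 0.2],
    ("e", "l"): [0.2, 0.4],
    ("l", "o"): [0.3, 0.1, 0.0],
}

def synthesise(phones):
    """Build a new utterance by concatenating stored diphone units."""
    samples = []
    # Each adjacent pair of phones is one diphone unit.
    for left, right in zip(phones, phones[1:]):
        unit = unit_db.get((left, right))
        if unit is None:
            # This is why wide phonetic coverage matters: every diphone
            # in the new utterance must exist in the recorded database.
            raise KeyError(f"no recording covers diphone {left}-{right}")
        samples.extend(unit)  # join units back to back
    return samples

print(synthesise(["h", "e", "l", "o"]))
# → [0.1, 0.3, 0.2, 0.2, 0.4, 0.3, 0.1, 0.0]
```

The lookup failure in the sketch is also why the recording scripts mentioned
later aim for full diphone coverage of the language.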
>
> It is also possible to build concatenative synthesis voices using software
> called Festival and Festvox, from CSTR at the University of Edinburgh, which
> is downloadable from the web. However, this requires a lot of knowledge of
> the synthesis process and of phonetics, and is very involved and time
> consuming. It is designed as a research tool, and there is no real guarantee
> that the output will be reasonable (as they state themselves on the website).
> Again, the output can only be used with their own speech synthesiser. I have
> built voices this way using professional speakers with 600 sentences of
> high-quality recordings; the resulting voices are OK, but they are not
> consistently of high quality, and really twice this amount of data is needed
> for the task.
>
> It is quite a difficult task, but I think the advice given in the previous
> email holds. I am definitely no authority on this, but I will pass on a few
> recommendations from recording voices for my own research. If you want to
> record your voice to bank it, then to keep your options open for future
> speech synthesis techniques that would allow you to build your own voice:
> 1. The recordings should be the highest quality you can get, using an
> uncompressed format (i.e. not mp3 etc.).
> 2. They should be made either in one go, or at the same time of day over a
> period of time; don't record if you have a cold etc. Basically, try to keep
> the recording conditions as consistent as possible.
> 3. Record the words and phrases you would most like to use, including names
> and places that come up frequently.
> 4. Also try to record a set of data with wide phonetic coverage of English.
> One suggestion is to record set A of the CMU ARCTIC database, which is
> around 600 sentences. It was designed specifically for speech synthesis,
> with full coverage of the diphones of English (a common unit size for
> concatenative synthesis). It can be found at
> http://festvox.org/cmu_arctic/
> by clicking on the cmu_data file.
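As a small practical aid for point 1 above, this sketch uses Python's
standard-library `wave` module to report the format of a banked recording
(the function name `recording_info` is my own, for illustration). `wave` only
reads uncompressed PCM WAV, so if the file opens at all, it is not a
compressed format such as mp3.

```python
import wave

def recording_info(path):
    """Return (sample rate in Hz, bit depth, channel count) for a PCM WAV file.

    Raises wave.Error if the file is not uncompressed PCM WAV, which is
    itself a useful check when banking recordings.
    """
    with wave.open(path, "rb") as w:
        # getsampwidth() is in bytes per sample, so multiply by 8 for bits.
        return (w.getframerate(), 8 * w.getsampwidth(), w.getnchannels())
```

For example, a result of `(44100, 16, 1)` means mono CD-quality audio, which
is a reasonable target for voice banking.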
>
> I hope this helps, please let me know if anything above is unclear.
>
> All the best
>
> Sarah Creer
>
> ------------------------------------------
> Clinical Applications of Speech Technology
> Departments of Computer Science and Human Communication Sciences
> University of Sheffield
> [log in to unmask]
> www.dcs.shef.ac.uk/~sarahc
>
>