Hello,

I am a PhD student at the University of Sheffield, currently researching ways
for people to bank their voices and to personalise speech synthesisers once
their voice has started to deteriorate.


I am using a technique, similar to the adaptation used in speech recognition,
in which a synthetic voice is adapted to a person's own voice using a limited
amount of data. It is at an early stage of development and has not yet been
made publicly available. It is currently still a research tool and therefore
requires specialised software and some specialised knowledge. The voices built
are unfortunately not yet suitable for use with any communication aids,
particularly because of the speed at which the synthesis is generated.

In a few years' time, though, using this type of synthesis will hopefully be
possible. It requires much less data than other systems and it can also
compensate for certain elements of the speech having started to deteriorate.
A few other options available at this point have already been discussed, but
(although I am no expert on this) I think it is worth banking the voice so that
it could be used with this technique and any other technology that emerges in
the future. I have included a few suggestions on how to do this further on in
this email.

Researchers at duPont Hospital and the University of Delaware have been looking
into voice banking and have some software that you can download to build your
own synthetic voice:
http://www.modeltalker.com/
It uses concatenative synthesis, in which a large database of recordings is
chopped up into smaller units, such as the individual sounds, which are then
recombined to make new utterances. The tool tries to minimise the amount of
recording needed to build a voice in two ways: it ensures that the recordings
cover all the sounds in a language, and it makes sure that the recordings are
consistent enough with each other that there is less distortion between the
units when they are concatenated. However, the recordings that they collect are
likely to be usable ONLY with their own speech synthesiser, so this may not be
the best way to store the voice long term.
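
To give a rough idea of what "covering all the sounds" with as little recording
as possible can mean, here is a minimal sketch in Python. It is NOT how
ModelTalker actually works, only an illustration under my own assumptions: each
candidate sentence is already written out as a list of sound labels, coverage
is counted in pairs of adjacent sounds (diphones), and the function names and
the toy labels are invented for the example. It repeatedly picks the sentence
that adds the most pairs not yet covered.

```python
from typing import Dict, List, Set, Tuple

Diphone = Tuple[str, str]


def diphones(phones: List[str]) -> Set[Diphone]:
    """Return the set of adjacent sound pairs in one utterance."""
    return set(zip(phones, phones[1:]))


def greedy_script(candidates: Dict[str, List[str]]) -> List[str]:
    """Choose sentences until no remaining sentence adds a new diphone.

    `candidates` maps a sentence name to its (hypothetical) sound labels.
    """
    covered: Set[Diphone] = set()
    chosen: List[str] = []
    remaining = dict(candidates)
    while remaining:
        # Sentence that contributes the most diphones not yet covered.
        best = max(remaining, key=lambda s: len(diphones(remaining[s]) - covered))
        gain = diphones(remaining[best]) - covered
        if not gain:
            break  # nothing left would add new coverage
        covered |= gain
        chosen.append(best)
        del remaining[best]
    return chosen


if __name__ == "__main__":
    # Toy labels only; a real script would use a proper phone set.
    toy = {
        "s1": ["a", "b", "c"],
        "s2": ["a", "b"],
        "s3": ["c", "d", "a", "b"],
    }
    print(greedy_script(toy))
```

With the toy data, "s2" is never chosen because everything in it is already
covered by the other sentences, which is the sense in which a well designed
script saves recording time.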

I am in the process of doing the recordings for this software, although I was
warned that it is designed for US English so the output may not be as high
quality as it could be. As soon as I have finished and have some output, I
will post some examples to the list.

It is also possible to build concatenative synthesis voices using Festival
(from CSTR, University of Edinburgh) and Festvox (from Carnegie Mellon
University), both of which can be downloaded from the web. However, this
requires a lot of knowledge of the synthesis process and of phonetics, and is
very involved and time consuming. It is designed to be used as a research tool
and there is no real guarantee that the output will be reasonable (as they
state themselves on the website). Again, the output can only be used with their
own speech synthesiser. I have built voices with this process from 600
sentences of high quality recordings by professional speakers; the voices are
OK but not consistently high quality, and really twice this amount of data is
needed for the task. This is the engine that Cereproc uses.

I am definitely no authority on voice banking but will pass on a few
recommendations that I have followed when recording voices for my research. If
you want to bank your voice and keep your options open for future speech
synthesis techniques that will allow you to build your own voice:
1. The recordings should be as high quality as you can get, in a
non-compressed format: WAV files are probably best (i.e. not mp3 etc.). I have
built voices using recordings done on a home computer in a quiet room, which is
usually sufficient quality, but you could also contact your local university
Linguistics/Phonetics/Speech Science department as they may have suitable
facilities or equipment that they are willing to offer.
2. They should be done either in one go or at the same time of day over a period
of time. Don't record if you have a cold etc. Basically, try to keep the
recording conditions as consistent as possible.
3. Record as many as possible of the words and phrases that you would like to
use, including names and places that would come up frequently.
4. Also try to record a set of data that has wide phonetic coverage of
English. One suggestion is to record set A of the Arctic database, which is
around 600 sentences. It was designed specifically for speech synthesis, to
have full coverage of the diphones of English (the diphone is a common basic
unit size for concatenative synthesis). It can be found at
http://festvox.org/cmu_arctic/
by clicking on the cmu_data file.
5. Try to sound as natural as possible, although this is difficult when you're
reading.
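
If you are comfortable running a small script, the format side of points 1 and
2 can be checked automatically. The following is only a sketch under my own
assumptions: the recordings sit in one folder with names ending in .wav, and
the function and folder names are made up for the example. It uses Python's
standard wave module, which only opens uncompressed (PCM) WAV files, so a
compressed file that has merely been renamed shows up as a problem. It also
flags any file whose channels, bit depth or sample rate differ from the first
file. It cannot tell you anything about background noise or how your voice
sounded on the day.

```python
import wave
from pathlib import Path
from typing import List, NamedTuple, Optional, Tuple


class WavInfo(NamedTuple):
    channels: int
    bits_per_sample: int
    sample_rate: int
    seconds: float


def read_info(path: Path) -> WavInfo:
    """Read the header; wave.open raises wave.Error if the file is not PCM WAV."""
    with wave.open(str(path), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        seconds = frames / rate if rate else 0.0
        return WavInfo(wav.getnchannels(), wav.getsampwidth() * 8, rate, seconds)


def inconsistent_files(folder: Path) -> List[str]:
    """Describe files that are unreadable or differ in format from the first one."""
    problems: List[str] = []
    reference: Optional[Tuple[int, int, int]] = None
    for path in sorted(folder.glob("*.wav")):
        try:
            info = read_info(path)
        except (wave.Error, EOFError):
            problems.append(f"{path.name}: not readable as uncompressed WAV")
            continue
        fmt = (info.channels, info.bits_per_sample, info.sample_rate)
        if reference is None:
            reference = fmt  # the first readable file sets the expected format
        elif fmt != reference:
            problems.append(f"{path.name}: {fmt} differs from {reference}")
    return problems


if __name__ == "__main__":
    # "recordings" is a hypothetical folder name.
    for line in inconsistent_files(Path("recordings")):
        print(line)
```

An empty result means every file opened as uncompressed WAV and all files share
the same basic format.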

Please get in touch if this is not clear.
All the best,
Sarah Creer

------------------------------------------
Clinical Applications of Speech Technology
Departments of Computer Science and Human Communication Sciences
University of Sheffield
[log in to unmask]
www.dcs.shef.ac.uk/~sarahc

Quoting "Scott-Tatum, Liz" <[log in to unmask]>:

> Thanks Paul, will get in touch with them directly to get feedback on
> what amount of time is required and costs. Out of interest, has it been
> possible to use the Heather voice across a range of communication aids,
> or have you experienced any difficulties related to the platform used?
> 
> Best wishes, 
> 
> Liz
> 
> Liz Scott-Tatum 
> 
> [log in to unmask]
> 
> Communicate
> 
> Walkergate Park
> 
> International Centre for NeuroRehabilitation and NeuroPsychiatry
> 
> Benfield Road
> 
> Newcastle upon Tyne
> 
> NE6 4QD
> 
> Tel: 0191 287 5240
> 
> Fax: 0191 287 5250
> 
> Featurenet  telephone: 8745 65240
> 
> Featurenet fax: 8745 65250
> 
> Email for General Enquiries: [log in to unmask]
> <mailto:[log in to unmask]> 
> 
>  This email and any files transmitted with it are confidential and
> intended solely for the use of the individual or entity to whom they are
> addressed.  If you have received this email in error please notify the
> system manager. The views expressed by the sender may not be the views
> of Northumberland, Tyne and Wear NHS Trust.
> 
> Please note that any correspondence naming patients of Northumberland,
> Tyne and Wear NHS Trust will be treated as primary record and a printed
> copy may be filed in the patient's health record.
> 
> ________________________________
> 
> From: A discussion list for Assistive Technology professionals.
> [mailto:[log in to unmask]] On Behalf Of Paul Nisbet
> Sent: 08 July 2008 10:14
> To: [log in to unmask]
> Subject: Re: Voice sampling enquiry
> 
>  
> 
> The company that Patrick refers to is Cereproc. They specialise in
> creating synthetic computer voices, and as far as I understand it this
> involves recording the voice actor for some number of hours. Then they
> can create a bespoke voice based on the recording. So I think they could
> create the voice bank and synthetic voice that you're after.
> 
>  
> 
> They're a small company and very helpful, in our experience. As Patrick
> says, we have licensed their 'Heather' voice so that schools and pupils
> in Scotland can download it and install it free of charge on their
> computers - see http://www.theScottishVoice.org.uk
> 
> Cereproc are at:
> 
>  
> 
> http://www.cereproc.com/contact.html
> 
>  
> 
>  
> 
> Paul
> 
> _______________________________________________
> 
> Paul D. Nisbet
> 
> Senior Research Fellow
> 
> Communication, Access, Literacy and Learning (CALL) Scotland
> 
> Moray House School of Education
> 
> University of Edinburgh
> 
> Paterson's Land, Holyrood Road
> 
> Edinburgh EH8 8AQ
> 
> Tel. 0131 651 6236     Fax 0131 651 6234
> 
> email [log in to unmask]
> 
>  
> 
> CALL Centre:    http://callcentrescotland.org.uk 
> 
> SQA Digital Exam Papers: http://www.AdaptedDigitalExams.org.uk 
> 
> The Scottish Computer Voice: http://www.theScottishVoice.org.uk 
> 
> Books for All:    http://www.booksforall.org.uk  
> 
> Books for All blog: http://pauln.edublogs.org/
> 
> WordTalk reader for Word: http://www.wordtalk.org.uk 
> 
>  
> 
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
> 
> _________________________________________________
> 
> ________________________________
> 
> From: A discussion list for Assistive Technology professionals.
> [mailto:[log in to unmask]] On Behalf Of Scott-Tatum, Liz
> Sent: 08 July 2008 09:55
> To: [log in to unmask]
> Subject: Re: Voice sampling enquiry
> Importance: High
> 
>  
> 
> Thank you Patrick, we'll certainly be giving CALL Scotland a ring.
> 
> Best wishes, 
> 
> Liz
> 
> 
> ________________________________
> 
> From: A discussion list for Assistive Technology professionals.
> [mailto:[log in to unmask]] On Behalf Of Patrick Poon,
> Communication Matters
> Sent: 07 July 2008 16:19
> To: [log in to unmask]
> Subject: Re: Voice sampling enquiry
> 
>  
> 
> 	We have recently been approached by a person with a degenerative
> 	condition who wants to do some voice banking, before their speech
> 	deteriorates further.  We would really appreciate any advice the
> 	group can offer regarding how we could go about this, any specific
> 	software which can be used, what quality the sounds are, and how
> 	easy it is to use the samples on different voice output
> 	communication aids.  Finally is it possible to take sufficient
> 	voice samples to create their own synthesised voice - if so what
> 	would be needed to achieve this?
> 
> 
> Your question re. creating a bespoke synthesised voice: you should
> contact CALL Scotland (Paul Nisbet or Sally Millar) as they have been
> working with the techie people at Edinburgh Uni who created the Scottish
> speech synthesiser using sampled speech.
> 
> 
> 
> Communication Matters (ISAAC UK) 
>   c/o ACE Centre, 92 Windmill Road, Headington, Oxford OX3 7DR, UK
>   General Enquiries: Tel & Fax 0845 456 8211 
>   International: Tel & Fax +44 131 467 7487
>   Email: [log in to unmask]
> 
> Come and browse our Web site!
>   http://www.communicationmatters.org.uk
>
> Registered Charity No. 327500
> Registered Company in England & Wales No. 01965474
> 
> 
> 
> 
> ________________________________
> 
> The information contained in this e-mail may be subject to public disclosure
> under the NHS Code of Openness or the Freedom of Information Act 2000.
> Unless the information is legally exempt, the confidentiality of this e-mail
> and your reply cannot be guaranteed.
> Unless expressly stated otherwise, the information contained in this e-mail
> is intended for the named recipient(s) only. If you are not the intended
> recipient you must not copy, distribute, or take any action or reliance upon
> it. If you have received this e-mail in error, please notify the sender. Any
> unauthorised disclosure of the information contained in this e-mail is
> strictly prohibited.
> 