
 

Canned speech

In some cases the number of sentences that may be played back is too large for each one to be recorded in full. For example, if the application deals with flight services, the system should be able to give information about individual flights, such as: ``Flight A F Nine Three Zero One from PARIS will land at ELEVEN HUNDRED TWENTY FIVE.'' The storage capacity and the recording effort needed to cover all possible combinations of sentences would clearly be considerable. The approach is therefore to record independent speech segments such as ``Flight'', ``from'', ``will land at'', the names of all the cities concerned, the digits, and the letters of the alphabet. The sentence to play back is obtained by linking the appropriate segments together, substituting individually stored words into the information slots of the carrier sentence:

``Flight'' {letters} {digits} ``from'' {city} ``will land at'' {hours}{minutes} ...

This technique is common in applications such as schedule announcements, bank balance enquiries, etc.
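As a rough illustration of the slot-filling idea, the following Python sketch assembles the ordered list of stored segments for one announcement. The function name `build_announcement`, the segment labels, and the choice to pass the hour and minute as ready-made segment names are all assumptions made for this example; a real system would map each label to a recorded audio file.

```python
from typing import List

# Hypothetical labels for the individually recorded digit segments.
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def build_announcement(flight: str, city: str,
                       hours: str, minutes: str) -> List[str]:
    """Return the ordered segment labels for one flight announcement.

    `flight` is a code such as "AF9301"; letters are spelled out and
    digits are read one by one. `hours` and `minutes` are assumed to be
    labels of pre-recorded segments (for example "eleven hundred").
    """
    segments = ["flight"]                      # fixed carrier part
    for ch in flight:
        if ch in "0123456789":
            segments.append(DIGIT_WORDS[int(ch)])
        elif ch.isalpha():
            segments.append(ch.upper())        # one recording per letter
        else:
            raise ValueError(f"unexpected character in flight code: {ch!r}")
    segments += ["from", city, "will land at", hours, minutes]
    return segments


if __name__ == "__main__":
    print(build_announcement("AF9301", "PARIS",
                             "eleven hundred", "twenty five"))
```

The point of the design is that the recording inventory grows with the number of distinct words, not with the number of sentences that can be formed from them.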

The procedure begins by listing all the sentences that may occur. The application developer should find out whether the technology provider supplies a tool that generates all the possible sentences, that is, a written version of the sentences that can be checked. The developer should then be able to separate the words to be stored individually (the variable parts) from the carrier sentences (the fixed parts). Afterwards, recombination rules have to be defined to account for language-specific characteristics (assimilation of adjacent words, coarticulation effects, etc.). In French, for example, liaison imposes contextual rules and exceptions: the number ``21'' is pronounced /vɛ̃ te ɛ̃/ and cannot be produced by simply concatenating 20 and 1 (/vɛ̃ e ɛ̃/). The corresponding rules should be delivered by the technology provider or implemented by the application developer using a suitable development environment.
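The two steps described above, generating a written version of every possible sentence and applying context-dependent recombination rules, might be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the helpers `enumerate_sentences` and `apply_rules`, the rule table, and the label `vingt_t` (standing for a separately recorded variant of ``vingt'' with the liaison consonant) are hypothetical and do not describe any particular provider's tool.

```python
from itertools import product
from typing import Dict, Iterator, List, Optional, Tuple


def enumerate_sentences(template: List[str],
                        slots: Dict[str, List[str]]) -> Iterator[List[str]]:
    """Yield every word sequence obtainable from a carrier template.

    Template items written as "{name}" are slots filled from `slots`;
    all other items are fixed carrier segments.
    """
    choices = []
    for item in template:
        if item.startswith("{") and item.endswith("}"):
            choices.append(slots[item[1:-1]])   # variable part
        else:
            choices.append([item])              # fixed part
    for combo in product(*choices):
        yield list(combo)


# Hypothetical context rules: (word, following word) -> variant to play.
LIAISON_RULES: Dict[Tuple[str, Optional[str]], str] = {
    ("vingt", "et"): "vingt_t",
}


def apply_rules(words: List[str]) -> List[str]:
    """Replace a word by its contextual variant where a rule applies."""
    out = []
    for i, word in enumerate(words):
        following = words[i + 1] if i + 1 < len(words) else None
        out.append(LIAISON_RULES.get((word, following), word))
    return out


if __name__ == "__main__":
    for sentence in enumerate_sentences(["from", "{city}"],
                                        {"city": ["PARIS", "LYON"]}):
        print(" ".join(sentence))
    print(apply_rules(["vingt", "et", "un"]))
```

Keeping the rules in a table separate from the concatenation step means they can be supplied by the technology provider or extended by the application developer without touching the rest of the code.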

 




EAGLES SWLG SoftEdition, May 1997.