Status | Proposal [Grice92a, Howard-Jones92a, Chapter 6]. |
Goal | Diagnostic and comparative evaluation of the appropriateness of intonation contours for use in interactive communication contexts. |
Languages | English, but can be applied without effort to other languages. |
Items | Two-part, human-machine dialogue excerpts. |
Procedure | Rating by naive subjects of the appropriateness of "pronunciation" using magnitude estimation (see Section 12.3.2). |
Example: Human: I'd like to reserve a flight to Paris on Monday morning. Synthesiser: Are you travelling from London? | |
There is a choice between two protocols. In both, the transcript of the excerpt first appears on the screen. Then either (type A) only the synthesiser's turn is played and rated for appropriateness, or (type B) both the human's and the synthesiser's turns are played and only the latter is rated. It is recommended to present fifteen exchanges per algorithm and to include, as a reference, test materials based on a hand-annotated intonation version. | |
Time | Depending on the type of stimuli and the protocol. |
Analysis | Automatic calculation of the geometric mean of the responses. Because differences in segmental quality between the dialogues may affect the ratings, it is advisable to calculate the ratio between the ratings for the text-to-speech version and those for the hand-annotated version. |
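
The analysis step can be illustrated with a minimal sketch. It assumes that the magnitude estimates are strictly positive numbers and that the ratio is taken between the geometric means of the two versions of the same dialogue; the source does not specify this level of detail. The function names and the ratings are hypothetical and serve only as illustration.

```python
import math


def geometric_mean(ratings):
    """Geometric mean of strictly positive magnitude estimates."""
    if not ratings:
        raise ValueError("need at least one rating")
    if any(r <= 0 for r in ratings):
        raise ValueError("magnitude estimates must be positive")
    # Average in the log domain to avoid overflow from a long product.
    return math.exp(sum(math.log(r) for r in ratings) / len(ratings))


def appropriateness_ratio(tts_ratings, reference_ratings):
    """Ratio of the geometric mean for the text-to-speech version to that
    for the hand-annotated reference version (assumed: same dialogue)."""
    return geometric_mean(tts_ratings) / geometric_mean(reference_ratings)


# Hypothetical ratings from three subjects for one dialogue excerpt.
tts = [20.0, 45.0, 30.0]        # algorithm-generated intonation
reference = [40.0, 90.0, 60.0]  # hand-annotated intonation

ratio = appropriateness_ratio(tts, reference)
print(round(ratio, 3))
```

Under these assumptions, a ratio below 1 would indicate that the algorithm's contour was judged less appropriate than the hand-annotated reference for that dialogue, and a ratio near 1 that the two were judged about equally appropriate.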