Status | Fully developed software (in SOAP) supporting the
magnitude estimation and categorical estimation scaling
methods. Two variants are recommended: 20-point categorical
estimation without a reference (for test-internal
comparison) and magnitude estimation by line length, with
imaginary ideal speech as the reference (for test-external
comparison) [Howard-Jones92a, Chapter 7]. |
| |
Goal | Comparative evaluation of overall quality aspects,
particularly acceptability, intelligibility, and
naturalness, for longer stretches of speech. |
| |
Languages | In principle applicable to any language as long as
suitable stimulus material is available. |
| |
Items | Eight lists of 20 meaningful sentences of varying
syntactic structure and length. For the rating of
intelligibility and naturalness, speech material is
available for Dutch, English, French, German, Italian,
and Swedish. One list is sufficient for the evaluation of
a synthesiser. Examples: "I realise you're having
supply problems, but this is rather excessive" and
"I need to arrive by 10.30 a.m. on Saturday." |
| |
Procedure | Each aspect of speech is rated by a different group of
subjects (at least ten). When rating acceptability, it
is recommended that application-specific speech materials
be presented to (prospective) users. Each rating is
based on two sentences. |
| |
Time | With 160 sentences and a 5 s interstimulus interval,
the rating of one scale takes about 20 min. |
| |
Analysis | Automatic. |
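
The time estimate in the table can be checked with a rough calculation. The sketch below is only an illustration under stated assumptions: the table gives neither the mean sentence duration nor whether the interstimulus interval follows every sentence or every rated pair. It assumes one interval per sentence and a hypothetical mean sentence duration of 2.5 s, a value chosen here only because it is consistent with the stated 20 min. The function name `session_minutes` is likewise hypothetical and not part of SOAP.

```python
def session_minutes(n_sentences: int, isi_s: float, mean_sentence_s: float) -> float:
    """Estimate the duration, in minutes, of rating one scale.

    Assumes each sentence is played once and is followed by one
    interstimulus interval (an assumption, not stated in the table).
    """
    return n_sentences * (mean_sentence_s + isi_s) / 60.0


# Eight lists of 20 sentences, as given in the Items row.
n_sentences = 8 * 20

# The 5 s interval is from the Time row; the 2.5 s mean sentence
# duration is an assumed value, not a figure from the source.
estimate = session_minutes(n_sentences, isi_s=5.0, mean_sentence_s=2.5)

print(f"{n_sentences} sentences: about {estimate:.0f} min per scale")
```

Changing `mean_sentence_s` or `n_sentences` (for example to 20, since one list is sufficient for evaluating a single synthesiser) shows how the session length scales with the material used.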