Functional adequacy refers to the fact that recognisers have to perform only a limited range of functions (they might not be required to deal with unrestricted speech, for example). User acceptance refers to the fact that users might tolerate something that is not perfect. Each of these topics calls for metrics other than percent correct (9.4.2) and can involve subjective judgments on the part of subjects. For these topics, therefore, it is necessary to consider how best to obtain information from users about the acceptability of a system.
The recommended way of obtaining the information is in the form of a summated rating scale [Likert (1932)]. These scales are constructed by preparing sets of statements designed to measure an individual's attitude towards a particular concept (here, for instance, recogniser acceptance). Scales typically comprise several different subscales (in assessing user acceptance of a recognition system, these subscales might include response time, format of feedback to the user, etc.). Respondents indicate the extent to which they agree with each statement by giving a rating (usually between 1 and 5). In order to counterbalance response biases, it is usual to phrase questions so that, here for example, affirmative user acceptance would lead to low ratings for some questions and high ratings for others. An example of a question and response format which might be appropriate for assessing user acceptance is:
Statement                                                 | 1 | 2 | 3 | 4 | 5 |
I found the system very easy to use.                      |   |   |   |   |   |
Sometimes I experienced difficulties in using the system. |   |   |   |   |   |
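These two questions would tend to lead users to use opposite poles of the rating scale. During analysis, the scale values of the questions phrased in one direction (for instance, the negatively phrased ones) therefore need to be reversed, so that all ratings point the same way before they are summed; on a 1 to 5 scale, a rating r becomes 6 - r.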
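The reversal and summation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the item names, the example ratings, and the function names `reverse_score` and `summated_score` are hypothetical and do not come from any particular questionnaire or library.

```python
def reverse_score(rating, low=1, high=5):
    """Mirror a rating about the scale midpoint (on 1..5: 1<->5, 2<->4, 3 stays 3)."""
    if not low <= rating <= high:
        raise ValueError(f"rating {rating} is outside the {low}..{high} scale")
    return low + high - rating


def summated_score(ratings, reversed_items, low=1, high=5):
    """Sum one respondent's ratings after reversing the oppositely phrased items.

    ratings        -- dict mapping item name to the rating given
    reversed_items -- set of item names whose ratings must be reversed
    """
    total = 0
    for item, rating in ratings.items():
        if item in reversed_items:
            rating = reverse_score(rating, low, high)
        elif not low <= rating <= high:
            raise ValueError(f"rating {rating} is outside the {low}..{high} scale")
        total += rating
    return total


# Hypothetical respondent: agrees the system is easy to use (4) and
# disagrees that difficulties were experienced (2, which reverses to 4).
example_ratings = {"easy_to_use": 4, "experienced_difficulties": 2}
example_total = summated_score(example_ratings, {"experienced_difficulties"})
```

Keeping the set of reversed items separate from the ratings means the raw responses are stored exactly as the user gave them, and the reversal is applied only at analysis time.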
The advantages of Likert scales are:
The construction of questionnaires based on Likert's scale format for the items of any identified concept involves going through a sequence of steps:
Factor analysis may be used for two purposes for validating the scales:
The statistical procedures for carrying out the factor analyses are available in the standard texts referred to. The first pass through step 3 will typically not produce a sufficiently high level of reliability, so step 3 will have to be repeated iteratively until an acceptable level of reliability has been achieved. At that point the questionnaire and normative outputs are available directly from the analysis.
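The text does not say which reliability coefficient is intended. One widely used choice for summated scales is Cronbach's alpha, and the sketch below assumes it purely for illustration. The function name `cronbach_alpha` and the input layout (one row per respondent, one column per item, ratings already reversed where necessary) are assumptions of this example.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' item ratings.

    responses -- list of rows, one per respondent; each row holds the
                 ratings for the same k items, already reverse-scored
                 where the item wording requires it.
    """
    n_respondents = len(responses)
    n_items = len(responses[0])
    if n_respondents < 2 or n_items < 2:
        raise ValueError("need at least two respondents and two items")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must rate every item")

    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of the summated score across respondents.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("summated scores have zero variance")

    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)
```

In an iterative revision of the kind described above, the coefficient would be recomputed after each change to the item set until it reaches whatever threshold the study has adopted.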