Functional adequacy and user acceptance

Functional adequacy refers to the fact that a recogniser only has to perform a limited range of functions (it might not be required to deal with unrestricted speech, for example). User acceptance refers to the fact that users might tolerate a system that is not perfect. Each of these topics calls for metrics other than percent correct (9.4.2) and can involve subjective judgments on the part of subjects. For these topics it is therefore necessary to consider how best to obtain information from users about the acceptability of a system.

The recommended way of obtaining this information is a summated rating scale [Likert (1932)]. Such scales are constructed by preparing sets of statements designed to measure an individual's attitude towards a particular concept (here, for instance, recogniser acceptance). Typically a scale comprises several subscales (in assessing user acceptance of a recognition system, these might include response time, format of feedback to the user, etc.). Respondents indicate the extent to which they agree with each statement by giving a rating (usually between 1 and 5). In order to counterbalance response biases, it is usual to phrase the statements so that, in this example, a favourable attitude towards the system leads to low ratings on some statements and high ratings on others. An example of a question and response format that might be appropriate for assessing user acceptance is:

I found the system very easy to use.	1	2	3	4	5
Sometimes I experienced difficulties in using the system.	1	2	3	4	5

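These two statements would tend to lead users towards opposite poles of the rating scale. During analysis, the scale values of the statements worded in one direction therefore need to be reversed (on a 1 to 5 scale, 1 becomes 5, 2 becomes 4, and so on), so that all items are scored in the same direction before they are summed.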
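
The reversal of scale values can be sketched as follows. This is only a minimal illustration in Python, assuming a 1 to 5 scale on which a higher number means stronger agreement; the function name `reverse_score` and the item labels are hypothetical and do not come from any particular questionnaire.

```python
SCALE_MIN, SCALE_MAX = 1, 5  # assumed 5-point scale


def reverse_score(rating, scale_min=SCALE_MIN, scale_max=SCALE_MAX):
    """Mirror a rating about the scale midpoint (1<->5, 2<->4, 3 stays 3)."""
    if not scale_min <= rating <= scale_max:
        raise ValueError(f"rating {rating} outside {scale_min}..{scale_max}")
    return scale_min + scale_max - rating


# Hypothetical responses from one subject; the item labels are illustrative.
responses = {"easy_to_use": 4, "experienced_difficulties": 2}

# Items worded so that agreement signals LOW acceptance (assumed labelling).
reversed_items = {"experienced_difficulties"}

# After alignment, a high value means high acceptance on every item.
aligned = {
    item: reverse_score(rating) if item in reversed_items else rating
    for item, rating in responses.items()
}
```

Which pole is treated as the "positive" direction is a decision for the scale designer; the only requirement is that it is applied consistently to every item before summing.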

The advantages of Likert scales are:

All items belonging to a concept or subconcept can be summed to give a composite score.
A quantitative rather than qualitative measure is provided.
Likert scales are relatively cheap and easy to develop.
They are usually quick and easy for respondents to complete.
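
The first advantage, summing items into a composite, might look like this in practice. The sketch assumes ratings on a 1 to 5 scale that have already been aligned in direction; the subscale names echo the examples mentioned earlier (response time, format of feedback), while the item identifiers and the helper `subscale_scores` are purely hypothetical.

```python
def subscale_scores(ratings, subscales):
    """Sum the ratings of the items that belong to each subscale."""
    return {
        name: sum(ratings[item] for item in items)
        for name, items in subscales.items()
    }


# Hypothetical questionnaire layout: which items make up which subscale.
subscales = {
    "response_time": ["rt1", "rt2", "rt3"],
    "feedback_format": ["fb1", "fb2"],
}

# One respondent's ratings, already scored in a common direction.
ratings = {"rt1": 4, "rt2": 5, "rt3": 3, "fb1": 2, "fb2": 4}

scores = subscale_scores(ratings, subscales)  # one composite per subscale
total = sum(scores.values())                  # overall composite score
```

Because subscales can contain different numbers of items, their raw sums are not directly comparable; dividing each sum by its item count is one simple way to put them on the same footing.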

Constructing a questionnaire in Likert's scale format for the items of an identified concept involves the following sequence of steps:

Define the concept or set of subconcepts to measure.
The literature needs to be reviewed to confirm the concepts identified and to check whether others ought to be included. Care should be taken to ensure that the concepts are clearly and precisely defined: a scale cannot be developed until it is clear exactly what it is intended to measure.
Design the scale.
A response scale is defined for each item, based on Likert's format where appropriate. The format is not appropriate for collecting information on some concepts (principally demographic details). At this stage the response choices are specified and instructions are formulated that state the basis on which each item is to be evaluated. A pool of items is also generated at this step; it is subjected to statistical analysis in the later steps.
Administration and item analysis.
Factor analysis may be used in two ways to validate the scales:
1. Exploratory Factor Analysis. This is used to study the multidimensionality of the Likert scales that underlie a concept. Its two aspects are (a) establishing the number of factors that best represent the items, and (b) interpreting those factors.
2. Confirmatory Factor Analysis (CFA). Exploratory factor analysis provides an optimum statistical description of the data. However, scale construction is premised on certain assumptions about what the scale is intended to measure (e.g. response time). CFA may be used to verify this hypothesised factor structure. It can be performed using one of the available covariance structure modelling programs, such as LISREL [Joreskog & Sorbom (1984)] or EQS [Bentler (1985)].
Validate and produce norms.
The statistical procedures for carrying out the factor analyses are described in the standard texts referred to. The first pass through step 3 (administration and item analysis) will typically not yield a sufficiently high level of reliability, so that step has to be repeated iteratively until an acceptable level is reached. At that point the questionnaire and its normative outputs are available directly from the analysis.
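
The text does not say which reliability statistic is meant. As an illustration only, the sketch below computes Cronbach's alpha, one widely used internal-consistency coefficient for summated scales; choosing it here is an assumption, and the function name and the response data are hypothetical. Each row holds one respondent's ratings on the items of a single subscale, already aligned in direction.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items table of ratings.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(row totals))
    """
    k = len(rows[0])  # number of items in the subscale
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    item_vars = [variance(column) for column in zip(*rows)]
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: five respondents, three items, 1-5 scale.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]

alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.3f}")
```

In an iterative procedure of the kind described, a coefficient like this would be recomputed after each revision of the item pool and compared against whatever level the developers have decided is acceptable.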

EAGLES SWLG SoftEdition, May 1997.