Most of the spoken language dialogue systems created so far (SUNDIAL,
VODIS, PAROLE, etc.) were designed on the basis of analyses of real and
simulated dialogues carried out before implementation began. These
data have, of course, been augmented by designers' intuitions to fill
genuine gaps in the data. For example, observation and simulation
corpora in the travel information domain might not
include mention of all the destinations contained in the timetable.
The design should not be so tied to the data that these deficits cannot
easily be rectified. However, caution should be exercised in the use of
intuitions so as not to equip the system with functionality which
it will never need. Experience has shown that the expectation that some
linguistic form might occur is not in itself sufficient grounds
for supposing that it will occur.
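A gap of the kind described above can be detected mechanically. The following is a minimal sketch, assuming single-word destination names and plain-text transcriptions; the function name and the sample data are hypothetical and serve only as illustration.

```python
from typing import Iterable, List


def coverage_gaps(timetable_destinations: Iterable[str],
                  corpus_utterances: Iterable[str]) -> List[str]:
    """Return timetable destinations that never occur in the corpus.

    Assumption: destination names are single words, so a simple
    whitespace tokenisation of each utterance is sufficient.
    """
    seen = set()
    for utterance in corpus_utterances:
        seen.update(utterance.lower().split())
    return sorted(d for d in timetable_destinations if d.lower() not in seen)


# Hypothetical timetable and corpus extract.
TIMETABLE = ["Paris", "Rome", "Oslo"]
CORPUS = ["to paris please", "a ticket to Rome"]
MISSING = coverage_gaps(TIMETABLE, CORPUS)
```

Destinations reported as missing are candidates for being added by the designer, since the timetable shows that the system will need them.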
Normal practice is to design several sequential versions of the system, each
version benefitting from technology improvements and from analysis of
results of earlier stages.
Designing a simple system-led, menu-style, small-vocabulary interactive
voice response (IVR) system consists of the following
steps, taking both human linguistic behaviour and speech technology
performance into account.
1. Study the application domain and define what the tasks to be
achieved are and what steps they consist of.
2. Translate the sequence of subtasks into a sequence of questions
to be asked by the system and answered by the user, interleaved where
necessary with system internal operations such as database lookup.
3. Define the exact wording of the system prompts, and the exact
vocabularies and language models which are appropriate for each recognition.
4. Draw up a full specification of the IVR system, integrating the
dialogue flow, system-internal operations, prompting and recognition
constraints.
5. Design a first version (X) of the dialogue system.
6. Conduct laboratory tests with available
technology using test corpora
where available, and also laboratory staff simulating users.
7. Conduct field trials with real users, recording new corpora
where deemed useful.
8. "Tune" the system by iteratively modifying, then testing it.
9. If too many modifications are necessary, respecify and
reimplement the system.
10. Design an X+1 version of the system, integrating new technologies.
11. Carry out new laboratory tests with the new version.
12. Carry out field trials with real users.
13. Return to step 9 unless the system is deemed to be complete.
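The early specification steps above (questions, per-question vocabularies, interleaved database lookup) can be written down as data before any speech technology is involved. The following is a minimal sketch under stated assumptions: the two-question timetable flow, the vocabularies, the departure table and all names are hypothetical, and the recogniser is replaced by a scripted stand-in of the kind laboratory staff might use when simulating users.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Step:
    """One system question and the small vocabulary active while it is asked."""
    name: str
    prompt: str
    vocabulary: List[str]


# Hypothetical dialogue flow: one question per subtask.
FLOW = [
    Step("destination", "Which city do you want to travel to?",
         ["london", "paris", "rome"]),
    Step("day", "Do you want to travel today or tomorrow?",
         ["today", "tomorrow"]),
]


def lookup_departure(destination: str, day: str) -> Optional[str]:
    """Stand-in for the system-internal database lookup."""
    table = {("paris", "today"): "10:15", ("paris", "tomorrow"): "09:40"}
    return table.get((destination, day))


def run_dialogue(flow: List[Step],
                 recognise: Callable[[str, List[str]], str],
                 max_attempts: int = 3) -> Optional[Dict[str, str]]:
    """Ask each question in order, accepting only in-vocabulary answers.

    Returns None if any question fails max_attempts times in a row.
    """
    answers: Dict[str, str] = {}
    for step in flow:
        for _ in range(max_attempts):
            word = recognise(step.prompt, step.vocabulary)
            if word in step.vocabulary:
                answers[step.name] = word
                break
        else:
            return None
    return answers


def scripted_user(replies: List[str]) -> Callable[[str, List[str]], str]:
    """Simulated recogniser that returns pre-scripted replies in order."""
    remaining = iter(replies)

    def recognise(prompt: str, vocabulary: List[str]) -> str:
        return next(remaining)

    return recognise
```

Keeping the flow as data rather than as control code makes respecification cheaper: a prompt or vocabulary can be changed between test rounds without touching the dialogue loop.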
Prompt design is especially important for IVR systems. Since the user
has to follow the system's lead, that lead must be clear, unambiguous,
and reassuring. The following recommendations summarise some simple
steps which can be taken to achieve an effective prompting regime.
- Keep prompts as brief as possible without being terse.
- Keep prompts as simple as possible.
- Use a consistent linguistic style for prompts.
- Ensure that each prompt (except the last) finishes with an explicit
question or command.
- Wherever technically possible, allow users to interrupt the prompt.
- Where prompt interruption is not possible, either ensure that the
recogniser starts listening the instant the prompt
stops playing, or use some audible signal to indicate when speech
may begin.
- If prompts are canned, either use a single speaker or, if more than
one is used, ensure that each speaker serves an intuitively distinct
function.
- Do not expect instructions presented to the user at the start of a
dialogue to be remembered in subsequent turns.
- Wherever possible, re-promptings after errors or absence of user input
should provide extra guidance to help the user behave in the desired
fashion.
- Control variables such as prompt voice quality to give
the system a warm and friendly "personality".
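The recommendation on re-prompting can be illustrated by storing, for each question, an initial prompt together with progressively more explicit re-prompts. This is only a sketch; the class name, the prompt wordings and the choice to repeat the last re-prompt once the list is exhausted are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PromptSet:
    """Initial prompt plus re-prompts that add guidance after each failure."""
    initial: str
    reprompts: List[str] = field(default_factory=list)

    def for_attempt(self, attempt: int) -> str:
        """Return the prompt for a 0-based attempt number.

        Attempt 0 uses the brief initial prompt. Later attempts step
        through the re-prompts and then repeat the last one.
        """
        if attempt <= 0 or not self.reprompts:
            return self.initial
        index = min(attempt, len(self.reprompts)) - 1
        return self.reprompts[index]


# Hypothetical prompts for a destination question.
DESTINATION = PromptSet(
    initial="Which city do you want to travel to?",
    reprompts=[
        "Please say the name of one city, for example Paris.",
        "After the tone, please say London, Paris or Rome.",
    ],
)
```

Each stored prompt ends with an explicit question or command, and the instructions are repeated at the point of need rather than only at the start of the dialogue.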
Designing a spoken language dialogue system consists of the following
steps, taking both human linguistic behaviour and speech technology
performance into account.
1. Study human-human interaction recordings in a situation similar
to the one in which the system will be used, and make an ergonomic
analysis of the needs or requirements of potential users.
2. Carefully define a Wizard-of-Oz simulation, making objectives
explicit.
3. Conduct Wizard-of-Oz simulations (preferably using an
iterative WOZ methodology) and record the
complete resulting dialogues.
4. Transcribe the dialogues recorded in simulations (several levels
of transcription may be necessary). If possible, use a
standard transcription scheme.
5. Draw up a specification of the interactive dialogue system.
6. Design and implement a first version (X) of the dialogue system.
7. Conduct laboratory tests with available technology using corpora
recorded in Wizard-of-Oz simulations, and then with laboratory
staff simulating users, recording new data.
8. Conduct field tests with real users, recording new corpora.
9. "Tune" the system by iteratively modifying, then testing it.
10. If too many modifications are necessary, carry out new (bionic or
human) Wizard-of-Oz experiments in which different parameters
can be controlled.
11. Design and implement an X+1 version of the system, integrating new technologies.
12. Carry out new laboratory tests with the new version.
13. Carry out field tests with real users.
14. Return to step 9 unless the system is deemed to be complete.
In addition to these methodological guidelines, the
spoken language dialogue specification/design process can be expected
to be simplified and
improved if a few extra recommendations are adhered to. (Many of these
summarise points already made in the preceding discussion.)
- Where time and other resources allow, base the specification on data
from a diversity of sources.
- Consult human-human data to learn about the task and to understand the
service expectations which users will bring to the system.
- Conduct WOZ simulations to determine the effect of human-computer
factors for a specific task or application domain.
- Use native speaker intuitions to fill obvious gaps in the
human-human and WOZ corpora, but avoid going beyond this.
- Use an iterative refinement methodology to perfect the
specification.
- Allow sufficient time and resources for the specification process.
- Decide in advance which questions to ask of the data, and wherever
possible stick to them.
- Conduct a dialogue act analysis of the dialogues collected in the
corpora, paying special attention to the conditions which must be
satisfied in order to proceed from one dialogue state to the next.
- Describe the dialogue state transitions using some formally explicit
apparatus (such as a flowchart or formal
specification language).
- Use the data to identify the total lexicon required, then divide it
into sublexicons, where each sublexicon is associated with a dialogue
act.
- Use the data to identify a covering grammar,
then divide it into subgrammars, where each
subgrammar is associated with a dialogue act.
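The last few recommendations above (dialogue act analysis, explicit state transitions, sublexicons per dialogue act) can be sketched as follows. The annotated corpus, the dialogue act labels and the state names are hypothetical, and a real specification would normally attach richer conditions to each transition than the single expected act used here.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical corpus extract: (dialogue act, transcribed user utterance).
CORPUS: List[Tuple[str, str]] = [
    ("give_destination", "to paris please"),
    ("give_destination", "i want to go to rome"),
    ("give_day", "tomorrow please"),
    ("confirm", "yes"),
    ("confirm", "no"),
]


def build_sublexicons(
    corpus: List[Tuple[str, str]],
) -> Tuple[Set[str], Dict[str, Set[str]]]:
    """Collect the total lexicon and one sublexicon per dialogue act."""
    total: Set[str] = set()
    by_act: Dict[str, Set[str]] = defaultdict(set)
    for act, utterance in corpus:
        words = utterance.split()
        total.update(words)
        by_act[act].update(words)
    return total, dict(by_act)


# Explicit transition table: state -> (dialogue act required, next state).
TRANSITIONS: Dict[str, Tuple[str, str]] = {
    "ask_destination": ("give_destination", "ask_day"),
    "ask_day": ("give_day", "ask_confirmation"),
    "ask_confirmation": ("confirm", "done"),
}


def next_state(state: str, act: str) -> str:
    """Advance only if the condition (the expected act) is satisfied."""
    expected, target = TRANSITIONS[state]
    return target if act == expected else state


TOTAL_LEXICON, SUBLEXICONS = build_sublexicons(CORPUS)
```

A word may belong to several sublexicons (here "please"), so the sublexicons overlap rather than partition the total lexicon; at run time, only the sublexicon for the act expected in the current state needs to be active.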
Human reactions to spoken language dialogue systems have to be
observed on the spot. The ideal approach is therefore to design
systems in close collaboration with professional organisations which
have groups of potential users who are willing to critique
specification documents, participate in early trials, and feed back
useful comments.
EAGLES SWLG SoftEdition, May 1997.