Reading guide


Users of this handbook are likely to have needs which fall into three distinct categories. First, end user organisations will wish to compare existing interactive dialogue systems in order to select the best solution for some particular purpose. Second, system engineers will wish to gauge the performance of existing systems for diagnostic purposes, in order to improve their performance. Third, system designers will wish to learn how to go about designing an interactive dialogue system from scratch. These three needs are addressed below.

Comparing existing systems

It is notoriously difficult to compare existing dialogue systems, because a fair comparison requires exactly the same conditions. What can be compared also depends on the degree of system integration. Dialogue managers (as distinct from dialogue systems) should ideally be evaluated independently of the speech technology (recogniser and synthesiser) and of the application domain. In practice, this is rarely possible: for instance, dialogue prediction and correction procedures are heavily dependent on the performance of the recogniser and its linguistic analysis components. A dialogue system is also rarely completely independent of the application domain or, at least, of a class of applications. Even for the same application, the interface might differ (for air-traffic control training, for instance, there exist different air-traffic simulators with different levels of complexity). The interfaces between the system and the speech technologies on the one hand, and between the system and the application on the other, are not at present general-purpose, so adaptation is always necessary. Complete systems developed for the same application domain could, however, be assessed on corpora of similar complexity corresponding to the same pre-defined scenarios. Because such systems have different internal architectures, with different actual components (which need not coincide with abstract components), only a black box assessment can be envisaged.
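As a rough illustration of a black box assessment over shared scenarios, the sketch below summarises externally observable outcomes for each system. It is only a minimal sketch: the names ScenarioResult, summarise and compare are hypothetical, and the two measures used (task completion rate and mean number of turns) are assumed for the example rather than taken from this handbook.

```python
from dataclasses import dataclass


@dataclass
class ScenarioResult:
    """Externally observable outcome of one scenario run (hypothetical record)."""
    scenario_id: str
    task_completed: bool
    turns: int


def summarise(results):
    """Summarise one system using only black box observations."""
    if not results:
        raise ValueError("no results to summarise")
    n = len(results)
    return {
        # Assumed measures, chosen for illustration only.
        "completion_rate": sum(r.task_completed for r in results) / n,
        "mean_turns": sum(r.turns for r in results) / n,
    }


def compare(results_by_system):
    """Compare systems, insisting that all were run on the same scenarios."""
    if not results_by_system:
        raise ValueError("no systems to compare")
    scenario_sets = {
        name: {r.scenario_id for r in results}
        for name, results in results_by_system.items()
    }
    reference = next(iter(scenario_sets.values()))
    for name, scenarios in scenario_sets.items():
        if scenarios != reference:
            raise ValueError(f"system {name!r} was run on a different scenario set")
    return {name: summarise(results) for name, results in results_by_system.items()}
```

The check on scenario sets reflects the requirement that systems be compared under the same conditions; nothing inside the systems is inspected.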

This chapter aims to make these issues accessible to people who may lack extensive experience in speech and language technology and who wish to compare existing systems.

Improving existing systems

Work on an existing system may aim either to improve its performance (overall or for individual parts of the system), or to render the system more independent of either the speech technologies or the application.

Improving system performance can be achieved by means of an iterative process described below, or by conducting independent tests of single components (the parser or the interpretation module, for instance) or of parameters (using different prediction mechanisms, for instance).
Rendering the system independent of the application (vocabulary, syntax, etc.) requires the development of specialised interfaces and specific tools which allow the user to change the semantic domain easily by describing the associated knowledge (vocabulary, constraint rules, etc.) in an interactive way. This requires a knowledge compiler tool which can compile the data into a form usable by the dialogue system, and coherence verification procedures which ensure that the syntactico-semantic knowledge is compatible with the linguistic information in the linguistic analysis modules of the recogniser and synthesiser.
Rendering the system independent of the speech technology helps to make it portable and not reliant on particular technologies. However, this may not serve the aim of improving system performance: better performance is likely to be obtained if the system is tightly tuned to the recogniser, at the expense of portability.
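One simple form that the coherence verification mentioned above could take is a check that every word in the domain description is also known to the recogniser and the synthesiser. The sketch below assumes that each knowledge source can be reduced to a plain word list; the function name check_coherence and the report keys are hypothetical, and real linguistic analysis modules would need much richer checks (syntax and semantic constraints, for instance).

```python
def check_coherence(domain_vocabulary, recogniser_lexicon, synthesiser_lexicon):
    """Report domain words that are unknown to the speech components.

    All three arguments are iterables of words. Comparison is
    case-insensitive, which is an assumption made for this sketch.
    """
    domain = {word.lower() for word in domain_vocabulary}
    recogniser = {word.lower() for word in recogniser_lexicon}
    synthesiser = {word.lower() for word in synthesiser_lexicon}
    return {
        "missing_from_recogniser": sorted(domain - recogniser),
        "missing_from_synthesiser": sorted(domain - synthesiser),
    }


if __name__ == "__main__":
    report = check_coherence(
        domain_vocabulary=["flight", "Paris", "runway"],
        recogniser_lexicon=["flight", "paris"],
        synthesiser_lexicon=["flight", "paris", "runway"],
    )
    print(report)
```

An empty list under both keys would indicate that, at the vocabulary level at least, the new domain description is compatible with the speech components.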

By outlining a framework for testing, respecifying and enhancing systems, this chapter provides a way into this complex problem.

Designing new systems

A useful background activity when designing new systems is to test existing systems to the limit of their capabilities (maximum number of words, for instance), assigning limit values to their variables (vocabulary size, number of semantic frames or concepts, etc.). The results of the evaluation of systems which deal with similar tasks will also be of considerable relevance here. In addition, designing a new system assumes that several analyses have been carried out beforehand, following the steps below:

Analyse the usual human interaction if such a model exists, preferably with real recorded dialogue.
If necessary, take speech technology performance into account, and specify the common knowledge which will have to be shared (the linguistic information used in the recognition and analysis process, for instance).
Specify the interface with the application (the specific coded language used, such as SQL, for instance), and the common knowledge which will have to be shared between the system and the application (which may be crucial when the application domain knowledge evolves during the dialogue). This will help to design the task model.
Define the knowledge to be included in the user's model (taking classes of users into account, if necessary).
Define the dialogue strategies which are necessary (depth or shifts authorised, corrective and/or predictive procedures when necessary).

These, and other related tasks, are explained in this chapter in the context of system specification and design, along with some detailed procedures for progressing from an initial goal to a final working system (see also Chapter 2).

Section summary

It is important to understand exactly the nature of the technology with which this chapter is concerned, and to master the technical terms which will crop up again and again throughout the chapter. These needs are addressed in Section 13.2, on Interactive dialogue systems.

Interactive dialogue systems are highly complex systems, incorporating many different technologies. Section 13.3, Specification and design, reviews some of the approaches which have been adopted to the problem of specifying and designing such systems. This section concentrates on specifying the functionality of interactive dialogue systems. Detailed recommendations based on practical experiences of workers in the field are included.

Once an interactive dialogue system has been specified, designed and implemented, assessing how well it performs is a non-trivial task. Section 13.4, Evaluation, looks at what makes the problem difficult, describes a framework within which evaluation may take place, and suggests a core set of metrics which can be used for comparing different interactive dialogue systems.