This chapter is about methodology for assessing various
components involved in language engineering:
- how to go about sampling enough speakers to ensure that you
can make claims about how likely the results are to generalise to
a speaker population at large (where population refers to your
target market and will vary from application to application);
- how to compare performance of your recogniser or synthesiser
with others that are on the market;
- how many speakers to include in benchmark tests of speaker
verification systems to appraise performance, and so on.
For these purposes, an understanding of how to analyse your
data statistically is needed.
At other times a user might need to test some very specific
idea: for example, what is going on in a recogniser;
whether some gambit for mimicking other people's voices will allow
impostors to break into a speaker verification device; what the
critical acoustic attributes are that govern the perceptibility of
a message, so that the systems can be improved; or how to set up
experiments with dialogue systems to check whether they will work
adequately for some purpose before committing design engineers to
their implementation. Approaching this latter group of
questions calls for an understanding of the steps involved in
setting up and analysing experiments.
The information provided therefore covers general material
from many diverse areas, both in terms of techniques
(statistics and experimentation) and applications (including the
above examples and many more). Consequently, this chapter cannot hope
to be exhaustive in its coverage, nor can it choose an example for
assessment that is directly applicable to all needs. However,
although there will not be an example for every application
encountered, the methodological tools provided should offer some
idea of how to approach many of the problems that will arise.
The particular examples for illustration have been chosen in
consultation with authors of some of the other chapters. The following
points indicate what the chapter can, and cannot, be expected to provide.
- What will not be presented here are statistical analyses of,
for example, the statistical corpora described in other chapters.
What is presented here is some of the background that will allow
access to the ideas and literature appropriate for tackling the
analyses themselves.
- Statistical development, experimental techniques and
engineering products and techniques are advancing at a rapid pace.
However, statistical and experimental know-how has not featured to
any great extent in language engineering, and statisticians and
experimentalists have not usually drawn on language engineering
examples or considered the engineers' concerns. Thus, many of the
"recommendations" made here are a first attempt to tackle these
issues. There are often many
ways of achieving a particular goal and the limited number of
options that it is possible to consider can only give a narrow
perspective. As these ideas are tried out, other preferred
alternatives will undoubtedly arise. Thus, at least some of the
recommendations are likely to be short-lived.
- It has to be assumed that some of the readers of this
handbook have had practically no previous experience in the formal
methods of statistical analysis. For this reason, it is necessary
to cover basic background in statistics in some detail. On the
other hand, authors of the other chapters have raised statistical
questions that call for
advanced techniques. In a chapter of this size, it is not possible
to cover both topics, or to some extent even one of them, comprehensively
(even introductory texts in statistics usually run to 400 pages).
In the text, we have attempted to cater for the needs of both
groups. For those with no statistical background, a swift overview
of the basics is given with illustrations of how these techniques
apply to language engineering problems. The more advanced topics
are dealt with by pointing out when a topic may be appropriate and
the steps to go through. Since those who will want to
use these more advanced techniques usually already have some
understanding of statistics, they are referred to the cited texts
for the actual computational steps.
- Experimentation also raises problems of scope, depth and
rigour. For instance, it would be straightforward to describe
phonemic labelling of a synthetic speech continuum. This might
include describing a speech continuum and the phoneme labels
required as responses. However, the scope of such an enterprise
would be limited to a narrow branch of speech output systems, which
is not necessarily the most pertinent for language engineering.
Considerable research effort has been expended on going into the
details of how the results of assessments like these relate to those
employing other psychophysical procedures, which statistical
analysis procedures are appropriate, the involvement of memory
processes in perceptual decisions, and so on. Rigour would
dictate that all these need to be considered, as well as alternative
theoretical interpretations of the results of such experiments.
Here, as with many of the procedures outlined, followers of different
theoretical lines stress the importance of different controls in the
assessment procedures. Outlining one procedure as a state-of-the-art
benchmark is not going to satisfy everyone. The alternative, to
present all variants of the procedures and detail their theoretical
ramifications, is clearly not possible in a handbook chapter.
We will give the general requirements behind constructing experiments,
as well as representative illustrations of particular types of
experiments, but these should not be taken to represent universal standards.
EAGLES SWLG SoftEdition, May 1997.