- 1. National and/or academic initiatives:
- In every country where there is a history of speech research,
academics (universities, speech research labs) produce the databases
they need. However, these corpora tend to be specific to the needs of
the producer and remain the property of the producer.
- 2. EEC initiatives:
- EEC projects are a catalyst for production. For both basic-research and
application-oriented databases, they are a way of developing links
between academics and industry. These corpora also tend to remain the
property of the consortium. As the consortium members are both
academic/research and industrial, the needs cover all areas.
- 3. Telecommunication/telephone sectors:
- Major telephone
operators historically have been interested in speech technology, and
most have their own research centers which collect the necessary
corpora for their research activities. Most of these corpora are not
publicly available.
- 4. Industry:
- Companies developing or integrating speech products need
application-oriented databases. This is true both at national and
international level, where foreign languages represent viable market
opportunities. The data that can be provided by industry is varied,
but for the most part unknown, other than that resulting from EEC
initiatives.
The more of these actors are present in a given country, the more
developed its speech community tends to be, with both existing linguistic
resources and a strong demand for additional resources.
As the speech community grows and the number of speech-based
products increases, the amount of resources needed also grows.
a)  need 1.1, 1.2, 1.3     provide 1.1, 1.2 with a), b), c)
                           provide 1.3 with a)
b)  needs variable         provide 1.1, 1.2 with a), d)
                           provide 2 with c)
                           provide 3 with d), a)
c)  need 1.1, 2            provide 1.1 with a), b)
                           provide 2 with b)
d)  need 1.1, 1.2, 2, 3    provide 1), 2) with b)
Having reviewed the existing resources, the presence of
traditional actors, and the ongoing projects, we assess the current
situation as follows. Our starting consideration is that the
under-represented European languages will need at minimum the
resources that better-represented languages already have (at least the
basic resources), and that well-represented languages will need still
more resources. These needed resources will come from ongoing
projects, and further needs can be foreseen through interviews with
relevant actors in the speech research community and industry (ISC).
EAGLES SWLG SoftEdition, May 1997.