
Speech file and associated description file formats

It is now agreed as a standard for SAM speech databases that a speech file contains only speech waveforms, and that an associated description file is generated during the recording session. The two files are thus paired: their names are identical except for the last letter of the extension.

For example, if speaker AA records corpus number BB (a list of six sentences in English), and the next available file number in the recording lab is nnnn, the files produced will be:

AABBnnnn.SES sampled speech

(AABBnnnn.SEL L for Laryngograph)

(AABBnnnn.SE2 for the second channel signal file)

AABBnnnn.SEO associated description file generated automatically during recording.

(O = orthographic time-aligned labelling)
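The naming convention above can be sketched in a few lines of Python. This is only an illustration: the function names sam_file_name and description_file_for are hypothetical, and the sketch assumes that nnnn is a four-digit, zero-padded file number and that the first two extension letters are always "SE", as in the example. Neither assumption is stated by the standard text itself.

```python
# Last letter of the extension, as given in the listing above.
KIND_LETTERS = {
    "S": "sampled speech",
    "L": "laryngograph signal",
    "2": "second channel signal",
    "O": "associated description file (orthographic labelling)",
}


def sam_file_name(speaker: str, corpus: str, number: int, kind: str = "S") -> str:
    """Build a file name such as AABB0001.SES.

    Assumptions (not from the source): 'nnnn' is zero-padded to four
    digits, and the extension prefix is fixed to 'SE'.
    """
    if kind not in KIND_LETTERS:
        raise ValueError(f"unknown file kind: {kind!r}")
    return f"{speaker}{corpus}{number:04d}.SE{kind}"


def description_file_for(signal_file: str) -> str:
    """Return the paired description file name.

    The name is identical except that the last letter of the
    extension is replaced by 'O'.
    """
    stem, dot, ext = signal_file.rpartition(".")
    if not dot or not ext:
        raise ValueError(f"no extension in {signal_file!r}")
    return f"{stem}.{ext[:-1]}O"


if __name__ == "__main__":
    speech = sam_file_name("AA", "BB", 1)
    print(speech, "->", description_file_for(speech))
```

Deriving the description file name from the signal file name, rather than storing both, mirrors the pairing rule: any of the signal files (speech, laryngograph, second channel) maps to the same description file.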

The associated description file has the standard label file format, with a header and a body (see C.3.1 for the header format of label files and C.3.2 for the body of a label file). It contains all the information usually needed by people working on the files without a database management system.

EAGLES SWLG SoftEdition, May 1997.
