TGA: Time Group Analyzer
Speech Annotation Data Mining
Dafydd Gibbon (Universität Bielefeld)
Demo of spoken English, from the Aix-Marsec corpus
(Calculation can take time, depending on server load.)
TGA INPUT, PROCESSING, OUTPUT PARAMETERS
Parameters (1): TextGrid input control
Metadata:
Language: spoken English
Source: Aix-Marsec corpus
Default tier: Syllables
Format: Praat short TextGrid
Tier name:
(change if necessary; max length 20; not needed for CSV formats)
The current version of the TGA online tool analyses sequential and hierarchical temporal relations on single tiers only, not temporal overlap relations between tiers.
Boundary (e.g. pause) symbol:
(typical examples; max length 20; also needed for CSV formats)
More than one pause symbol is permitted; separate symbols with spaces. Delete/change as necessary. If your pause symbol is not in the examples given, enter it.
Do NOT use spaces or empty labels as pause markers. Items with these are deleted in order to permit the analysis of sparse, opportunistic, 'agile' annotations.
Note that the last label on the selected tier in your annotation must be a pause symbol.
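The pause-symbol rules above can be sketched as follows. This is only an illustration under stated assumptions, not the TGA implementation: the function name `clean_intervals`, the `(label, start, end)` tuple layout and the example pause symbols `_ p sil` are hypothetical, and "items with spaces or empty labels" is read here as labels that are empty or contain only whitespace.

```python
def clean_intervals(intervals, pause_symbols):
    """Drop empty or whitespace-only labels, then check the final label.

    intervals: list of (label, start, end) tuples, times in seconds (assumed layout).
    pause_symbols: space-separated pause symbols, as typed into the form field.
    """
    pauses = set(pause_symbols.split())
    # Items with empty labels are deleted, so they can never act as pause markers.
    kept = [(lab.strip(), s, e) for lab, s, e in intervals if lab.strip()]
    # The last remaining label on the tier has to be one of the pause symbols.
    if not kept or kept[-1][0] not in pauses:
        raise ValueError("last label on the tier must be a pause symbol")
    return kept, pauses


tier = [("_", 0.0, 0.2), ("he", 0.2, 0.35), ("", 0.35, 0.4),
        ("llo", 0.4, 0.7), ("_", 0.7, 1.0)]
kept, pauses = clean_intervals(tier, "_ p sil")
print(len(kept))  # 4: the empty-label item was dropped
```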
Parameters (2): Time Group duration difference criteria:
TG type:
interpausal group
deceleration
(increasing duration)
acceleration
(decreasing duration)
Local threshold:
ms (minimal duration distance recognized, e.g. 0 ... 300 ms for syllables)
Used for local Duration Difference Token extraction and Time Tree parsing.
DDT symbols:
Longer:
(1 char)
Shorter:
(1 char)
Same:
(1 char)
(Symbols for the local-threshold-dependent Duration Difference Tokens)
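One way to picture how the local threshold and the three one-character symbols interact is the sketch below. It is an assumption-laden illustration, not the tool's code: the function name, the comparison direction (each item against the item before it), the strict inequalities and the example symbols `/`, `\` and `=` are all hypothetical choices.

```python
def duration_difference_tokens(durations_ms, threshold_ms,
                               longer="/", shorter="\\", same="="):
    """Map each adjacent pair of durations to a one-character token.

    A difference counts as 'longer' or 'shorter' only when its magnitude
    exceeds the local threshold; otherwise the pair is tokenised as 'same'.
    """
    tokens = []
    for prev, curr in zip(durations_ms, durations_ms[1:]):
        diff = curr - prev  # assumed direction: current minus preceding
        if diff > threshold_ms:
            tokens.append(longer)
        elif diff < -threshold_ms:
            tokens.append(shorter)
        else:
            tokens.append(same)
    return "".join(tokens)


durs = [120, 310, 300, 90, 95]
print(duration_difference_tokens(durs, 50))  # /=\=
```

With a threshold of 0 every non-zero difference becomes a 'longer' or 'shorter' token; raising the threshold turns small differences into 'same'.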
TT type:
(quasi-)iambic TTgt
(quasi-)trochaic TTlt
show all TT
(quasi-)iambic TTgte
(quasi-)trochaic TTlte
do not show TT
Global TG threshold range:
...
ms (minimal duration difference for acceleration/deceleration TG types)
Ranges > 30 are not permitted because of possible server overload.
Global threshold is ignored with the 'pausegroup' criterion.
Experiment with values from 0 to 500 (negative values are permitted).
Equal range boundaries are adjusted so that the range is 1, not zero; if necessary, the values are swapped to ensure 'low before high'.
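The range rules above can be sketched as a small normalisation step. This is a hedged illustration only: the function name and `MAX_RANGE` constant are hypothetical, and raising the upper boundary (rather than lowering the lower one) when the two values are equal is an assumption.

```python
MAX_RANGE = 30  # widest range the form accepts, per the note above


def normalize_threshold_range(low, high):
    """Return a (low, high) pair that satisfies the stated range rules."""
    if low > high:
        low, high = high, low          # ensure 'low before high'
    if low == high:
        high = low + 1                 # assumed: widen upwards to a range of 1
    if high - low > MAX_RANGE:
        raise ValueError("ranges > 30 are not permitted")
    return low, high


print(normalize_threshold_range(50, 50))   # (50, 51)
print(normalize_threshold_range(80, 60))   # (60, 80)
```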
Minimum TG length:
(generally >2, to capture a 'minimal rhythm')
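For the 'interpausal group' TG type, the combination of pause boundaries and a minimum TG length might look like the sketch below. The function name, the tuple layout and the use of "at least `min_length` items" as the length test are assumptions made for illustration; the acceleration and deceleration criteria are not covered here.

```python
def interpausal_groups(intervals, pauses, min_length=3):
    """Split a tier into runs of non-pause items delimited by pause labels.

    intervals: list of (label, start, end) tuples (assumed layout).
    pauses: set of pause symbols.
    Groups shorter than min_length are discarded.
    """
    groups, current = [], []
    for label, start, end in intervals:
        if label in pauses:
            if len(current) >= min_length:
                groups.append(current)
            current = []
        else:
            current.append((label, start, end))
    # Items after the final pause are never closed off, which is one reason
    # the last label on the tier has to be a pause symbol.
    return groups


tier = [("_", 0.0, 0.2), ("a", 0.2, 0.4), ("b", 0.4, 0.5), ("c", 0.5, 0.8),
        ("_", 0.8, 1.0), ("d", 1.0, 1.2), ("_", 1.2, 1.5)]
groups = interpausal_groups(tier, {"_"}, min_length=3)
print([label for label, _, _ in groups[0]])  # ['a', 'b', 'c']
```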
Parameters (3): Time Group visualization and output control:
Graph output smoothing:
Median (outliers):
Mean (moving average):
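As a rough picture of the two smoothing options, the sketch below applies a median filter (which suppresses outliers) and a moving average over a sliding window. It assumes that each form field holds a window size and that the window shrinks at the edges of the sequence; neither detail is stated on this page, and the helper name `sliding` is hypothetical.

```python
from statistics import mean, median


def sliding(values, window, func):
    """Apply func over a centred window, shrinking the window at the edges."""
    half = window // 2
    return [func(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]


durs = [100, 110, 900, 120, 130]      # 900 ms is an outlier
print(sliding(durs, 3, median))       # [105.0, 110, 120, 130, 125.0]
print(sliding(durs, 3, mean))         # the outlier is spread out, not removed
```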
Rhythm visualization:
Lag (AMDF):
Clip (peak picker):
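The 'Lag (AMDF)' field refers to the Average Magnitude Difference Function. A generic textbook version applied to a duration sequence is sketched below; how the tool itself uses the lag value, and how the 'Clip' peak picker works, are not specified on this page, so the function name and the `max_lag` parameter are assumptions.

```python
def amdf(values, max_lag):
    """Average magnitude difference of a sequence for lags 1..max_lag.

    For each lag k, this is the mean of |x[i] - x[i + k]| over all valid i.
    A low value at lag k suggests the sequence repeats roughly every k items.
    """
    n = len(values)
    result = []
    for lag in range(1, max_lag + 1):
        if lag >= n:
            break  # no pairs left to compare at this lag
        diffs = [abs(values[i] - values[i + lag]) for i in range(n - lag)]
        result.append(sum(diffs) / len(diffs))
    return result


durs = [100, 200, 100, 200, 100, 200]  # strictly alternating short-long
print(amdf(durs, 3))                   # [100.0, 0.0, 100.0]
```

The dip to 0.0 at lag 2 reflects the two-item alternation in the example data.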
Print text?
no
yes
TG element info?
no
yes
TG detail?
no
yes
n-grams?
no
yes
Time Trees?
no
yes
CSV output?
none
default csv (compatible with TGA input)
tier,pause,label,start,end,dur
label,start,end (compatible with TGA input)
labels
label unigram freq list
label digram freq list
label trigram freq list
start timestamps
end timestamps
durations (syllables)
durations (syllables + pauses)
durations (pauses)
z-scores (syllables + pauses)
z-scores (syllables)
z-scores (pauses)
Δdur tokens
Time Trees
all TG lists
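Two of the derived outputs in the list above, label n-gram frequency lists and duration z-scores, can be illustrated with the standard library. This is a sketch under assumptions: the function names are hypothetical, and the z-scores use the population standard deviation, whereas the tool may use the sample standard deviation.

```python
from collections import Counter
from statistics import mean, pstdev


def ngram_counts(labels, n):
    """Count overlapping n-grams (as tuples) in a sequence of labels."""
    return Counter(zip(*(labels[i:] for i in range(n))))


def z_scores(durations):
    """Standardise durations: (d - mean) / population standard deviation."""
    mu, sigma = mean(durations), pstdev(durations)
    if sigma == 0:
        return [0.0] * len(durations)  # all durations identical
    return [(d - mu) / sigma for d in durations]


labels = ["da", "di", "da", "di", "da"]
print(ngram_counts(labels, 2)[("da", "di")])  # 2
print(z_scores([100, 200, 300]))
```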
All outputs (except csv):
yes
no
TGA INPUT FIELD
Note:
1. Test with the preset TextGrid example, then delete it and copy and paste your own TextGrid or CSV.
2. Input timestamp values in all formats are in seconds with '.' (dot) decimal point (following Praat TextGrid conventions), and neither milliseconds nor seconds with ',' (comma) decimal point. Output timestamp values are in milliseconds.
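The unit convention in note 2 amounts to the conversion sketched below. The function name is hypothetical and the explicit rejection of comma decimals is an assumption about how invalid input might be handled, not a description of the tool's behaviour.

```python
def seconds_to_ms(text):
    """Convert a timestamp string in seconds (dot decimal) to milliseconds."""
    if "," in text:
        raise ValueError("use '.' as the decimal point, not ','")
    return float(text) * 1000.0


print(seconds_to_ms("0.25"))  # 250.0
print(seconds_to_ms("1.5"))   # 1500.0
```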
File type = "ooTextFile short"