TGA: Time Group Analyser

Dafydd Gibbon (Universität Bielefeld)

CONTENTS
  1. TGA GENERAL INFORMATION
  2. TGA INPUT PARAMETERS
  3. TGA INPUT FIELD

1. TGA GENERAL INFORMATION

V 1.00 2012-07-09
V 3.03 2013-03-30
V 3.04 2014-08-11

Papers using TGA methodology for analysis of rhythm, duration sequences and other timing patterns (currently Mandarin Chinese; L1 and L2 English; Polish):
  1. Yu, Jue and Gibbon, Dafydd, Criteria for database and tool design for speech timing analysis with special reference to Mandarin, Oriental COCOSDA 2012 (cf. IEEEexplore Conf ID 21048)
  2. Gibbon, Dafydd, TGA: a web tool for Time Group Analysis, TRASP 2013 (poster)
  3. Yu, Jue, Timing analysis with the help of SPPAS and TGA tools, TRASP 2013 (poster)
  4. Klessa, Katarzyna, Maciej Karpinski and Agnieszka Wagner, Annotation Pro: a new software tool for annotation of linguistic and paralinguistic features, TRASP 2013
  5. Klessa, Katarzyna and Dafydd Gibbon, Annotation Pro+TGA: automation of speech timing analysis, LREC 2013.
  6. Yu, Jue, Dafydd Gibbon and Katarzyna Klessa, Computational annotation-mining of syllable durations in speech varieties, Speech Prosody 7, 2014.

Grateful acknowledgments to Jue Yu (Zhejiang University, Hangzhou) for evaluation and suggestions, and for the Mandarin Chinese annotation data in the demo below.

A Python CGI web tool.


Note:
  1. The duration visualisation graphics are not rendered correctly by Firefox, which has a bug in its HTML rendering. The vertical bars are incorrectly shown as circles. However, the information displayed by the circles is the same.
  2. This tool will DEFINITELY NOT work with most TextGrid files, because the tool is designed to handle ONLY one particular small set of ASCII symbols and (obviously) only interval tiers are handled.
  3. Within the above constraints, long or short TextGrid Interval Tier formats are handled. The tool was designed for syllable tiers, but in principle any tier can be handled. You will be lucky not to get arbitrary error output if you try anything else. Response time depends on TextGrid tier length and (for deceleration and acceleration) global threshold range. Be patient!
Automatic recognition heuristic for input data formats (examines only the first line; not foolproof):
  1. Praat TextGrid full and short formats (specified tier picked out of arbitrarily many tiers)
  2. Single-tier CSV table (do not mix separators in the same data set):
      row := label sep starttime sep endtime [ sep duration ]
      sep := TAB | SP | "," | ";" | ":"
  3. Timestamp values in all formats are in seconds with dot decimal point (not milliseconds, and not comma), following Praat TextGrid conventions.
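
The CSV row grammar above can be illustrated with a minimal Python sketch. This is not the tool's own code: the function names `detect_separator` and `parse_rows` are hypothetical, and the order in which candidate separators are tried is an assumption. Like the heuristic described above, it looks only at the first line to choose the separator.

```python
# Candidate separators from the grammar above; the trial order is an assumption.
SEPARATORS = ["\t", ",", ";", ":", " "]

def detect_separator(first_line):
    """Guess the separator from the first line only (not foolproof)."""
    for sep in SEPARATORS:
        if sep in first_line:
            return sep
    raise ValueError("no known separator found in first line")

def parse_rows(text):
    """Parse rows of the form: label sep starttime sep endtime [sep duration].

    Returns a list of (label, start_s, end_s) tuples with times in seconds.
    The optional duration field is ignored, since it follows from the times.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    sep = detect_separator(lines[0])
    rows = []
    for ln in lines:
        fields = ln.split(sep)
        if len(fields) not in (3, 4):
            raise ValueError("unexpected number of fields: %r" % ln)
        # float() expects a dot decimal point, as required for TGA input.
        rows.append((fields[0], float(fields[1]), float(fields[2])))
    return rows

rows = parse_rows("ni;0.00;0.25\nhao;0.25;0.50;0.25\n")
print(rows)
```

Because the separator is fixed after the first line, a data set that mixes separators will fail the field-count check, which is one reason not to mix them.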

2. TGA INPUT PARAMETERS

Parameters (1): TextGrid input control
Tier name: (change if necessary; max length 20; not needed for CSV formats)
Boundary (e.g. pause) symbol: (typical examples; max length 20; also needed for CSV formats)
More than one pause symbol is permitted; separate symbols with spaces. Delete/change as necessary. If your pause symbol is not in the examples given, enter it.
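
As a rough illustration of how boundary symbols can delimit Time Groups under the 'pausegroup' criterion, here is a minimal Python sketch. It is an assumption about the general idea, not the tool's implementation: the function name `pause_groups` and the symbols `_` and `p` are hypothetical, and the sketch assumes that boundary intervals are dropped and that a group is kept only if it is longer than the 'Min TG length' parameter listed under Parameters (2).

```python
def pause_groups(intervals, pause_symbols, min_length=2):
    """Split (label, start, end) intervals into groups at boundary labels.

    Boundary intervals themselves are discarded (an assumption), and only
    groups with more than min_length elements are returned.
    """
    groups, current = [], []
    for label, start, end in intervals:
        if label in pause_symbols:
            if len(current) > min_length:
                groups.append(current)
            current = []
        else:
            current.append((label, start, end))
    if len(current) > min_length:
        groups.append(current)
    return groups

# Several boundary symbols are entered separated by spaces, as in the form field.
symbols = "_ p".split()
intervals = [("a", 0.0, 0.2), ("b", 0.2, 0.4), ("c", 0.4, 0.7),
             ("_", 0.7, 1.0), ("d", 1.0, 1.2), ("e", 1.2, 1.3)]
for group in pause_groups(intervals, symbols):
    print([label for label, _, _ in group])
```

In this example the second group (d, e) is dropped because it has only two elements.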

Parameters (2): Time Group duration difference criteria:
TG criterion: pausegroup | deceleration (increasing) | acceleration (decreasing)

Local threshold: ms (for syllable annotations, try values less than common syllable lengths, e.g. 0 ... 300 ms)
Used for local pattern extraction and TimeTree parsing.
Local pattern symbols: Longer: (1 char) Shorter: (1 char) Same: (1 char)
Time Tree criterion:
(quasi-)iambic TTgt | (quasi-)trochaic TTlt | show all TT
(quasi-)iambic TTgte | (quasi-)trochaic TTlte | do not show TT
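
The local threshold and the three pattern symbols can be pictured with the following minimal Python sketch. It is an illustration under stated assumptions, not the tool's algorithm: the function name `local_pattern` and the default symbols `+`, `-` and `=` are hypothetical, and the sketch assumes that each duration is compared with its immediate predecessor and that differences within the threshold count as 'same'.

```python
def local_pattern(durations_ms, threshold_ms, longer="+", shorter="-", same="="):
    """Return one symbol per adjacent pair of durations (in milliseconds).

    A pair is 'longer' if the duration grows by more than the threshold,
    'shorter' if it shrinks by more than the threshold, and 'same' otherwise.
    """
    symbols = []
    for prev, cur in zip(durations_ms, durations_ms[1:]):
        diff = cur - prev
        if diff > threshold_ms:
            symbols.append(longer)
        elif diff < -threshold_ms:
            symbols.append(shorter)
        else:
            symbols.append(same)
    return "".join(symbols)

# Syllable durations 100, 180, 170, 90 ms with a 50 ms local threshold:
# the differences are +80, -10 and -80 ms.
print(local_pattern([100, 180, 170, 90], 50))  # prints "+=-"
```

With a threshold of 0 every non-zero difference is marked as longer or shorter, so larger thresholds yield more 'same' symbols.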

Global TG threshold range: ... ms (minimal duration difference)
Ranges > 30 are not permitted because of possible server overload.
Global threshold is ignored with the 'pausegroup' criterion.
Experiment with values from 0 to 500 (negative values are permitted).
Equal range boundaries are adjusted to give a range of 1 rather than 0; if necessary, the two values are swapped to ensure 'low before high'.
Min TG length: > (generally >2, as 'minimal rhythm')
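
The range rules above (swap to 'low before high', widen equal boundaries to a range of 1, refuse ranges above 30) can be sketched in Python as follows. This is a hedged illustration only: the function name `normalise_threshold_range` is hypothetical, widening by raising the upper boundary is an assumption, and raising an error for an over-wide range is an assumption about behaviour that the description leaves open.

```python
MAX_RANGE = 30  # ranges > 30 are not permitted

def normalise_threshold_range(low, high):
    """Return a (low, high) pair that satisfies the range rules."""
    if low > high:
        low, high = high, low          # ensure 'low before high'
    if low == high:
        high = low + 1                 # range of 1, not null (assumed: raise upper bound)
    if high - low > MAX_RANGE:
        raise ValueError("threshold range wider than %d is not permitted" % MAX_RANGE)
    return low, high

print(normalise_threshold_range(50, 50))
print(normalise_threshold_range(80, 60))
print(normalise_threshold_range(-10, 10))
```

Negative values pass through unchanged, consistent with the note that negative thresholds are permitted.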

Parameters (3): Time Group output control:
Print text? no | yes
TG element info? no | yes
TG detail? no | yes

n-grams? no | yes
Time Trees? no | yes
CSV output?

All outputs (except csv): yes | no

TGA PROCESS:
(Calculation can take time, depending on server load.)

3. TGA INPUT FIELD

Note:
1. Test with the preset TextGrid example, then delete it and paste in your own TextGrid or CSV data.
2. Input timestamp values in all formats are in seconds with '.' (dot) decimal point (following Praat TextGrid conventions), and neither milliseconds nor seconds with ',' (comma) decimal point. Output timestamp values are in milliseconds.

2012-07-09 V 1.0 Basic Syllable and Time Group parser with deceleration and acceleration criteria
2012-08-15 V 1.1 Bugfix and enriched output
2013-03-10 V 2.0 Cycle through threshold range
2013-03-12 V 2.1 Picks specified tier out of unedited TextGrids with arbitrary number of tiers
2013-03-13 V 2.2 Introduction of pause group parsing option
2013-03-13 V 2.3 Additional quantitative output; syllable tier symbol to be input by user
2013-03-21 V 2.4 Further modularisation, more input options, no functional difference
2013-03-23 V 2.5 Some error proofing. Use of numpy.
2013-03-23 V 2.6 Local longer-shorter-equal pattern visualisation
2013-03-23 V 2.7 Pattern input options
2013-03-30 V 2.8 Pattern testing
2013-03-30 V 2.9 CSV input added; parse summary information extended
2013-03-30 V 2.10 TimeTrees added.
2013-03-30 V 3.0 SD added to TimeGroups; various correlations; blob duration visualisation.
2013-03-30 V 3.01 ndiff analysis added to TimeGroup patterns.
2013-03-30 V 3.01 n-gram and time tree analysis; various format and output additions.
2013-03-30 V 3.02 some modularisation; various format and output additions.
To do: CV set input etc.; sep calc fr pr

Created: Tuesday, July 10, 2012 7:25:09 AM CEST
Last Modified: Tuesday, April 16, 2013 7:03:42 AM CEST
D. Gibbon