LangDoc2015 course notes: Dafydd Gibbon
Course plan
- Basics: Text linguistic foundations of language documentation
- Basics: Spoken language documentation
- Speech recording, transcription and annotation
- Project report: Documentation of a language
1. Basics: Text linguistic foundations of language documentation
Course notes:
Assignment: What is the text grammatical structure of a dictionary?
Document creation: stylesheet templates and information:
You can use the text grammar principles for document formation to design your project report.
Assignment: Find out how to create a table of contents automatically.
(I have removed the old LREC conference stylesheet because it was inconsistent.)
Illustration of language group documentation:
- DistGraph: an online tool for difference analysis and visualisation
In response to a suggestion during Monday's class, in addition to the language pairs I have now implemented the lists of differences. Thanks for giving me the idea!
2. Basics: Spoken language documentation
Practical introduction to Praat
- See below for the course materials, with a link to the Praat website. Download and save the Praat for Windows ZIP file on your Desktop. Extract the contents of the ZIP file. You will then have a Praat folder on your Desktop, containing the Praat software.
- Download the audio file below ("The tiger and the mouse were walking in a field") and save it in the Praat folder.
- Run Praat, and open the audio file in Praat.
- The first part of the second section of the course will be concerned with:
- The information about the speech signal which Praat provides.
- Annotation of the speech signal and understanding TextGrid files.
- The next part of this section of the course will be concerned with automatically analysing TextGrid annotation files ('annotation mining') using the TGA online tool (see below).
Assignment:
- Find the maximum, minimum and average syllable duration in the recording "The tiger and the mouse were walking in a field".
- In order to do this:
- Annotate all the syllables in the recording on a tier called "Syllable".
- Save the TextGrid annotation file.
- Open the TextGrid annotation file in a text editor.
- Open the TGA online tool in a browser.
- Go to the "read-aloud Tem" demo. Delete the existing demo TextGrid and copy and paste your own TextGrid.
- Press the "TGA CALCULATE OUTPUT" button and examine the output to find the answer to the assignment question.
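For comparison with the TGA output, the same three figures can be computed directly from the TextGrid file. The following is a minimal sketch, not the method used by the TGA tool. It assumes that the TextGrid was saved in Praat's default long text format, that the tier is named "Syllable", and that unlabelled intervals are pauses to be skipped. The function names and the sample TextGrid are invented for illustration; durations are in seconds.

```python
import re
from statistics import mean

# One interval entry in Praat's long TextGrid text format.
INTERVAL = re.compile(
    r'intervals \[\d+\]:\s*'
    r'xmin = ([0-9.eE+-]+)\s*'
    r'xmax = ([0-9.eE+-]+)\s*'
    r'text = "([^"]*)"'
)

def syllable_durations(textgrid_text, tier_name="Syllable"):
    """Return (label, duration) pairs for the labelled intervals on one tier."""
    # Each tier starts with a header such as "item [1]:".
    tier_blocks = re.split(r'item \[\d+\]:', textgrid_text)[1:]
    for block in tier_blocks:
        name = re.search(r'name = "([^"]*)"', block)
        if name and name.group(1) == tier_name:
            return [
                (label, float(xmax) - float(xmin))
                for xmin, xmax, label in INTERVAL.findall(block)
                if label.strip()  # assumption: empty labels are pauses
            ]
    raise ValueError(f"no tier named {tier_name!r}")

def summarise(durations):
    """Return (minimum, maximum, mean) of the durations."""
    values = [duration for _, duration in durations]
    return min(values), max(values), mean(values)

# Hypothetical annotation, for illustration only (not the course recording).
SAMPLE_TEXTGRID = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 0.85
tiers? <exists>
size = 1
item []:
    item [1]:
        class = "IntervalTier"
        name = "Syllable"
        xmin = 0
        xmax = 0.85
        intervals: size = 4
        intervals [1]:
            xmin = 0
            xmax = 0.25
            text = ""
        intervals [2]:
            xmin = 0.25
            xmax = 0.40
            text = "D@"
        intervals [3]:
            xmin = 0.40
            xmax = 0.70
            text = "taI"
        intervals [4]:
            xmin = 0.70
            xmax = 0.85
            text = "g@"
'''

shortest, longest, average = summarise(syllable_durations(SAMPLE_TEXTGRID))
print(f"min = {shortest:.3f} s, max = {longest:.3f} s, mean = {average:.3f} s")
```

To try this on your own annotation, read the saved TextGrid file into a string and pass it to syllable_durations in place of SAMPLE_TEXTGRID.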
Course notes:
- 02-BasicsOfSpokenLangDoc.odp (89K)
- 02-BasicsOfSpokenLangDoc.pdf (156K)
Materials:
- Short Phonetics Course (Based on the Abidjan May 2014 course.)
- Very short WAV file: tiger
- Short WAV file: "The tiger and the mouse were walking in a field."
Speech documentation tools:
- Praat: doing phonetics by computer
- Audacity: speech editor
- Time Group Analyzer: an online tool for duration analysis
- Recommended for annotation with Praat:
SAMPA and X-SAMPA codes, the standard keyboard-friendly encoding of the IPA, from:
Gibbon D. et al. Handbook of Multimodal and Spoken Language Systems. Dordrecht: Kluwer.
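To illustrate what "keyboard-friendly" means here, the following small sketch converts a SAMPA-style label into IPA characters. The table is deliberately partial and covers only a few well-known single-character codes; multi-character X-SAMPA symbols are not handled, and the function name is invented for illustration. Consult the full tables in the reference above before relying on any mapping.

```python
# Partial SAMPA-to-IPA table (single-character codes only).
SAMPA_TO_IPA = str.maketrans({
    "T": "θ",  # voiceless dental fricative
    "D": "ð",  # voiced dental fricative
    "S": "ʃ",  # voiceless postalveolar fricative
    "Z": "ʒ",  # voiced postalveolar fricative
    "N": "ŋ",  # velar nasal
    "@": "ə",  # schwa
    "{": "æ",  # near-open front unrounded vowel
    "I": "ɪ",  # near-close front unrounded vowel
    "U": "ʊ",  # near-close back rounded vowel
    ":": "ː",  # length mark
})

def sampa_to_ipa(label):
    """Replace the codes in the table; all other characters pass through."""
    return label.translate(SAMPA_TO_IPA)

print(sampa_to_ipa("D@ taIg@"))
```

Because codes such as "D@" use only plain ASCII characters, they can be typed directly into Praat interval labels on any keyboard.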
3. Project report: Documentation of a language
Completion of your language documentation report.
Various sources
- Dialectometry:
D. Gibbon: Visualisation of distances between languages (ODT)
D. Gibbon: Visualisation of distances between languages (PDF)
- Connell, B., F. Ahoua, D. Gibbon. 2002. Illustrations of the IPA: Ega. JIPA.
- Krauwer's BLARK concept (Basic Language Resource Kit)
- Ega documentation archive
D. Gibbon, Monday, 2 November 2015, 07:27:17 CET