Full IPA alphabet encoding


At the 1989 IPA convention in Kiel a working group was set up to define a coding scheme for the IPA symbols.

The IPA numbers are a radical approach to the problem of representing the IPA character set on computers. Whereas the other approaches maintain a mnemonic relationship between the symbols used in the alphabet and the IPA characters, the IPA numbers are arbitrary and have no obvious relationship with the characters they represent.

According to this scheme, each IPA symbol is uniquely identified through an IPA name and an IPA number in the range of 100 to 999. The range of numbers is divided into classes, e.g. 1nn for consonants, 3nn for vowels, etc. Space is reserved for future extensions and for private use, and even symbols no longer in use are included for backward compatibility (Esling 1990). The IPA encoding has undergone two major revisions, with the current version being that of 1993 (Esling and Gaylord 1993). The IPA code table has become a section of its own in the Unicode and ISO 10646 (UCS-2) standards.
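The division of the number range into classes can be illustrated with a minimal Python sketch. Only the two classes named above (1nn for consonants, 3nn for vowels) are filled in; the function name `ipa_class` and the label "other" for the remaining classes are assumptions made for illustration and are not part of the IPA scheme.

```python
# Classes named in the text; the remaining leading digits are left
# unmapped in this sketch and reported as "other".
IPA_CLASSES = {1: "consonant", 3: "vowel"}


def ipa_class(number: int) -> str:
    """Return the class of an IPA number based on its leading digit."""
    if not 100 <= number <= 999:
        raise ValueError(f"IPA numbers lie in the range 100 to 999, got {number}")
    return IPA_CLASSES.get(number // 100, "other")


if __name__ == "__main__":
    for n in (101, 301, 501):
        print(n, ipa_class(n))
```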

Tables A.5 to A.15 (see below) contain the complete IPA character set ordered by IPA number. They were kindly provided by John Esling (pdb@uvvm.uvic.ca).

The IPA code table is an efficient means of storing phonetic data on computers, while being independent of the character encoding system of any particular computer platform. Hence the IPA table is often used as a reference standard, much as ISO 8859 is for ASCII-based character sets. Subsets of the IPA, e.g. the national SAM phonetic alphabets, are mapped to IPA symbols, so that one subset can be translated into another via the IPA table. IPA numbers have also become a data exchange format for phonetic data, since they can be represented in 7-bit ASCII and can thus easily be distributed electronically.
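The use of the IPA table as a pivot between subsets can be sketched as follows. The two subset tables are hypothetical placeholders invented for this example and do not reproduce any actual SAM or other alphabet; only the IPA numbers 101 (p), 301 (i) and 322 (schwa) are taken from the IPA number chart. The function names are likewise assumptions.

```python
# Hypothetical subset tables: the symbols are invented placeholders,
# not real SAM assignments. The IPA numbers act as the pivot.
SUBSET_A_TO_IPA = {"p": 101, "i": 301, "@": 322}
SUBSET_B_TO_IPA = {"p": 101, "iy": 301, "ax": 322}


def invert(table):
    """Turn a symbol -> IPA number table into IPA number -> symbol."""
    return {number: symbol for symbol, number in table.items()}


def translate(symbols, source_table, target_table):
    """Translate symbols from one subset to another via IPA numbers."""
    ipa_to_target = invert(target_table)
    return [ipa_to_target[source_table[s]] for s in symbols]


def to_exchange_format(symbols, source_table):
    """Serialise symbols as space-separated IPA numbers (7-bit ASCII)."""
    return " ".join(str(source_table[s]) for s in symbols)


if __name__ == "__main__":
    word = ["p", "i", "@"]
    print(translate(word, SUBSET_A_TO_IPA, SUBSET_B_TO_IPA))
    print(to_exchange_format(word, SUBSET_A_TO_IPA))
```

With a pivot table, each subset needs only one mapping to the IPA numbers, rather than a separate mapping to every other subset.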

The major problem with the IPA code table is that it is not directly accessible by software. In text processing, the font mechanism is often used to substitute phonetic glyphs for Latin glyphs, which leads to the well-known problems of mixed-text documents. In databases and communication software, IPA numbers are used as an internal representation, but for display and editing purposes this representation has to be mapped to the code tables of the display or the text processing software.
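The mapping step from the internal representation to a display code table might look like the following minimal sketch, which assumes Unicode as the display encoding. The table covers only three symbols, and the names `IPA_NUMBER_TO_UNICODE` and `render`, as well as the "?" fallback for unmapped numbers, are assumptions made for illustration.

```python
# Partial, illustrative table from IPA numbers to Unicode characters.
IPA_NUMBER_TO_UNICODE = {
    101: "\u0070",  # p
    301: "\u0069",  # i
    322: "\u0259",  # schwa
}


def render(numbers, table=IPA_NUMBER_TO_UNICODE, missing="?"):
    """Map a sequence of IPA numbers to a display string.

    Numbers absent from the table are replaced by `missing`, so gaps
    in the display code table remain visible instead of being dropped.
    """
    return "".join(table.get(n, missing) for n in numbers)


if __name__ == "__main__":
    print(render([101, 301, 322]))
```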

There have been various proposals for using ASCII or ISO 8859 code tables to represent the IPA symbols. These include the systems by

John Wells of University College London,
Evan Kirshenbaum of Hewlett-Packard Laboratories,
David Branner of the University of Washington, and
James Hieronymus of AT&T Bell Laboratories - the Worldbet system.

The system proposed by John Wells (see also Appendix B) is described in a draft report at

http://www.phon.ucl.ac.uk/home/sampa/x-sampa.htm

Kirshenbaum's system can be found at

http://alfred1.u.washington.edu:8080/~dillon/ipaascii.html

and a Worldbet description can be retrieved from

ftp://speech.cse.ogi.edu/pub/docs/worldbet.ps.Z

(addresses checked in September 1995).

EAGLES SWLG SoftEdition, May 1997.