The Unicode standard

The Unicode standard is a proposal for a universal two-byte code table for all major written languages. It uses only accepted and official encoding standards for each alphabet to avoid compatibility and acceptance problems. A non-profit organization has been founded to promote the Unicode standard:

Unicode, Inc
P.O. Box 700519
San Jose, CA 95170-0519

Tel: +1 408 777 5870
Fax: +1 408 777 5082

In the Unicode standard, each glyph is stored only once, and font modifications do not change the essential shape of a glyph. Each glyph has a unique name, number, and content. As a consequence, the notion of a mixed-text document no longer exists: text in several scripts is encoded with one and the same code table.
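As a small illustration of text from several scripts sharing one code table, the following Python sketch encodes a Latin, a Greek, and a Cyrillic letter in a single string. It assumes that Python's `utf-16-be` codec can stand in for the two-byte table described here, which holds for the three sample characters chosen; the sample string and variable names are illustrative only.

```python
# One string mixing a Latin, a Greek, and a Cyrillic letter.
# Escapes are used so the example does not depend on the file's own encoding.
text = "a\u03b2\u0436"

# Assumption: the utf-16-be codec stands in for the two-byte code table;
# each of these three characters occupies exactly two bytes.
encoded = text.encode("utf-16-be")

# Print each character's number in the code table and its two bytes.
for index, ch in enumerate(text):
    pair = encoded[2 * index : 2 * index + 2]
    print(f"U+{ord(ch):04X}  bytes={pair.hex()}")
```

No switching between code tables is needed inside the string: every character is identified by its number alone.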

The code table is divided into sections. The first 256 entries are identical to ISO 8859-1 for compatibility reasons; the other sections contain mathematical symbols, phonetic symbols, non-Latin scripts, vendor-specific code tables, and the ideographic alphabets. A rather large section is not standardised and is reserved for proprietary code tables.
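The compatibility with ISO 8859-1 can be checked with a short Python sketch: decoding every byte value as ISO 8859-1 should yield the entry with the same number. The last lines look at one position in the reserved section; U+E000 is taken from the Unicode data shipped with current Python versions and is an assumption here, since the boundaries of that section in version 1.1 are not given in this text.

```python
import unicodedata

# Every byte value 0..255, decoded as ISO 8859-1 (Latin-1).
latin1_bytes = bytes(range(256))
decoded = latin1_bytes.decode("iso-8859-1")

# Each byte should map to the code table entry with the same number.
same_numbers = all(ord(ch) == b for ch, b in zip(decoded, latin1_bytes))
print(same_numbers)

# Assumption: U+E000 is a sample position in the non-standardised section,
# according to the Unicode data bundled with current Python versions.
# Category "Co" means "private use".
print(unicodedata.category("\ue000"))
```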

Version 1.1 of the Unicode standard has now been published. Some 5400 characters from ISO 10646 were added to the code table, and some characters were moved to new locations. Various vendors have announced support for the Unicode standard; a list of applications that comply with it is available from the Unicode, Inc. WWW pages.

EAGLES SWLG SoftEdition, May 1997.