Applications of morphology


Morphological structuring is useful for the following tasks:

There are two main ways of structuring words internally into word subunits (word constituents):

  1. SEMANTIC ORIENTATION. On morphological grounds, word forms may be decomposed into smaller meaningful units, the smallest of which are morphs, the phonological forms of morphemes; an intermediate unit between the morph and the word form is the stem.
  2. PHONOLOGICAL ORIENTATION. On phonological grounds, word forms may be decomposed into smaller pronunciation units, the smallest of which are phonemes; an intermediate pronunciation unit is the syllable.

It is important to note that decomposition into syllables is not isomorphic with decomposition into morphs. For example, phonological has the syllable structure /fɒ . nə . lɒ . dʒɪ . kəl/ and the morph structure /fɒn + ə + lɒdʒɪk + əl/, which are quite different from each other.
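The mismatch can be made concrete by comparing where the two decompositions place their boundaries. The following is only an illustrative sketch: the helper name `boundaries` is hypothetical, the transcription is an assumed British English pronunciation of phonological, and offsets are counted in transcription characters rather than in phonemes.

```python
# Illustrative sketch: compare boundary positions of two segmentations
# of the same transcription. The helper name is hypothetical.

def boundaries(units):
    """Return the set of internal boundary offsets (in characters)."""
    offsets, pos = set(), 0
    for unit in units[:-1]:  # no boundary is recorded after the last unit
        pos += len(unit)
        offsets.add(pos)
    return offsets

# Assumed transcription of "phonological", segmented in two ways.
syllables = ["fɒ", "nə", "lɒ", "dʒɪ", "kəl"]
morphs = ["fɒn", "ə", "lɒdʒɪk", "əl"]

# Both segmentations must cover exactly the same string.
assert "".join(syllables) == "".join(morphs)

syllable_bounds = boundaries(syllables)
morph_bounds = boundaries(morphs)
shared = syllable_bounds & morph_bounds

print("syllable boundaries:", sorted(syllable_bounds))
print("morph boundaries:   ", sorted(morph_bounds))
print("shared boundaries:  ", sorted(shared))
```

In this sketch only one boundary position is common to both decompositions, which is what is meant by saying that the two structures are not isomorphic.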

In addition to phonological decomposition, in the written mode word forms may be decomposed into smaller spelling units, graphemes, each consisting of one or more characters. An intermediate orthographic unit is the orthographic break (orthographic syllable), which is in general only needed for splitting words at line-breaks and does not correspond closely to either syllable or morph boundaries but combines phonological, morphological and orthographic criteria.

It has already been noted that in many languages, syllables and morphs do not always coincide; morphs may be smaller than or larger than syllables.

For the core requirements of speech recognition, in which a closed vocabulary of attested fully inflected word forms is generally used, morphological structuring is not necessary. Phonological structuring into syllables, demisyllables, diphone sequences or phonemes is widely used in order to increase statistical coverage and to capture details of pronunciation [Browman (1980), Ruske & Schotola (1981), Ruske (1985)].
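As a rough illustration of one of these sub-word units, a diphone sequence can be derived from a phoneme sequence by pairing each phoneme with its successor. This is a minimal sketch under stated assumptions: the function name `diphones` and the use of "#" as a word-boundary (silence) symbol are illustrative conventions, not taken from the cited work.

```python
# Minimal sketch: derive a diphone sequence from a phoneme sequence.
# The function name and the "#" boundary symbol are assumptions.

def diphones(phonemes, boundary="#"):
    """Pair each phoneme with its successor, padding with a boundary symbol."""
    padded = [boundary] + list(phonemes) + [boundary]
    return list(zip(padded, padded[1:]))

# "knife", treating the diphthong as a single unit.
knife = ["n", "aɪ", "f"]
print(diphones(knife))
```

A word of n phonemes yields n + 1 diphones under this convention, since the transitions into and out of the word are counted as units.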

A brief outline of the main concepts in morphology, as they affect spoken language lexica, will be useful in developing such lexica (for more detail, a textbook in linguistics should be consulted, e.g. [Akmajian (1984)]):

Morphology is the definition of the composition of words as a function of the meaning, syntactic function, and phonological or orthographic form of their parts. The morphology of spoken language is fundamentally the same as the morphology of written language in respect of meaning, syntactic function, and the combinability of morphemes. It differs in respect of morphophonological alternations, which differ from spelling alternations, and word prosody (for instance word stress patterns). General definitions are given here; examples are given below.

Morphotactics (word syntax) is the definition of the composition of words as a function of the forms of their parts.

Inflection is that part of morphology which deals with the adaptation of words to their contexts within sentences on the basis of agreement (congruence), e.g. between subject and verb.

Word formation is that part of morphology which deals with the construction of words from smaller meaningful parts.

Derivation is that part of word formation which deals with the construction of words by the concatenation of stems with affixes (prefixes and suffixes).

Compounding (composition) is that part of word formation which deals with the construction of words by concatenating words or stems.

Simple morphological units:
Traditional terminology varies in this area. A standard but incomplete definition of a morpheme, for instance, is that it is "the minimal meaning-bearing unit of a language". This definition is not entirely satisfactory, however, and for present purposes the sign-based model and the unit of word will be used as the starting point.

A morpheme is the smallest abstract sign-structured component of a word, and is assigned representations of its meaning, distribution and surface (orthographic and phonological) properties. More informally, morphemes are parts of words defined by criteria of form, distribution and meaning; i.e. they have meanings and are realised by orthographic or phonological forms (morphs). They have no internal morphological structure.

Traditionally, the two main kinds of morpheme are:

Morphs are, in traditional linguistics, the orthographic or phonological forms (realisations) of morphemes. Orthographic morphs consist of graphemes (either single letters or fixed combinations of letters); in traditional phonology, phonological morphs consist of phoneme sequences with a prosodic pattern (e.g. word stress).

Roots or bases (lexical morphs) are the morphs which realise lexical morphemes and inflectable grammatical morphemes, and function as the smallest type of stem in derivation and compounding. Affixes (prefixes, suffixes) are morphs which realise the inflectional and derivational beginnings and endings of words.

A free morph is a morph which can occur on its own with no affixes or prosodic modifications as a separate word; a bound morph is a morph (generally an affix) which always occurs together with at least one other morph (typically a stem) in the same word.

Complex morphological units:
The structure of words is, like the structure of sentences, defined recursively, since the vocabulary of a language (including new coinages) is potentially unlimited. The functional and formal classification of morphological word structure (compounding and derivation, see above) takes this into account. Where 'out of vocabulary words' are likely to be encountered, morphotactic rules and a morphological parser or morphological generator may be required in order to supplement the lexicon. The condition of recursive structure does not apply to inflection, which, given a finite set of stems, defines a finite set of fully inflected word forms (in agglutinative languages possibly an extremely large finite set):

Inflectional affixation:
A word (fully inflected word) is a stem morphologically concatenated with a full set of inflectional affixes, e.g. English algorithm + s = algorithms or German ge + segn + et + en 'blessed' (plural participle or adjective).

Derivational affixation:
A stem is

  • either a root (i.e. lexical morph), e.g. tree, algorithm
  • or a stem morphologically concatenated with a derivational affix, e.g. algorithm + ic, algorithm + ic + al + ly, non + algorithm + ic + al + ly, etc.
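The recursive definition of a stem translates fairly directly into a small recogniser. What follows is a minimal sketch rather than a full morphological parser: the toy inventories `ROOTS`, `PREFIXES` and `SUFFIXES` and the function name `is_stem` are assumptions made for illustration, and the input is taken to be already segmented into morphs.

```python
# Minimal sketch of the recursive stem definition, assuming the input
# has already been segmented into morphs. The inventories below are
# toy assumptions built from the examples in the text.

ROOTS = {"tree", "algorithm"}
PREFIXES = {"non"}
SUFFIXES = {"ic", "al", "ly"}

def is_stem(morphs):
    """A stem is a root, or a stem combined with one derivational affix."""
    if not morphs:
        return False
    if len(morphs) == 1:
        return morphs[0] in ROOTS  # base case: a single root
    # Recursive case 1: stem + derivational suffix.
    if morphs[-1] in SUFFIXES and is_stem(morphs[:-1]):
        return True
    # Recursive case 2: derivational prefix + stem.
    if morphs[0] in PREFIXES and is_stem(morphs[1:]):
        return True
    return False

for form in ["algorithm", "algorithm + ic + al + ly",
             "non + algorithm + ic + al + ly", "ic + algorithm"]:
    print(form, "->", is_stem(form.split(" + ")))
```

Because the sketch places no restrictions on which affixes may attach to which stems, it also accepts unattested combinations such as tree + ly + ic; realistic morphotactic rules would need category information to exclude these.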

A compound word is a word morphologically concatenated with a word or a stem.

Morphophonological and orthographic alternations:
The operation of morphological concatenation is defined for present purposes as "concatenation and modification of segments at morph boundaries by boundary phenomena." The details of pronunciation and spelling are altered in morphologically complex items. An example of morphophonological alternation is /f/ - /v/ in knife - knives, /naɪf/ - /naɪvz/. An example of orthographic alternation is y - i - ie in fly, flier, flies. These alternants can be described by rules:

  1. Morphophonological rules are rules (analogous to spelling rules) which describe morphophonological alternations, i.e. the differences between pronunciations of parts of composite words and pronunciations of corresponding parts of simplex words.
  2. Spelling rules are rules which describe spelling alternations, i.e. the differences between spellings of parts of composite words and the spellings of corresponding parts of simplex words.
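The y - i - ie alternation in fly, flier, flies can be sketched as a spelling rule applied at the morph boundary. This is a simple procedural illustration, not a two-level formulation: the function name `attach` and the exact conditions are assumptions, and the rule ignores the many exceptions found in English spelling.

```python
# Minimal sketch of one English spelling rule applied at a morph
# boundary. The function name and conditions are illustrative
# assumptions; real spelling rules have exceptions not handled here.

VOWELS = set("aeiou")

def attach(stem, suffix):
    """Concatenate stem and suffix, applying the y - i - ie alternation."""
    consonant_y = (
        len(stem) > 1 and stem.endswith("y") and stem[-2] not in VOWELS
    )
    if consonant_y:
        if suffix == "s":
            return stem[:-1] + "ies"        # fly + s -> flies
        if suffix and suffix[0] != "i":
            return stem[:-1] + "i" + suffix  # fly + er -> flier
    return stem + suffix                     # no alternation applies

for stem, suffix in [("fly", "s"), ("fly", "er"), ("fly", "ing"),
                     ("play", "s"), ("algorithm", "s")]:
    print(stem, "+", suffix, "=", attach(stem, suffix))
```

The rule leaves the stem unchanged when y follows a vowel (play + s) or when the suffix begins with i (fly + ing), which shows that the alternation depends on the context on both sides of the boundary.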

A standard technology for formulating spelling rules and morphophonological rules is Two-Level Morphology (cf. [Koskenniemi (1983)], [Karttunen (1983)]; cf. [Ritchie et al. (1992)]).


