Non-Deterministic Finite State Transducers (NDFST)

Dafydd Gibbon, U Bielefeld, 2012-09-13


General intro

An NDFST is a variety of Finite State Transducer, FST, and defines a kind of translation between two codes. Check this example, which translates very restricted English expressions into German and vice versa. The demo shows how a single input string can have many translations. The FST can be represented as a network:

test-ndfst.gif

Skip to the Practical Stuff section if you're not really interested in Formal stuff.


Formal Stuff

  1. A Finite State Transducer, FST (also known as a Finite State Machine or State Machine) defines a relation between two sets of strings of symbols, an input set and an output set (whose roles can be reversed). The sets of strings must be regular sets, i.e. the strings must be composed by concatenating to the left or to the right (but not both), and not by insertion into the middle. The sets may be infinite, as in {big, very big, very very big, ...}. The relation between regular sets is a regular relation.
  2. The FST is defined by specifying (1) a set of states; (2) the start state; (3) a subset of states which are possible end states; (4) a set of transitions between states representing (5) a relation between input and output symbols (see the Figure).
  3. A Deterministic Finite State Transducer, DFST, is a variety of FST where for each input symbol at a given state there is only one path leading to another state.
  4. A Non-Deterministic Finite State Transducer, NDFST or NFST is a variety of FST where for each input symbol at a given state there may be more than one path leading to another state, needing additional criteria for deciding which path to take.
  5. An interpreter for an FST processes an FST and translates input strings into output strings. An FST interpreter requires only a finite memory, which is a function of the number of states. The interpreter developed for this workbench takes the easy way out and produces all possible translations of the input string.
  6. If you want to know more, check Wikipedia on Finite State Transducer.

Practical stuff

Input:
The input is a string consisting of a sequence of symbols which are transduced into the output strings which the FST defines.

Note that calculation takes time if the FST is complex and the input is long - watch your browser's download indicator!
The switch settings which inflence the speed are Numbering, Graph, Compact.

Notes on switch settings:
Trace=on is normally needed only for debugging.
Numbering=on additionally displays output item count and (with arrows) transducer Direction selection.
SAMPA is only switched off for phonetic representations.
Graph=on is useful for visualisation in general (in which case it is only needed once - the graph can be saved), but also for debugging.
LR/TB defines the orientation of the graph as left-right or top-down.
Compact=yes shows the graph with sets of parallel transitions reduced to a single arbitrary member of the transition set.
Direction permits input and output to be exchanged.
Relation determines whether only the output symbols or input and output symbol pairs are to be displayed.
Autoinput is only required for some special applications.


Specimen form layout:

Input:
Switches:
Trace: Numbering: SAMPA:
Graph: LR/TB: Compact:
Direction: Relation: Autoinput:

FST:

Dafydd Gibbon, 2012-09-13