The interaction is mostly structured as menu-driven sequences, modelled as a graph or finite state automaton. Each transition between states of the automaton is triggered by a ``single'' command or action: either the recognition of one word or sentence, or an application-specific event such as a delay or a noise level. Some systems offer a more sophisticated menu-driven dialogue in which several actions are combined in order to proceed more rapidly (combinations of words, connected words or connected speech, etc.).
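The automaton view of a menu-driven dialogue can be illustrated with a minimal sketch. The state names, the vocabulary and the helper functions `step` and `run` below are hypothetical and chosen only for illustration; they do not correspond to any particular product. Each recognised word is treated as one event that triggers at most one transition.

```python
# Hypothetical menu-driven dialogue modelled as a finite state automaton.
# States and vocabulary are illustrative assumptions, not taken from a real system.
TRANSITIONS = {
    ("main_menu", "balance"): "balance_menu",
    ("main_menu", "transfer"): "transfer_menu",
    ("balance_menu", "back"): "main_menu",
    ("transfer_menu", "confirm"): "done",
    ("transfer_menu", "back"): "main_menu",
}


def step(state, event):
    """Follow one transition; stay in the current state if the event is not allowed."""
    return TRANSITIONS.get((state, event), state)


def run(events, start="main_menu"):
    """Feed a sequence of events to the automaton and return the final state."""
    state = start
    for event in events:
        state = step(state, event)
    return state
```

Under the same assumptions, a combined (connected-word) input can be handled by splitting the recognised string into several events, for instance `run("transfer confirm".split())`, so that one utterance drives more than one transition.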
If the technology provider adopts such an approach, the application developer has to know how to implement an application within it. Usually an application generator is supplied, which allows rapid set-up and evaluation of applications. Some generators integrate ergonomic principles (e.g. management of time-outs) and incorporate an error-recovery strategy. Appropriate information should be supplied to the developer for that purpose.
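As an illustration of what time-out management and error recovery might look like inside one dialogue state, the following sketch re-prompts after a time-out, lists the valid words after a rejected input, and gives up after a fixed number of attempts. The function `ask`, the `TIMEOUT` marker, the `recognise` callback and the logged messages are all assumptions made for this example; an actual application generator may organise these strategies quite differently.

```python
TIMEOUT = None  # hypothetical marker: nothing was recognised before the time-out


def ask(recognise, vocabulary, max_attempts=3):
    """Query the recogniser until a valid word is heard or attempts run out.

    `recognise` is an assumed callback returning a word, or TIMEOUT.
    Returns (word, log); word is None if every attempt failed.
    """
    log = []
    for _ in range(max_attempts):
        result = recognise()
        if result is TIMEOUT:
            log.append("timeout: re-prompt")           # user said nothing
        elif result not in vocabulary:
            log.append("rejected: list valid words")    # out-of-vocabulary input
        else:
            return result, log
    log.append("give up: fall back")                    # e.g. hand over to an operator
    return None, log
```

A scripted recogniser is enough to exercise the sketch: with `replies = iter([None, "maybe", "yes"])`, the call `ask(lambda: next(replies), {"yes", "no"})` returns `"yes"` after one time-out and one rejection.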