GSA: WordAlign

Alignment of word sequences

Dafydd Gibbon
V04α 2014-09-03

Method

The longest sentence in the corpus (or an arbitrary one of the longest sentences if several share the maximum length) is selected as the reference sequence. The target sequences in the corpus, each of which is shorter than or equal in length to the reference sentence, are then aligned with it.
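
The selection step can be illustrated with a minimal Python sketch. This is not the GSA-WordAlign source code: the function name select_reference, the whitespace tokenization and the tie-breaking rule (the first of the longest sentences wins) are assumptions made for illustration only.

```python
def select_reference(corpus):
    """Split each sentence into words and pick a longest one as reference.

    Assumptions (not taken from GSA-WordAlign): words are separated by
    whitespace, and ties are resolved by taking the first longest sentence.
    """
    tokenized = [sentence.split() for sentence in corpus]
    # max() returns the first item of maximal length, which satisfies
    # the "arbitrary longest sentence" condition.
    reference = max(tokenized, key=len)
    # Every other sentence becomes a target sequence.
    targets = [words for words in tokenized if words is not reference]
    return reference, targets


corpus = ["the cat sat", "the big cat sat down", "a cat"]
reference, targets = select_reference(corpus)
print(reference)  # ['the', 'big', 'cat', 'sat', 'down']
print(targets)    # [['the', 'cat', 'sat'], ['a', 'cat']]
```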

Any item in the reference sequence which is not in the target sequence (or vice versa) is marked by a gap marker. There are two marking options: either all mismatches are marked by an underscore '_' in the appropriate position, or a different marker is used for each different mismatch context type. Special cases are noted in the GSA-WordAlign output.
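
The gap marking can be sketched as follows, for the single-marker option only. The sketch uses SequenceMatcher from the Python standard library as a stand-in matching procedure; the algorithm actually used by GSA-WordAlign is not described here and may differ. The function name align and its gap parameter are hypothetical.

```python
from difflib import SequenceMatcher


def align(reference, target, gap="_"):
    """Return two rows of equal length with gap markers at mismatches.

    Assumption: difflib's matching blocks stand in for the tool's own
    alignment procedure, which may behave differently.
    """
    ref_row, tgt_row = [], []
    matcher = SequenceMatcher(a=reference, b=target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        ref_part = reference[i1:i2]
        tgt_part = target[j1:j2]
        if tag == "equal":
            # Identical words occupy the same position in both rows.
            ref_row += ref_part
            tgt_row += tgt_part
        else:
            # A word present in only one sequence faces a gap marker
            # in the other row.
            ref_row += ref_part + [gap] * len(tgt_part)
            tgt_row += [gap] * len(ref_part) + tgt_part
    return ref_row, tgt_row


ref_row, tgt_row = align("the big cat sat down".split(), "the cat sat".split())
print(" ".join(ref_row))  # the big cat sat down
print(" ".join(tgt_row))  # the _ cat sat _
```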

The current version has certain limitations, which lead to one case not being handled as expected. This case is also explained in the notes of the GSA-WordAlign output.


Plain display mode:
Detailed display mode:

JB &-filter on:
JB &-filter off:




Paste sentences here:

Dafydd Gibbon, 2014-08-31