Computing Semantic Representations
with Referent Systems
Version 6.1 Date: Tuesday, July 12, 2016
Referent systems are designed to allow for calculation of
semantic representations bottom up.
This project page is about referent systems. It consists
of two parts, a manuscript and an implementation.
The manuscript also contains an introduction to the system, so that
one can work with theory and implementation hand in hand.
- Source Code
- Installation
- Tk-Interface
- Changelog
- Archive
- To Do
- Foreign Language Support
- Other Platforms
- Acknowledgments
Source Code
(The latest release.)
Version 6.1 comes with source code. You may use it subject
to the GNU license terms. Use and modification of this software is
permitted. We decline any responsibility connected with the use
of this software. Note also that what we describe below will
work under Unix only (which includes Mac OS X and Linux). If
you insist on using Windows, you should probably install Cygwin
first. (See also the howto manual linked below.)
Installation
If you want a standalone version to play with, here it is.
The minimal version requires
- Operating System: Linux, Unix or Mac OS X,
- LaTeX,
- Tcl/Tk,
- OCaml (the current version is 4.01.0, but older ones will
probably do just as well). Before installing OCaml, make sure you
have Tcl/Tk installed so that the graphics work as well.
In addition, the following modules should be installed.
- findlib
- Xml-light
- Camomile 0.7.1.
Camomile is a Unicode library for OCaml.
These modules or packages can be obtained from
Inria.
To install the program, download the file
archiv/referent_v6-1.tar.
It contains the following items:
- The manual "howto.pdf". It does not explain
the theory, but it covers the installation
and the use of the program. It is more comprehensive
than this page.
- The OCaml source files.
- The shell scripts. Invoke a script with the
option "-h" to get some basic help. The main ones are:
  - hilfe, to get fast help,
  - setup, to set the system up (the default language is German,
    so better call it with option "-h" beforehand),
  - compile, to compile the program,
  - ref, to start the user interface of the program,
  - wrap, to create an archive,
  - save, to create backups.
- Some dictionaries, among them a dictionary of Hungarian nominal
inflection and fairly comprehensive dictionaries of Latin verbs and nouns.
Put the archive into a directory where you want to install the
entire system, say, <RefSys>. Unpack the archive using
tar xvf referent_v6-1.tar
You should now have the directories dict and bin:
dict is where the dictionaries have to be put,
bin is for the binaries. To get
the English installation, type
setup -p -l en
This makes the files in bin executable. Setup also adds
directories, and adds <RefSys>/bin
to your path (via the option "-p"). The option "-l en" makes the
system use English. (If you do not set your path, you will
have to prefix each command with ./bin/ !) It is best to put
the path setting into your .bashrc, otherwise you will
have to redo it every time you restart the computer.
Now type
compile -r
This will compile the system. The errors (if we made any ...) are
redirected into compile.log.
You are told whether the installation was successful. If not, take a
look at compile.log to see what
went wrong. Finally, type the magic incantation:
ref
Tk-Interface
RefSys uses a graphical interface. This should be easy to use,
but we are sure there is a lot of room for improvement. There is
a dialog session. In the bottommost white window you can
type commands, the most useful of which is "help". No need
to use quotation marks. Hit return and wait for the next
thing to happen.
Changelog
Version 3
The main change is in the modularisation of the program. There are
also intrinsic changes. The new version separates the syntax and
the semantics in the initial parsing stage. During parsing, it only
computes the set of viable parse terms. When it is finished, it
unravels the viable parse terms into real entries and shows them
to the user. Another change is that it now allows for polyadic merge,
so it can handle complex predicates correctly.
Version 4
Apart from bug fixes, the new version creates an interactive
web page, where users can upload dictionaries and enter
sentences. The upload is done in several stages. For
security reasons the system will check whether the file
looks like a dictionary. If it does not comply with the
rules given in the manual, it will
not be uploaded. After uploading it will create an executable
(if possible) and rehash the pages. A new page is created
where one can enter strings in the new dictionary.
Entering strings is done by clicking on the items (to avoid
awkward keyboard issues for foreign languages).
Version 5.0
The most visible change is the increased support for morphology.
Entries have a morphology; this is a morpheme, where a
morpheme is a set of morphs. Morphs in turn are complex
structures, allowing the use of several strings and features
for strings. This makes it possible to treat the English plural,
for example, using one entry only, so the semantics need not
be repeated. The morphological decomposition tables used
in the previous version are now redundant (they were also
a source of exponential slowdown).
The Tcl-script is much simplified.
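The morphology layering described for this version (an entry has a
morpheme, a morpheme is a set of morphs, a morph may use several
strings with features) can be pictured with a few OCaml types. This is
only a sketch: every type, field and feature name below is an
assumption made for illustration and is not taken from the RefSys
sources.

```ocaml
(* Illustrative names only; none are taken from the RefSys sources. *)

(* A string together with its features. *)
type exponent = { text : string; features : string list }

(* A morph may use several strings. *)
type morph = exponent list

(* A morpheme is a set of morphs; a list stands in for the set here. *)
type morpheme = morph list

(* An entry pairs one morpheme with one meaning. The meaning is a
   plain string here, as a placeholder for a semantic representation. *)
type entry = { morpheme : morpheme; meaning : string }

(* One entry for the English plural: two morphs, a single meaning. *)
let plural : entry = {
  morpheme = [
    [ { text = "s";  features = [] } ];
    [ { text = "es"; features = ["after-sibilant"] } ];
  ];
  meaning = "plural";
}

(* The strings of each morph of an entry. *)
let strings_of (e : entry) : string list list =
  List.map (fun m -> List.map (fun x -> x.text) m) e.morpheme
```

Because both morphs sit under one entry, the meaning is written down
once, which is the point made above about the English plural.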
Version 5.1
It is now possible to input non-ASCII symbols with the
standalone system using a standard keyboard. The input
is through specified sequences, as in HTML, but the coding
is flexible.
The exponents are now arrays of so-called glued strings.
These are strings with optional conditions on what kinds of
strings they can be concatenated with. The conditions are of
the form "concatenates with a string that has/does not have
a suffix suf if appended (prefix pref if
prepended)". Variables are now pairs (string, number) and
most string output uses buffers, to speed up the output.
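A glued string of the kind described here might look as follows in
OCaml. This is a minimal sketch under our own assumptions: the type
and function names are hypothetical, only the "appended" direction is
checked, and the English examples are chosen for illustration.

```ocaml
(* Illustrative names only; none are taken from the RefSys sources. *)

(* Conditions on the neighbouring string. *)
type condition =
  | Has_suffix of string    (* when appended: left string ends in this *)
  | Lacks_suffix of string  (* when appended: left string does not *)
  | Has_prefix of string    (* when prepended: right string starts with this *)
  | Lacks_prefix of string  (* when prepended: right string does not *)

(* A string with optional conditions. *)
type glued = { text : string; conds : condition list }

(* Written by hand so that the sketch does not depend on a recent
   standard library. *)
let ends_with ~suffix s =
  let ls = String.length suffix and n = String.length s in
  n >= ls && String.sub s (n - ls) ls = suffix

(* Can [g] be appended to [left]? Only suffix conditions matter here. *)
let accepts_left (g : glued) (left : string) : bool =
  List.for_all
    (function
      | Has_suffix suf -> ends_with ~suffix:suf left
      | Lacks_suffix suf -> not (ends_with ~suffix:suf left)
      | Has_prefix _ | Lacks_prefix _ -> true)
    g.conds

(* Concatenate if the conditions allow it. *)
let append (left : string) (g : glued) : string option =
  if accepts_left g left then Some (left ^ g.text) else None

let es = { text = "es"; conds = [Has_suffix "ss"] }
let s = { text = "s"; conds = [Lacks_suffix "ss"] }
```

With these definitions, append "glass" es succeeds and append "glass" s
is rejected, while for "cat" it is the other way round.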
Version 5.2
This version is a drastic change from 5.1. There is no
standalone version any more, as the graphical interface has
become much easier to use. It uses a stack where things can
be put and manipulated, and a user dialog box where commands
can be entered. Also, we have become frustrated with the
dynamic linking tool of OCaml and have not been able to
get it to work. Finally, we decided to use an xml-style
data format and wrote our own tools to parse from such
data and write to it. The web versions are at present not
supported and the documentation lags behind seriously.
Version 5.4
New features are:
- Data is strictly XML, and we use xml-light to parse it.
- The dictionaries may also contain empty strings, but to
enhance efficiency we have added a rank function. Every morph
has an optional rank. Adding empty strings must increase
the rank. By default, adding a nonempty morph creates a string
of lowest rank, adding an empty morph creates a string of
maximal rank.
- The compile routine creates .ocamlinit
automatically. Thus calling ocaml
will start the system directly.
- There are methods for adding dictionary entries using
a graphical interface.
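One possible reading of the rank rule from the list above is sketched
below in OCaml. The names, the use of integers for ranks and the exact
comparison are our assumptions, not the RefSys implementation. The
point of the sketch is that an empty morph can be added only if the
rank goes up, so empty morphs cannot be stacked without limit.

```ocaml
(* Illustrative names only; none are taken from the RefSys sources. *)

type morph = { text : string; mrank : int option }  (* the rank is optional *)
type built = { str : string; rank : int }           (* a string built so far *)

let lowest = 0
let maximal = max_int

(* Rank of the string that results from adding [m]: the explicit rank
   if there is one, otherwise the default for empty or nonempty morphs. *)
let rank_after (m : morph) : int =
  match m.mrank with
  | Some r -> r
  | None -> if m.text = "" then maximal else lowest

(* Add a morph; an empty morph is refused unless it increases the rank. *)
let add (b : built) (m : morph) : built option =
  let r = rank_after m in
  if m.text = "" && r <= b.rank then None
  else Some { str = b.str ^ m.text; rank = r }

let stem = { str = "dog"; rank = lowest }
let zero = { text = ""; mrank = None }
let plural_s = { text = "s"; mrank = None }
```

Here add stem zero succeeds and yields a string of maximal rank; adding
zero to that result again is refused.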
Version 5.7
New features are:
- The display methods have been overhauled. The command
"show all" lets you view all items that have been stored
in various tables. They can be clicked to view.
- The graphical entry creator has been simplified.
- There is a dictionary of Hungarian nouns. It is not
complete in the sense that some suffixes are missing,
but it demonstrates well the power of the system.
Version 5.8/5.9
- Bugs have been fixed (especially in the substitution algorithm).
- Many nontrivial dictionaries have been added. Commentary
is included.
Version 6.0
- Bugs have been fixed. The algorithm now works fully according to theory,
with no workarounds.
- The handbook explains the algorithm in full detail; everything is
covered.
- Many nontrivial dictionaries have been added. Commentary
is included.
Version 6.1
- Bugs have been fixed. In particular, there were problems in the
treatment of fusion with unsaturated arguments as well as
the dictionary entries and morphnames.
- A partial dictionary for Latin adjectives and nouns has been
added that needs to be updated. Commentary is included.
Archive
For the curious we have kept the older versions, but we doubt that they
will be of much use. To see them, visit the
archive.
To Do
- (Theory:) Add full support for logical connectives.
- Improve the messaging system.
- Help menus must still be updated.
Foreign Language Support
Interface support is not available other than for English and
German. When making dictionaries, note that although OCaml
can only handle ISO-Latin-1, you may define strings for your
language in other charsets, too. Tk displays them (it is fully
Unicode compatible). LaTeX is a bit trickier, since the typewriter
font is somewhat incomplete. Apart from that, it works as
follows. Write the strings of the language into the
dictionary in UTF-8. OCaml passes them on to Tk and LaTeX,
which render them assuming UTF-8 encoding throughout. For input
into the Tk-Interface we have built a small converter that
allows you to use a standard keyboard to enter foreign
symbols (a bit like HTML character codes).
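To give an idea of what such a converter does, here is a minimal
sketch in OCaml. It is not the converter shipped with RefSys: the
"&name;" syntax, the table entries and the function name are
assumptions made for illustration. Sequences that are not in the table
are left untouched.

```ocaml
(* Hypothetical table from sequence names to UTF-8 byte strings. *)
let table = [
  ("ouml", "\xc3\xb6");    (* o with diaeresis *)
  ("uuml", "\xc3\xbc");    (* u with diaeresis *)
  ("aacute", "\xc3\xa1");  (* a with acute *)
]

(* Replace every known "&name;" in [s] by its UTF-8 string. *)
let convert (table : (string * string) list) (s : string) : string =
  let n = String.length s in
  let buf = Buffer.create n in
  (* If a known sequence starts at position [i], return its replacement
     and the position just after the closing ';'. *)
  let escape_at i =
    if s.[i] <> '&' then None
    else
      try
        let j = String.index_from s i ';' in
        let name = String.sub s (i + 1) (j - i - 1) in
        Some (List.assoc name table, j + 1)
      with Not_found -> None
  in
  let rec go i =
    if i < n then
      match escape_at i with
      | Some (utf8, next) -> Buffer.add_string buf utf8; go next
      | None -> Buffer.add_char buf s.[i]; go (i + 1)
  in
  go 0;
  Buffer.contents buf
```

For example, convert table "k&ouml;nyv" yields the UTF-8 bytes of the
Hungarian word for "book", which can then be handed on as it is.
Keeping the table as an ordinary value is one way to keep the coding
flexible.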
Other Platforms
The software has been successfully installed both on Unix/Linux
and Mac OS X. For that you can use the compile
installer. For Windows, the only way
is to install Cygwin before installing everything else. We still
need to find out exactly how this can be done. (If you have any
ideas, please let me know.)
Acknowledgments
Referent systems are due to Kees Vermeulen. We are indebted
to him as well as Albert Visser for the theory part. The
implementation has been done by myself. Its creation has been
sponsored by two successive senate grants from UCLA. We have
enjoyed the help of Cory Hill, Ben Keil,
Adam Skory and Joseph Vaughan.
The project was joined in 2009 by Udo Klein and
between November 2010 and September 2011 by Sabine Gründer.
They were supported by the Alfried Krupp von Bohlen und Halbach-Stiftung.
Send any reports of error (or praise) to
Marcus Kracht.