Ambiguity & Connotation: Translating without a Dictionary




This work grew from musings about the processes by which infants or people thrust into an unfamiliar culture acquire language.

Language acquisition and translation both involve mapping between systems of meaning, whether between percepts of different sensory modes, between direct experience and symbols, or between sets of verbal encodings.

One aspect of such mappings is that the boundaries and clustering of meanings in each system may be quite different. In human languages this is most apparent in prepositions, in systems of homonyms and in alternative meanings for individual words.

The existence of homonyms, ambiguity and multiple connotations and meanings for individual words reflects the predicament of encoded language in general: a finite lexicon of discrete words is required to represent a continuous field of infinite meanings. In order to speak in words and sentences of manageable length, with a finite vocabulary, individual words must carry multiple meanings.

Ambiguity is intrinsic to language.


This algorithm implements a mapping between languages based on pairs of sentences with the same meaning: the knowledge base consists entirely of translated sentence pairs. The algorithm is symmetrical in the sense that translations can be made in either direction between the two languages from a single collection of data. The meanings of the individual words in each sentence are unknown and are derived entirely from context. The number and sequence of words in sentences with identical meanings usually differ between languages.
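The idea of deriving word meanings from sentence pairs alone can be illustrated with a small sketch. This is not the algorithm described here, whose scoring details are not given; it assumes a simple co-occurrence measure (the Dice coefficient) as a stand-in. The names tokenize, candidates and PAIRS, and the four sample sentence pairs, are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical sample data: each entry is (English sentence, German sentence).
PAIRS = [
    ("The dog drinks.", "Der Hund trinkt."),
    ("The dog eats.", "Der Hund frisst."),
    ("The cat drinks.", "Die Katze trinkt."),
    ("A dog eats.", "Ein Hund frisst."),
]


def tokenize(sentence):
    """Lowercase and strip surrounding punctuation (a deliberately crude tokenizer)."""
    words = (w.strip('.,;:!?"()').lower() for w in sentence.split())
    return [w for w in words if w]


def candidates(word, pairs, reverse=False):
    """Rank words of the other language by Dice association with `word`.

    Assumption: Dice score = 2 * joint / (source_freq + target_freq), where
    every count is a number of sentence pairs. No dictionary is consulted.
    With reverse=True the same pairs are read in the opposite direction.
    """
    word = word.lower()
    joint = Counter()        # pairs containing both `word` and the target word
    target_freq = Counter()  # pairs containing the target word
    source_freq = 0          # pairs containing `word`
    for first, second in pairs:
        src, tgt = (second, first) if reverse else (first, second)
        src_words, tgt_words = set(tokenize(src)), set(tokenize(tgt))
        target_freq.update(tgt_words)
        if word in src_words:
            source_freq += 1
            joint.update(tgt_words)
    scores = {t: 2 * c / (source_freq + target_freq[t]) for t, c in joint.items()}
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


english_to_german = candidates("dog", PAIRS)
german_to_english = candidates("Hund", PAIRS, reverse=True)
```

Both lookups read the same list of pairs, which is the sense in which such a scheme is symmetrical. Word order plays no part in this sketch, so differing word counts and sequences between the two sentences of a pair do not matter to it.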

Over the course of time, as more sentences are translated and re-entered into the database, the translation algorithm becomes "smarter" and better able to resolve ambiguous words.
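A minimal sketch of why a growing database helps, again assuming a Dice co-occurrence score rather than the actual method used here: with a single sentence pair every word in the target sentence is an equally good candidate, and a second pair is enough to break the tie. The class TranslationBase and its sample sentences are hypothetical, and the sketch covers one translation direction only.

```python
from collections import Counter, defaultdict


class TranslationBase:
    """Hypothetical incremental store of translated sentence pairs."""

    def __init__(self):
        self.joint = defaultdict(Counter)  # joint[s][t]: pairs containing both words
        self.src_freq = Counter()          # pairs containing each source word
        self.tgt_freq = Counter()          # pairs containing each target word

    def add(self, source, target):
        """Enter one translated sentence pair into the base."""
        s_words = set(source.lower().split())
        t_words = set(target.lower().split())
        self.src_freq.update(s_words)
        self.tgt_freq.update(t_words)
        for s in s_words:
            self.joint[s].update(t_words)

    def ranked(self, word):
        """Rank target words for `word` by Dice score, best first."""
        word = word.lower()
        seen = self.joint.get(word, {})
        scores = {
            t: 2 * c / (self.src_freq[word] + self.tgt_freq[t])
            for t, c in seen.items()
        }
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


base = TranslationBase()
base.add("the dog eats", "der hund frisst")
before = base.ranked("dog")   # three candidates, all with the same score

base.add("a dog drinks", "ein hund trinkt")
after = base.ranked("dog")    # "hund" now stands alone at the top
```

Each call to add plays the role of re-entering a translated sentence; the counts accumulate, so later lookups draw on everything entered so far.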


Translation Bases:
Latin/English interlinear translation: 446 sentences (Catholic Mass)
German/English translation: 3682 sentences (Short Story)
French/English translation: 342 sentences (MacLibel + Ads)

23-Apr-96 Mark Thompson