
Seventh Sense Software has developed a state-of-the-art, Unicode-compliant multilingual machine translation system that has the potential to translate documents between any pair of source and target languages.
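The system uses the transfer method: source text is analysed into an intermediate representation, that representation is mapped onto a corresponding target-language representation, and the target text is generated from the result. The system is versatile enough that its potential applications go well beyond machine translation, to fields as diverse as DNA sequence analysis, cryptography and cryptanalysis.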
A prototype system that will translate documents between English, Arabic and Bengali in any direction is being prepared with the help and guidance of linguistic experts and professional translators in all three languages. A screenshot showing the system translating a simple sentence from English to Bengali is viewable here.
The key to the power of our translation system is a set of new mathematical structures, developed in-house and referred to simply as "units". They are simple, yet powerful enough to combine the standard components of a traditional MT system (morphology, part-of-speech tagging, syntax, semantics, transfer, generation) into a single unified mathematical formalism.
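For readers unfamiliar with the traditional components listed above, the following is a minimal sketch of a conventional transfer pipeline (analysis, transfer, generation). It does not represent the "units" formalism, which is not described here. The three-word lexicons, the transliterated Bengali forms, and all function names are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Token:
    lemma: str
    pos: str  # part-of-speech tag


# Hypothetical toy lexicons, invented for this illustration only.
EN_LEXICON = {"i": "PRON", "eat": "VERB", "rice": "NOUN"}
EN_TO_BN = {"i": "ami", "eat": "khai", "rice": "bhat"}  # transliterated Bengali


def analyze(sentence: str) -> list[Token]:
    """Analysis: split the source sentence and tag each word."""
    return [Token(word, EN_LEXICON[word]) for word in sentence.lower().split()]


def transfer(tokens: list[Token]) -> list[Token]:
    """Transfer: map words to the target language, then reorder.

    English is subject-verb-object and Bengali is subject-object-verb,
    so this toy rule simply moves verbs to the end of the clause.
    """
    mapped = [Token(EN_TO_BN[t.lemma], t.pos) for t in tokens]
    verbs = [t for t in mapped if t.pos == "VERB"]
    others = [t for t in mapped if t.pos != "VERB"]
    return others + verbs


def generate(tokens: list[Token]) -> str:
    """Generation: produce the target-language surface string."""
    return " ".join(t.lemma for t in tokens)


def translate(sentence: str) -> str:
    return generate(transfer(analyze(sentence)))


print(translate("I eat rice"))
```

A real system would also need morphology, agreement and disambiguation at each stage; the sketch only shows how the stages hand results to one another.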
Our eventual aim is to develop a system that can automatically discover the patterns in generic sequences or, equivalently, derive the underlying "grammars" that govern them (be they DNA sequences or documents written in previously undeciphered ancient languages), in the hope of gaining insights into their origin, development and meaning.
[Course announcement: C++ Programming for Scientists and Engineers]
[Discussion group: http://groups.yahoo.com/group/seventh_sense]