Technologies available for licensing, investment, joint ventures and R&D contracts in Ukraine, Azerbaijan, Georgia, Moldova, Uzbekistan

Romanian Reusable Resources for Natural Language Technology

Description

The Romanian Reusable Resources for Natural Language Technology (3RNLT) consist of a database of word-level linguistic information for Romanian and a set of service programs. The database has 6 main tables and 16 auxiliary tables. The main tables are words, words_engl, words_rus, word_flexies, word_synonyms, and word_translations. The first three contain Romanian, English, and Russian words, respectively. The table word_flexies contains all inflected forms of the words in the table words. The table word_synonyms is a dictionary of Romanian synonyms, and the table word_translations contains Romanian-English and Romanian-Russian translations. The auxiliary tables hold morphological information and information about parts of speech, domains, syntactic roles, etc. At present the database contains about 1 million words.
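As a rough illustration of how some of the main tables might relate to each other, the sketch below builds a tiny in-memory SQLite database in Python. Only the table names are taken from the description above; every column name, the `lang` code, the sample rows, and the helper functions `build_demo_db`, `lemma_of`, and `english_for` are assumptions made for the example, not the actual 3RNLT schema.

```python
import sqlite3

# Table names follow the description; all column names are assumptions.
SCHEMA = """
CREATE TABLE words (id INTEGER PRIMARY KEY, lemma TEXT NOT NULL);
CREATE TABLE words_engl (id INTEGER PRIMARY KEY, lemma TEXT NOT NULL);
CREATE TABLE word_flexies (
    word_id INTEGER NOT NULL REFERENCES words(id),
    form TEXT NOT NULL
);
CREATE TABLE word_translations (
    word_id INTEGER NOT NULL REFERENCES words(id),
    lang TEXT NOT NULL,          -- hypothetical language code, e.g. 'en'
    target_id INTEGER NOT NULL   -- id in words_engl when lang = 'en'
);
"""


def build_demo_db():
    """Create an in-memory database with one illustrative Romanian noun."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO words VALUES (1, 'casă')")
    conn.execute("INSERT INTO words_engl VALUES (1, 'house')")
    forms = ["casă", "casa", "case", "casei", "casele", "caselor"]
    conn.executemany(
        "INSERT INTO word_flexies VALUES (1, ?)", [(f,) for f in forms]
    )
    conn.execute("INSERT INTO word_translations VALUES (1, 'en', 1)")
    return conn


def lemma_of(conn, form):
    """Return the base word for an inflected form, or None if unknown."""
    row = conn.execute(
        "SELECT w.lemma FROM word_flexies f JOIN words w ON w.id = f.word_id "
        "WHERE f.form = ?",
        (form,),
    ).fetchone()
    return row[0] if row else None


def english_for(conn, lemma):
    """Return the English translations recorded for a Romanian base word."""
    rows = conn.execute(
        "SELECT e.lemma FROM words w "
        "JOIN word_translations t ON t.word_id = w.id AND t.lang = 'en' "
        "JOIN words_engl e ON e.id = t.target_id WHERE w.lemma = ?",
        (lemma,),
    ).fetchall()
    return [r[0] for r in rows]
```

With this toy data, looking up the inflected form "caselor" leads back to its base word, and from there to the English entry.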

Innovative Aspect and Main Advantages


Fig. 1 RomSP – text spellchecker for Romanian

To automate the completion of the 3RNLT database, inflection programs have been developed that can be used in a static or dynamic way. Methods for verifying the correctness and integrity of the database have also been developed. These methods can be automated, based on formal properties of 3RNLT, or semi-automated, with expert involvement. In the latter case, special resource visualization engines, orthographic correctors, and parallel dictionaries are used.
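The formal properties used for automated verification are not listed here, so the following Python sketch only shows what such a check could look like under three assumed properties: every inflected form refers to an existing word, no form is stored twice for the same word, and every base word appears among its own forms. The function name `check_integrity` and the data layout are hypothetical.

```python
def check_integrity(words, flexies):
    """Report violations of three assumed integrity properties.

    words:   dict mapping word id -> base word
    flexies: list of (word_id, form) pairs
    """
    problems = []
    seen = set()
    forms_by_word = {}
    for word_id, form in flexies:
        # Assumed property 1: each form belongs to an existing word.
        if word_id not in words:
            problems.append(("orphan_form", word_id, form))
        # Assumed property 2: a form is stored once per word.
        if (word_id, form) in seen:
            problems.append(("duplicate_form", word_id, form))
        seen.add((word_id, form))
        forms_by_word.setdefault(word_id, set()).add(form)
    # Assumed property 3: the base word is one of its own forms.
    for word_id, lemma in words.items():
        if lemma not in forms_by_word.get(word_id, set()):
            problems.append(("lemma_missing_from_forms", word_id, lemma))
    return problems
```

An empty result means the data satisfies these three checks; anything reported could be passed on to an expert for the semi-automated step.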

Areas of Application

Software that generates dictionaries from 3RNLT and organizes their content has been implemented in PHP. The user specifies the criteria for selecting the words to be included in the dictionary, the size of a dictionary page, and the dictionary type (synonyms, translations, morphology), and defines a template for the HTML layout.
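The dictionary generator itself is written in PHP and its code is not shown here. The Python sketch below only illustrates the four user inputs named above (selection criterion, page size, dictionary type, HTML template). The sample entries, the field names, the template, and the function `generate_pages` are all assumptions made for the example.

```python
from html import escape
from string import Template

# Illustrative sample entries; the field names are hypothetical.
SAMPLE_ENTRIES = [
    {"word": "casă", "pos": "noun",
     "synonyms": ["locuință"], "translations": ["house"]},
    {"word": "carte", "pos": "noun",
     "synonyms": ["volum"], "translations": ["book"]},
    {"word": "frumos", "pos": "adjective",
     "synonyms": ["chipeș"], "translations": ["beautiful"]},
]

# User-defined HTML layout for a single dictionary entry.
ENTRY_TEMPLATE = Template("<li><b>$word</b>: $body</li>")


def generate_pages(entries, criterion, page_size, dict_type, template):
    """Select entries, sort them, and render them as HTML pages.

    criterion: function deciding whether an entry is included
    page_size: number of entries per dictionary page
    dict_type: which field to print, e.g. 'synonyms' or 'translations'
    template:  string.Template with $word and $body placeholders
    """
    selected = sorted(
        (e for e in entries if criterion(e)), key=lambda e: e["word"]
    )
    pages = []
    for start in range(0, len(selected), page_size):
        items = [
            template.substitute(
                word=escape(e["word"]),
                body=escape(", ".join(e[dict_type])),
            )
            for e in selected[start:start + page_size]
        ]
        pages.append("<ul>\n" + "\n".join(items) + "\n</ul>")
    return pages
```

For example, `generate_pages(SAMPLE_ENTRIES, lambda e: e["pos"] == "noun", 1, "translations", ENTRY_TEMPLATE)` would produce one page per noun, each listing the word with its English translation.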

A Web search engine has been developed that takes some particularities of Romanian morphology into account. Using the computational lexicon for Romanian from 3RNLT, a search can be carried out not only for a single word but for all of its derived forms as well.
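A minimal Python sketch of this idea follows, assuming that the derived forms are the inflected forms stored in the lexicon. The mini-lexicon, the sample documents, and the functions `build_form_index`, `expand_query`, and `search` are hypothetical and stand in for the real lexicon and search engine.

```python
import re

# Hypothetical mini-lexicon: base word -> its inflected forms.
LEXICON = {
    "casă": ["casă", "casa", "case", "casei", "casele", "caselor"],
    "carte": ["carte", "cartea", "cărți", "cărții", "cărțile", "cărților"],
}

# Illustrative documents to search in.
DOCUMENTS = [
    "Casele din sat sunt vechi.",
    "Am citit o carte bună.",
    "Vremea este frumoasă.",
]


def build_form_index(lexicon):
    """Map every inflected form to the set of base words it belongs to."""
    index = {}
    for lemma, forms in lexicon.items():
        for form in forms:
            index.setdefault(form, set()).add(lemma)
    return index


def expand_query(word, lexicon, index):
    """Return all forms of every base word that `word` is a form of."""
    word = word.lower()
    expanded = {word}  # unknown words are searched as typed
    for lemma in index.get(word, ()):
        expanded.update(lexicon[lemma])
    return expanded


def search(word, documents, lexicon=LEXICON):
    """Return indices of documents containing any form of the query word."""
    index = build_form_index(lexicon)
    forms = expand_query(word, lexicon, index)
    hits = []
    for i, text in enumerate(documents):
        tokens = set(re.findall(r"\w+", text.lower()))
        if tokens & forms:
            hits.append(i)
    return hits
```

Here a query for "casa" also matches a document that only contains "casele", while a word missing from the lexicon falls back to an exact match.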


Fig. 2 Interface of the computer-aided learning system for Romanian

The text spellchecker for Romanian, RomSP, has been developed with the ability to supplement its resources and with an original algorithm for finding suggestions for erroneous words. A computer-aided learning system for Romanian has also been developed. The system supports the study of Romanian morphology for three categories of users: 1) persons who do not know the grammar and have a poor vocabulary; 2) persons who understand the spoken language and have non-systematized knowledge of the grammar; 3) persons who know Romanian but want to deepen their knowledge of grammar, linguistic features, derivation of words, etc.
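The suggestion algorithm used in RomSP is described as original and is not specified here. As a stand-in, the Python sketch below ranks candidate corrections by plain Levenshtein edit distance, a common baseline that is not the RomSP method. The function names, the distance threshold, and the sample vocabulary are assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = cur
    return prev[-1]


def suggest(word, vocabulary, max_distance=2, limit=5):
    """Return known words within max_distance of `word`, closest first.

    This is a generic baseline, not the suggestion search used in RomSP.
    """
    if word in vocabulary:
        return [word]  # already a known word, nothing to correct
    scored = sorted(
        (d, v)
        for v in vocabulary
        for d in [edit_distance(word, v)]
        if d <= max_distance
    )
    return [v for _, v in scored[:limit]]
```

For a misspelling such as "cassa" and a vocabulary containing "casa", "case", and "carte", the closest form "casa" would be offered first and "carte" would be excluded by the distance threshold.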

Stage of Development

The functioning version of 3RNLT and several applications are accessible at www.math.md/elrr.

Contact Details

Institute of Mathematics and Computer Science, Academy of Sciences of Moldova
Contact person: Constantin Ciubotaru
Address: 5, Academiei str., Chisinau, MD-2028, Republic of Moldova
Tel: +(373) 22 73-80-73
Fax: +(373) 22 73-80-27
