Computer scientists create a multilingual search engine
- Date: March 18, 2010
- Source: Facultad de Informática de la Universidad Politécnica de Madrid
- Summary: Researchers have developed a multilingual search engine to query a contents repository written in Interlingua using questions formulated in any language. The search engine returns a precise answer in the language in which the question was formulated.
Researchers from the Validation and Business Applications Group (VAI) at the Universidad Politécnica de Madrid's School of Computing have developed a multilingual search engine to query a contents repository written in Interlingua using questions formulated in any language. The search engine returns a precise answer in the language in which the question was formulated.
Interlingua is a language-independent representation of content. The United Nations' Universal Networking Language (UNL) is the only living, general-purpose Interlingua specified by standards, handbooks and governing organizations. UNL was created to break down language barriers on the Internet. The VAI is the UNL support group for the Spanish language.
The multilingual search engine is a question answering system that aims to return precise answers to factual questions formulated in the user's mother tongue, e.g. "Who discovered America and in which year?" The answer to this question would be "Christopher Columbus in 1492".
Information in UNL
The novelty of this system is that the question can be formulated in English, French, Spanish or any other language, and the system will return an answer formulated in this same language without any translation from source to target languages, because the information base that the system searches is written in UNL.
Working on the assumption that the answer is implicit in the question, the system exploits the features of the UNL representation of the user's question to find the answer. In other words, the search engine deduces the answer from the question rather than looking up an answer to the question.
The search engine searches the text corpus written in UNL to find the answer as follows. First, it searches the corpus for statements that could contain the answer. Second, it determines which of these statements contains the answer, and what the answer is. Finally, it generates the answer in the same language in which the question was formulated.
In response to the question "Why was Aubert awarded the Caméré prize?", for example, the search engine searches the repository and locates a graph. From the graph, it deduces the answer to the question, i.e. "for a new type of movable dam".
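The two search steps described above can be illustrated with a minimal sketch. It is not the researchers' implementation: the triple encoding, the relation labels (`gol`, `obj`, `rsn`, `agt`, `tim`), the `?` placeholder and all function names are assumptions made for illustration, and real UNL expressions are considerably richer. The final step, generating the answer in the user's language, is left out.

```python
from typing import FrozenSet, Iterable, List, Optional, Tuple

# Hypothetical, simplified encoding: one statement = a set of
# (relation label, source word, target word) triples.
Relation = Tuple[str, str, str]
UNKNOWN = "?"  # placeholder marking what the question asks for

# Toy repository standing in for the UNL-encoded corpus (illustrative data).
REPOSITORY: List[FrozenSet[Relation]] = [
    frozenset({
        ("agt", "discover", "Christopher Columbus"),
        ("obj", "discover", "America"),
        ("tim", "discover", "1492"),
    }),
    frozenset({
        ("gol", "award", "Aubert"),
        ("obj", "award", "Caméré prize"),
        ("rsn", "award", "new type of movable dam"),
    }),
]


def candidate_statements(
    question: Iterable[Relation],
    repository: List[FrozenSet[Relation]],
) -> List[FrozenSet[Relation]]:
    """Step 1: keep statements that contain every fully known relation of the question."""
    known = {rel for rel in question if UNKNOWN not in rel}
    return [statement for statement in repository if known <= statement]


def extract_answer(
    question: Iterable[Relation],
    statement: FrozenSet[Relation],
) -> Optional[str]:
    """Step 2: fill the unknown slot from a relation with the same label and source."""
    for label, source, target in question:
        if target != UNKNOWN:
            continue
        for s_label, s_source, s_target in statement:
            if s_label == label and s_source == source:
                return s_target
    return None


def answer(
    question: Iterable[Relation],
    repository: List[FrozenSet[Relation]] = REPOSITORY,
) -> Optional[str]:
    """Return the first answer found, or None if the repository holds none."""
    question = list(question)
    for statement in candidate_statements(question, repository):
        found = extract_answer(question, statement)
        if found is not None:
            return found
    return None


# "Why was Aubert awarded the Caméré prize?" in the toy encoding.
why_question = [
    ("gol", "award", "Aubert"),
    ("obj", "award", "Caméré prize"),
    ("rsn", "award", UNKNOWN),
]
print(answer(why_question))
```

Because the question and the repository share one language-independent encoding, the matching itself involves no translation; in this sketch a question with no matching statement simply yields `None`.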
Promising results
Researchers used UNESCO's biographical encyclopaedia as an information base for the exercise concerning the French engineer Jean Aubert (1894-1984). This encyclopaedia has 25 articles, which have been translated into UNL and contain 101 UNL expressions and 2,534 universal words.
The results of this research, with 82% of answers precise, are very promising. A total of 75 different questions (when, how, who) were formulated, to which the right answer was known beforehand. Other questions for which the repository contained no answer were formulated to examine the system's behaviour in such cases. The results confirm the validity of this search engine for developing multilingual question answering systems.
The findings of this research, conducted by Jesús Cardeñosa (VAI director), Carolina Gallardo and Miguel A. de la Villa, were presented at the 8th International Conference FQAS 2009 (Denmark, October 2009) and published in Lecture Notes in Artificial Intelligence 5822, Springer, Berlin, September 2009, pp. 500 ff.
Story Source:
Materials provided by Facultad de Informática de la Universidad Politécnica de Madrid. Note: Content may be edited for style and length.