Platform for Automatic, Normalized Annotation and Cost-Effective Acquisition of Language Resources for Human Language Technologies

PANACEA

http://www.panacea-lr.eu/

The objective of PANACEA is to develop an infrastructure for combining language technologies that focuses on the automatic production of the large-scale Language Resources (LRs) needed by modern Machine Translation and Natural Language Processing applications. To this end, one of the project’s outcomes will be a factory that automates all stages involved in the acquisition, production, updating and maintenance of LRs, so that these resources can be used effectively across different language pairs, genres and domains. The LRs to be produced for the evaluation of the PANACEA factory will include monolingual and bilingual corpora and dictionaries. Reductions in cost, time and human effort are expected to contribute significantly to overcoming the language barriers Europe has to deal with today.

Main objectives:

Creation of an open, web service-based platform for the easy design of workflows that build LRs automatically
Development of techniques for monolingual and parallel corpora acquisition and processing
Use of sententially and sub-sententially aligned data for deriving bilingual dictionaries and extracting transfer grammars
Development of techniques for automatic acquisition of subcategorization frames, selectional preferences, multiword expressions and lexical-semantic classes

Status

Completed

Start Date

01/2010

End Date

12/2013

Responsible

Prokopis Prokopidis