Evalita 2011 “Domain Adaptation for Dependency Parsing”

We organised this track as part of Evalita 2011, the 3rd evaluation campaign of Natural Language Processing and Speech Tools for Italian. The track investigates techniques for adapting state-of-the-art dependency parsing systems to new domains, i.e. domains other than those of the data on which the systems were trained or developed. It has two main novelties: the language, Italian, and the target domain, the legal domain. The track comprises two sub-tasks:

  1. minimally supervised domain adaptation with limited annotated resources in the target domain and unlabeled corpora;
  2. unsupervised domain adaptation with no annotated resources in the target domain, i.e. using only unlabeled target data.

For detailed documentation about the task, see the Domain Adaptation for Dependency Parsing Track home page.

Download

Click here to download the following files:

  • Source domain training and development data
    • Data for training and testing base parsing systems; includes articles from newspapers exemplifying general language.
  • Target domain unlabeled data
    • A large corpus of Italian legislative texts with automatically generated sentence splitting, tokenization, morpho-syntactic tagging and lemmatization. It does not contain labeled dependency relations.
  • Target domain development data
    • Data for testing during system development; includes gold standard annotation.
  • Target domain gold data
    • Test data with gold standard annotation.

(Note: after filling in the request form, the download link will appear at the bottom of the page.)
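
As a rough illustration of how the unlabeled target data might be read once downloaded, here is a minimal Python sketch. It assumes the files use a tab-separated, CoNLL-X-style layout with one token per line and a blank line between sentences. That layout is an assumption, not something stated on this page, so check the task documentation before relying on it. The column names, the `read_conll` and `has_dependencies` helpers, and the sample tokens are all hypothetical.

```python
from typing import Dict, Iterable, Iterator, List

# Column names follow the CoNLL-X convention. That the track data uses
# this layout is an assumption, not something stated on this page.
CONLL_FIELDS = ["id", "form", "lemma", "cpostag", "postag",
                "feats", "head", "deprel"]


def read_conll(lines: Iterable[str]) -> Iterator[List[Dict[str, str]]]:
    """Yield sentences as lists of token dicts; blank lines end a sentence."""
    sentence: List[Dict[str, str]] = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            if sentence:
                yield sentence
                sentence = []
            continue
        values = line.split("\t")
        sentence.append(dict(zip(CONLL_FIELDS, values)))
    if sentence:
        yield sentence


def has_dependencies(sentence: List[Dict[str, str]]) -> bool:
    """True if every token carries both a head index and a relation label."""
    return all(
        tok.get("head", "_") != "_" and tok.get("deprel", "_") != "_"
        for tok in sentence
    )


# Hypothetical two-token sample: tagged and lemmatized, but with the
# dependency columns left empty ("_"), as in unlabeled data.
sample = (
    "1\tLa\til\tR\tRD\tnum=s|gen=f\t_\t_\n"
    "2\tlegge\tlegge\tS\tS\tnum=s|gen=f\t_\t_\n"
    "\n"
)
sentences = list(read_conll(sample.splitlines()))
```

Under these assumptions, `has_dependencies` returns `False` for sentences from the unlabeled corpus and `True` for sentences from the gold-annotated development and test sets, which gives a quick check that the right file has been loaded.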

References

Dell’Orletta F., Marchi S., Montemagni S., Venturi G., Agnoloni T., Francesconi E. (2012), Domain Adaptation for Dependency Parsing at Evalita 2011. In Working notes of EVALITA 2011, 24th-25th January, Rome, Italy, ISSN 2240-5186.

Dell’Orletta F., Marchi S., Montemagni S., Venturi G., Agnoloni T., Francesconi E. (2013), Domain Adaptation for Dependency Parsing at Evalita 2011. In Magnini B., Cutugno F., Falcone M., Pianta E. (eds.), Evaluation of Natural Language and Speech Tools for Italian, LNCS–LNAI, Vol. 7689, Springer–Verlag Berlin Heidelberg, pp. 58–69.