
Publication:

MultiMUC: Multilingual Template Filling on MUC-4

William Gantt, Shabnam Behzad, Hannah YoungEun An, Yunmo Chen, Aaron Steven White, Benjamin Van Durme, M. Yarmohammadi • @European Chapter of the Association for Computational Linguistics • 29 January 2024

TLDR: Introduces translations of the classic MUC-4 template filling benchmark into five languages (Arabic, Chinese, Farsi, Korean, and Russian), built from automatic translations produced by a strong multilingual machine translation system, with the original English annotations manually projected into each target language.

Citations: 2
Abstract: We introduce MultiMUC, the first multilingual parallel corpus for template filling, comprising translations of the classic MUC-4 template filling benchmark into five languages: Arabic, Chinese, Farsi, Korean, and Russian. We obtain automatic translations from a strong multilingual machine translation system and manually project the original English annotations into each target language. For all languages, we also provide human translations for key portions of the dev and test splits. Finally, we present baselines on MultiMUC both with state-of-the-art template filling models for MUC-4 and with ChatGPT. We release MultiMUC and the supervised baselines to facilitate further work on document-level information extraction in multilingual settings.
