Bilingual Tabular Inference: A Case Study on Indic Languages

Chaitanya Agarwal, Vivek Gupta, Anoop Kunchukuttan, Manish Shrivastava • @North American Chapter of the Association for Computational Linguistics • 01 January 2022

TLDR: This study constructs EI-InfoTabS: an English-Indic bTNLI dataset by translating the textual hypotheses of the English TNLI dataset InfoTabS into eleven major Indian languages and thoroughly investigates how pre-trained multilingual models learn and perform on Ei-Info TabS.

Citations: 1

Abstract: Existing research on Tabular Natural Language Inference (TNLI) exclusively examines the task in a monolingual setting where the tabular premise and hypothesis are in the same language. However, due to the uneven distribution of text resources on the web across languages, it is common to have the tabular premise in a high resource language and the hypothesis in a low resource language. As a result, we present the challenging task of bilingual Tabular Natural Language Inference (bTNLI), in which the tabular premise and a hypothesis over it are in two separate languages. We construct EI-InfoTabS: an English-Indic bTNLI dataset by translating the textual hypotheses of the English TNLI dataset InfoTabS into eleven major Indian languages. We thoroughly investigate how pre-trained multilingual models learn and perform on EI-InfoTabS. Our study shows that the performance on bTNLI can be close to its monolingual counterpart, with translate-train, translate-test and unified-train being strongly competitive baselines.

Related Fields of Study

1 Citations No References

Citations

Sort by

Showing results 1 to 0 of 0

References

Sort by

Showing results 1 to 0 of 0