“TyDiQA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages”, 2020-03-10:
Confidently making progress on multilingual modeling requires challenging, trustworthy evaluations. We present TyDiQA, a question answering dataset covering 11 typologically diverse languages with 204K question-answer pairs.
The languages of TyDiQA are diverse with regard to their typology (the set of linguistic features each language expresses), such that we expect models performing well on this set to generalize across a large number of the world's languages.
We present a quantitative analysis of the data quality and example-level qualitative linguistic analyses of observed language phenomena that would not be found in English-only corpora.
To provide a realistic information-seeking task and avoid priming effects, questions are written by people who want to know the answer, but don't know the answer yet, and the data is collected directly in each language without the use of translation.