The issue: We identified that one of the dataset creation scripts did not keep all of the `detected_answers`. This means that when a question had more than one gold answer variant (e.g. "Super Bowl" vs. "Super Bowl 50"), only the occurrences (start/end tokens) of one variant were recorded. Note that `answers`, the original set of answer strings from which `detected_answers` are derived, is unchanged.
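To make the difference concrete, here is a minimal, hypothetical sketch (not the actual MRQA creation script): a toy span matcher run over all answer variants (the corrected behavior) versus over only one variant (the bug). The field names `text` and `token_spans` follow the MRQA `detected_answers` convention, but the matcher itself is illustrative only.

```python
# Toy illustration of the detected_answers bug (NOT the real MRQA script).
context_tokens = ["The", "champions", "won", "Super", "Bowl", "50", "in", "2016"]

# Original gold answer strings ("answers") -- unchanged by the fix.
answers = ["Super Bowl", "Super Bowl 50"]

def detect_spans(answer_variants, tokens):
    """Find every token span whose text exactly matches an answer variant."""
    spans = []
    for ans in answer_variants:
        ans_tokens = ans.split()
        n = len(ans_tokens)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == ans_tokens:
                spans.append({"text": ans, "token_spans": [[i, i + n - 1]]})
    return spans

# Corrected (v2) behavior: spans for every answer variant are kept.
v2_spans = detect_spans(answers, context_tokens)
# Buggy (v1) behavior: only one variant's spans survived.
v1_spans = detect_spans(answers[:1], context_tokens)

print([s["text"] for s in v2_spans])  # ['Super Bowl', 'Super Bowl 50']
print([s["text"] for s in v1_spans])  # ['Super Bowl']
```

A model trained on the v1 data therefore only ever saw one of the gold span variants as a supervision target, even though both appear in `answers`.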
Effects: This only affected training on datasets that have more than one annotated gold span variant per question. The full set of `answers` that the evaluation script uses was unaffected, so baseline results do not change. Visualization was affected, since not all of the true answer options were shown.
Update: The updated v2 datasets are identical to v1 except for the corrected `detected_answers` lists.
I noticed that there is an update in MRQA datasets. Is it possible to provide some details about the changes? Thanks!