The issue: We identified that one of the dataset creation scripts did not keep all of the `detected_answers`. This means that when a question had more than one gold answer variant (e.g. "Super Bowl" vs. "Super Bowl 50"), only the occurrences (start/end tokens) of one variant were recorded. Note that `answers`, the original set of answer strings from which `detected_answers` are derived, is unchanged.
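To make the difference concrete, here is a minimal, hypothetical sketch (not the actual MRQA creation script): a toy span matcher run over all answer variants (the corrected behavior) versus over only one variant (the bug). The field names `text` and `token_spans` follow the MRQA `detected_answers` convention, but the matcher itself is illustrative only.

```python
# Toy illustration of the detected_answers bug (NOT the real MRQA script).
context_tokens = ["The", "champions", "won", "Super", "Bowl", "50", "in", "2016"]

# Original gold answer strings ("answers") -- unchanged by the fix.
answers = ["Super Bowl", "Super Bowl 50"]

def detect_spans(answer_variants, tokens):
    """Find every token span whose text exactly matches an answer variant."""
    spans = []
    for ans in answer_variants:
        ans_tokens = ans.split()
        n = len(ans_tokens)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == ans_tokens:
                spans.append({"text": ans, "token_spans": [[i, i + n - 1]]})
    return spans

# Corrected (v2) behavior: spans for every answer variant are kept.
v2_spans = detect_spans(answers, context_tokens)
# Buggy (v1) behavior: only one variant's spans survived.
v1_spans = detect_spans(answers[:1], context_tokens)

print([s["text"] for s in v2_spans])  # ['Super Bowl', 'Super Bowl 50']
print([s["text"] for s in v1_spans])  # ['Super Bowl']
```

A model trained on the v1 data therefore only ever saw one of the gold span variants as a supervision target, even though both appear in `answers`.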
Effects: This only affected training on datasets that have more than one annotated gold span variant per question. The full set of `answers` that the evaluation script uses was unaffected, so baseline results do not change. Visualization was affected, since not all of the true answer options were shown.
Update: The updated v2 datasets are identical to v1 except for the corrected `detected_answers` lists.
I noticed that there is an update in MRQA datasets. Is it possible to provide some details about the changes? Thanks!