Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Narrative QA Manual #1778

Merged
merged 3 commits into from
Jan 29, 2021
Merged

Conversation

rsanjaykamath
Copy link
Contributor

Submitting the manual version of Narrative QA script which requires a manual download from the original repository

@rsanjaykamath
Copy link
Contributor Author

@lhoestq sorry I opened a new pull request because of some issues with the previous code base. This pull request is originally from #1364

Copy link
Member

@lhoestq lhoestq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really cool thanks ! And good dataset card as well :)

I left a few comments

datasets/narrativeqa_manual/README.md Outdated Show resolved Hide resolved
datasets/narrativeqa_manual/narrativeqa_manual.py Outdated Show resolved Hide resolved
datasets/narrativeqa_manual/narrativeqa_manual.py Outdated Show resolved Hide resolved
datasets/narrativeqa_manual/narrativeqa_manual.py Outdated Show resolved Hide resolved
datasets/narrativeqa_manual/narrativeqa_manual.py Outdated Show resolved Hide resolved
@rsanjaykamath
Copy link
Contributor Author

Excellent comments. Thanks for those valuable suggestions. I changed everything as you have pointed out :)

Copy link
Member

@lhoestq lhoestq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for moving all the processing to _generate_examples !

I was wondering whether we should use the same features format as in the narrative_qa dataset script (same column names, same structure) for consistency. Let me know what you think

datasets/narrativeqa_manual/narrativeqa_manual.py Outdated Show resolved Hide resolved
datasets/narrativeqa_manual/README.md Show resolved Hide resolved
datasets/narrativeqa_manual/README.md Show resolved Hide resolved
@rsanjaykamath
Copy link
Contributor Author

I've copied the same template as NarrativeQA now. Please let me know if this is fine.

Copy link
Member

@lhoestq lhoestq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome thank you !!
This looks all good :)

Just before we merge, I was wondering if you knew why the number of examples in the train set went from 1102 to 32747 in your last commit ? I can't see why the changes in the code would cause such a big difference

@rsanjaykamath
Copy link
Contributor Author

rsanjaykamath commented Jan 28, 2021

Awesome thank you !!
This looks all good :)

Just before we merge, I was wondering if you knew why the number of examples in the train set went from 1102 to 32747 in your last commit ? I can't see why the changes in the code would cause such a big difference

Ok the change was the way I presented the data.
In my previous code, I presented a story with a list of questions-answers related to the story per sample. So the total 1102 was the number of stories (not questions) in the train set.

In the case of NarrativeQA, the code presented each sample data with one single question. So the story gets replicated as many times based on number of questions per story. I felt this was not really memory efficient so I had coded the way I did earlier.

But since this would be inconsistent as you pointed out, I modified my code to suit the NarrativeQA approach. Hope it's clear now :)

@lhoestq
Copy link
Member

lhoestq commented Jan 29, 2021

Ok I see ! that makes sense

Copy link
Member

@lhoestq lhoestq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this one :)

@lhoestq lhoestq merged commit 3e452da into huggingface:master Jan 29, 2021
@rsanjaykamath
Copy link
Contributor Author

Thanks for your time and helping me with all this :) Really appreciate the hardwork you guys do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants