Create prompt_task_map.json #737

Merged 4 commits on Mar 26, 2022

Changes from 1 commit
130 changes: 130 additions & 0 deletions doc/prompt_task_map.json
@@ -0,0 +1,130 @@
{
"commonsense_qa": [
"task073_commonsenseqa_answer_generation"
],
"dream": [
"task247_dream_answer_generation"
],
"quail": [
"task887_quail_answer_generation"
],
"quartz": [
],
"social_i_qa": [
"task384_socialiqa_question_classification",
"task580_socialiqa_answer_generation"
Contributor:
Hmm ... isn't social_i_qa a question answering task?

Contributor:
If so, I am confused about why it is mapped to an "answer_generation" task.

Contributor Author (@yeganehkordi, Mar 11, 2022):
Yes, I mapped our tasks to the datasets. They have created the following task types from social_i_qa:

  • answer verification (task384_socialiqa_question_classification)
  • multiple-choice question answering (task580_socialiqa_answer_generation)
  • contextual question answering without options (missing task)
  • question generation from the given context and answer (missing task)
    (They have one prompt for the answer with the option index and one for the answer with the option string. I just added one of them as a missing task.)

Contributor:
Could you share the pointers to these?

answer verification (task384_socialiqa_question_classification)
multiple-choice question answering (task580_socialiqa_answer_generation)

Contributor Author (@yeganehkordi, Mar 11, 2022):
Yeah:
answer verification (task384_socialiqa_question_classification) -> data in sheet, task
multiple-choice question answering (task580_socialiqa_answer_generation) -> data in sheet, task
You can see the prompts in the hosted version of PromptSource.
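A minimal sketch of how one could also list those prompts locally with the promptsource library (this assumes promptsource is installed; it is not part of this PR):

from promptsource.templates import DatasetTemplates

# Print every prompt (template) name registered for social_i_qa.
templates = DatasetTemplates("social_i_qa")
for name in templates.all_template_names:
    print(name)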

Contributor Author:
I've added the prompt name and id as a key in the json files.
Also, I can change this file to a prompt-to-task map.
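One hypothetical shape for such a prompt-to-task map, mirroring the file's own JSON format (the prompt names below are illustrative placeholders, not the actual PromptSource names):

"social_i_qa": {
    "Generate answer": "task580_socialiqa_answer_generation",
    "Check if a reasonable answer": "task384_socialiqa_question_classification"
}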

Contributor:
> I've added the prompt name and id as a key in the json files.

Maybe I'm missing something here. Does the name of these json files tell us whether they correspond to socialiqa_question_classification or socialiqa_answer_generation?

My understanding is that here we have a 1-to-2 mapping (as opposed to a 1-to-1 mapping). It's possible that I am missing something, in which case, help me see it! :)

Contributor Author:
No, you're right! I think I can use a better architecture here.
The script gets the name of a dataset and generates a task for each prompt, and the names of the json files are based on the dataset and prompt names (e.g., task_socialiqa_Generate_answer).
The mapping is currently based on the datasets and our tasks; it doesn't map the tasks to the prompts. So I need to add the prompt-task correspondence.
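A minimal sketch of that per-prompt generation step (the function name, output layout, and promptsource dependency are illustrative assumptions, not the actual script):

import json
from promptsource.templates import DatasetTemplates

def generate_task_files(dataset_name: str, out_dir: str = ".") -> None:
    # One task file per prompt; the file name combines the dataset and
    # prompt names, e.g. task_social_i_qa_Generate_answer.json.
    templates = DatasetTemplates(dataset_name)
    for prompt_name in templates.all_template_names:
        safe_name = prompt_name.replace(" ", "_")
        path = f"{out_dir}/task_{dataset_name}_{safe_name}.json"
        with open(path, "w") as f:
            json.dump({"dataset": dataset_name, "prompt": prompt_name}, f, indent=2)

generate_task_files("social_i_qa")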

Contributor:
Yeah exactly.

Contributor Author:
I've updated the file to map the tasks to the prompts. I'll add the T0p and T0pp datasets tomorrow.

],
"wiqa": [
],
"cosmos_qa": [
Contributor:
I noticed that you included the question generation tasks for some QA datasets. This is good because I remember the original T0 also uses such prompts. But for some QA datasets, you didn't include the question generation task (e.g., task023_cosmosqa_question_generation can be included here for cosmos_qa). Can we add all of them if we already have them in our current data?

Contributor Author:
I've only added the tasks that had an equivalent prompt. In this case, we have a question generation task, but they don't have a question generation prompt for cosmos_qa.

"task024_cosmosqa_answer_generation"
],
"qasc": [
Contributor:
We have qasc tasks, right? task040_qasc_question_generation, task041_qasc_answer_generation.

Contributor Author:
Yes, but we don't have any shared task with them.

],
"quarel": [
"task1378_quarel_correct_answer_generation"
],
"sciq": [
"task591_sciq_answer_generation"
],
"wiki_hop/original": [
],
"wiki_hop/masked": [
],
"adversarial_qa/adversarialQA": [
Contributor:
I think it's fine to keep all four settings for adversarial_qa here. But when we add the missing tasks later, please just use this adversarialQA setting. Or, probably we can just drop the other three now to avoid confusion?

Contributor Author:
I don't know the setting of our task, and I'm not sure whether it is adversarial_qa or not.
If we don't need all the subsets, we can change the instances of this task and just keep adversarial_qa.

],
"adversarial_qa/dbidaf": [
],
"adversarial_qa/dbert": [
],
"adversarial_qa/droberta": [
],
"quoref": [
"task002_quoref_answer_generation"
],
"duorc/SelfRC": [
"task194_duorc_answer_generation",
"task193_duorc_question_generation"
],
"duorc/ParaphraseRC": [
"task194_duorc_answer_generation",
"task193_duorc_question_generation"
],
"ropes": [
"task061_ropes_answer_generation"
],
"hotpot_qa/distractor": [
Contributor:
T0 casts hotpot_qa as a closed_book QA task. Can we add a hotpot_qa/closed_book placeholder here, and add the task together with the other missing tasks?

Contributor Author:
Can you elaborate more?
They don't have any closed_book prompt. Are you suggesting adding this task without a prompt?

Contributor:
It seems they used the KILT version of hotpotqa, according to the list here.

Contributor Author:
Thanks! I'll add it.

"task170_hotpotqa_answer_generation",
"task192_hotpotqa_sentence_generation",
"task191_hotpotqa_question_generation"
],
"hotpot_qa/fullwiki": [
"task170_hotpotqa_answer_generation",
"task192_hotpotqa_sentence_generation",
"task191_hotpotqa_question_generation"
],
"wiki_qa": [
],
"common_gen": [
"task102_commongen_sentence_generation"
],
"wiki_bio": [
],
"amazon_polarity": [
"task493_review_polarity_classification"
],
"amazon_reviews_multi/en": [
"task618_amazonreview_summary_text_generation",
"task1310_amazonreview_rating_classification",
"task617_amazonreview_category_text_generation"
],
"amazon_us_reviews/Wireless_v1_00": [
"task1342_amazon_us_reviews_title"
],
"imdb": [
"task284_imdb_classification"
],
"rotten_tomatoes": [
"task888_reviews_classification"
],
"yelp_polarity": [
"task475_yelp_polarity_classification"
],
"yelp_review_full": [
],
"cnn_dailymail/3.0.0": [
"task1553_cnn_dailymail_summarization"
],
"gigaword": [
"task288_gigaword_summarization"
],
"multi_news": [
],
"samsum": [
"task1572_samsum_summary"
],
"xsum": [
],
"ag_news": [
"task1541_agnews_classification"
],
"dbpedia_14": [
"task629_dbpedia_14_classification",
"task633_dbpedia_14_answer_generation"
],
"trec": [
],
"glue/mrpc": [
],
"paws/labeled_final": [
"task400_paws_paraphrase_classification"
],
"paws/labeled_swap": [
"task400_paws_paraphrase_classification"
],
"paws/unlabeled_final": [
"task400_paws_paraphrase_classification"
],
"glue/qqp": [
]
}
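A minimal sketch of how this map could be consumed once merged (the path follows the diff header above; the schema is exactly the dataset-to-task-list mapping shown):

import json

with open("doc/prompt_task_map.json") as f:
    prompt_task_map = json.load(f)

# Datasets whose task list is still empty, i.e. prompts with no equivalent task yet.
missing = [name for name, tasks in prompt_task_map.items() if not tasks]
print(missing)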