Create prompt_task_map.json #737
Merged: 4 commits into allenai:master, Mar 26, 2022
Conversation

yeganehkordi (Contributor) commented Mar 11, 2022:

Here is the map of the shared tasks between our tasks and the training datasets of the T0 model.
They trained the model on 35 datasets. We have 31 shared tasks and many missing tasks, which I've listed in the spreadsheet.
Also, for the paws, duorc, amazon_us_reviews, and hotpotqa datasets, our tasks don't target a specific subset, so we may need to add them again.
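
For reference, the file maps each T0 training dataset (or dataset/subset) to the list of our tasks that share it. A minimal sketch of the format (the social_i_qa entry is taken from the file; the empty entry illustrates a dataset with no shared task yet):

    {
      "social_i_qa": [
        "task384_socialiqa_question_classification",
        "task580_socialiqa_answer_generation"
      ],
      "wiqa": []
    }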

danyaljj requested review from Palipoor and yizhongw on Mar 11, 2022.

Review thread on doc/prompt_task_map.json:

    ],
    "social_i_qa": [
        "task384_socialiqa_question_classification",
        "task580_socialiqa_answer_generation"

Contributor:

Hmm ... isn't social_i_qa a question answering task?

Contributor:

If so, I am confused about why it is mapped to an "answer_generation" task.

yeganehkordi (Author) commented Mar 11, 2022:

Yes, I mapped our tasks to the datasets. They created the following task types from social_i_qa:

  • answer verification (task384_socialiqa_question_classification)
  • multiple-choice question answering (task580_socialiqa_answer_generation)
  • contextual question answering without options (missing task)
  • question generation from the given context and answer (missing task)
    (They have one prompt that answers with the option index and one that answers with the option string; I only added one of them as a missing task.)

Contributor:

Could you share the pointers to these?

answer verification (task384_socialiqa_question_classification)
multiple-choice question answering (task580_socialiqa_answer_generation)

yeganehkordi (Author) commented Mar 11, 2022:

Yeah, answer verification (task384_socialiqa_question_classification) -> data in sheet, task
multiple-choice question answering (task580_socialiqa_answer_generation) -> data in sheet, task
You can see the prompts in the hosted version of PromptSource.

yeganehkordi (Author):

I've added the prompt name and id as keys in the JSON files.
Also, I can change this file to a prompt-to-task map.

Contributor:

> I've added the prompt name and id as keys in the JSON files.

Maybe I'm missing something here. Does the name of these json files tell us whether they correspond to socialiqa_question_classification or socialiqa_answer_generation?

My understanding is that here we have a 1-to-2 mapping (as opposed to a 1-to-1 mapping). It's possible that I am missing something, in which case, help me see it! :)

yeganehkordi (Author):

No, you're right! I think I can use a better architecture here.
The script takes the name of a dataset and generates a task for each of its prompts, and the names of the JSON files are based on the dataset and prompt names (e.g., task_socialiqa_Generate_answer).
The current mapping is based on the datasets and our tasks; it doesn't map the tasks to the prompts. So I need to add the prompt-task correspondence, along the lines of the sketch below.
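
A minimal sketch of what such a prompt-keyed mapping could look like (the layout and the "<prompt id>" value are hypothetical placeholders, not the actual PromptSource ids or the final file layout):

    {
      "social_i_qa": {
        "Generate answer": {
          "prompt_id": "<prompt id>",
          "task": "task580_socialiqa_answer_generation"
        }
      }
    }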

Contributor:

Yeah exactly.

yeganehkordi (Author):

I've updated the file to map the tasks to the prompts. I'll add the T0p and T0pp datasets tomorrow.

Palipoor (Contributor):

Hi! I am sorry for my absence, @danyaljj @yeganehkordi.
This list looks good. Do we have a list of the task types that we have and they don't? (e.g., task246_dream_question_generation)

yeganehkordi (Author):

> Hi! I am sorry for my absence, @danyaljj @yeganehkordi. This list looks good. Do we have a list of the task types that we have and they don't? (e.g., task246_dream_question_generation)

Yes, the list of the missing tasks is in the spreadsheet (based on the dataset and prompt names).
A list of the tasks that we have is in this JSON file.

yizhongw (Contributor):

@yeganehkordi Question: are you including only the T0 training datasets here? It seems some common datasets (e.g., COPA, BoolQ, WSC, etc.) are missing. Those are used for training T0p, T0pp as well as for evaluation. Could we add them?

yeganehkordi (Author):

> @yeganehkordi Question: are you including only the T0 training datasets here? It seems some common datasets (e.g., COPA, BoolQ, WSC, etc.) are missing. Those are used for training T0p, T0pp as well as for evaluation. Could we add them?

Yes, they are only the T0 training datasets. I'll add the T0p and T0pp training datasets.

yizhongw (Contributor):

> Yes, they are only the T0 training datasets. I'll add the T0p and T0pp training datasets.

Thanks! And also the test tasks mentioned here. We had their mapping before in our working doc, but it's not in a prompt-to-task format.

yeganehkordi (Author):

> Thanks! And also the test tasks mentioned here. We had their mapping before in our working doc, but it's not in a prompt-to-task format.

Will do.

yizhongw (Contributor) left a review:

I think the organization of datasets, NI tasks, and PS prompts is much clearer now. I left some comments for minor fixes. @Palipoor, could you work on the missing datasets?

Review thread on doc/prompt_task_map.json:

    ]
    }
    ],
    "hotpot_qa/distractor": [

yizhongw (Contributor):

T0 casts hotpot_qa as a closed-book QA task. Can we add a hotpot_qa/closed_book placeholder here, and add the task together with the other missing tasks?

yeganehkordi (Author):

Can you elaborate? They don't have any closed_book prompt. Are you suggesting adding this task without a prompt?

yizhongw (Contributor):

It seems they used the KILT version of hotpotqa, according to the list here.
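
If so, the placeholder here would presumably key off the KILT variant rather than hotpot_qa/distractor; assuming the Hugging Face dataset name kilt_tasks/hotpotqa, the entry would look something like:

    "kilt_tasks/hotpotqa": []

with the matching tasks to be filled in later.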

yeganehkordi (Author):

Thanks! I'll add it.

Review thread on doc/prompt_task_map.json:

    ],
    "winogrande/winogrande_l": [
    ],
    "winogrande/winogrande_debiased": [

yizhongw (Contributor):

For winogrande, could you confirm that the xl/xs/s/m/l/debiased settings only differ in their training sizes? If so, I think we can remove the others and keep only this debiased setting.

yeganehkordi (Author):

Yes, their only difference is their sizes.
Sure, will do.

Review thread on doc/prompt_task_map.json:

    ]
    }
    ],
    "super_glue/copa": [

yizhongw (Contributor):

Does task828_copa_commonsense_cause_effect or task827_copa_commonsense_reasoning correspond to this? We have been using these two tasks for our current evaluation.

yeganehkordi (Author):

There is a slight difference between our task and theirs. In our task, one chooses the completion that is either the cause or the effect of the first sentence; in their instances, they specify which of the two the completion should be.
So I think our task is more general and probably more difficult.

yizhongw (Contributor):

Got it. Are we going to add a task similar to theirs?

yeganehkordi (Author):

Yeah, I will.

Review thread on doc/prompt_task_map.json:

    ],
    "wiki_hop/masked": [
    ],
    "adversarial_qa/adversarialQA": [

yizhongw (Contributor):

I think it's fine to keep all four settings for adversarial_qa here. But when we add the missing tasks later, please use only this adversarialQA setting. Or perhaps we can just drop the other three now to avoid confusion?

yeganehkordi (Author):

I don't know which setting our task uses, and I'm not sure whether it is adversarialQA or not.
If we don't need all the subsets, we can change the instances of this task and keep just adversarial_qa.

Review thread on doc/prompt_task_map.json:

    ]
    }
    ],
    "qasc": [

yizhongw (Contributor):

We have qasc tasks, right? task040_qasc_question_generation and task041_qasc_answer_generation.

yeganehkordi (Author):

Yes, but we don't have any shared task with them.

Review thread on doc/prompt_task_map.json:

    ],
    "wiqa": [
    ],
    "cosmos_qa": [

yizhongw (Contributor):

I noticed that you included the question generation tasks for some QA datasets. This is good, because I remember the original T0 also uses such prompts. But for some QA datasets you didn't include the question generation task (e.g., task023_cosmosqa_question_generation could be included here for cosmos_qa). Can we add all of them, if we already have them in our current data?

yeganehkordi (Author):

I've only added the tasks that had an equivalent prompt. In this case, we have a question generation task, but they don't have a question generation prompt for cosmos_qa.

danyaljj (Contributor) commented Mar 26, 2022:

Merging this PR, since we've iterated over it several times now. If anything is missing, let's address it in another PR.

danyaljj merged commit bbee421 into allenai:master on Mar 26, 2022.