diff --git a/neural_seq_qa/.gitignore b/neural_seq_qa/.gitignore
index 3cb1a39c52..ab866cc4c4 100644
--- a/neural_seq_qa/.gitignore
+++ b/neural_seq_qa/.gitignore
@@ -31,3 +31,6 @@ eval.*.txt
 models*
 *.log
 run.sh
+test.ann.output.txt.gz
+test.ir.output.txt.gz
+pre-trained-models/*.gz
diff --git a/neural_seq_qa/README.md b/neural_seq_qa/README.md
index 52c91a118d..b8cf2d55fc 100644
--- a/neural_seq_qa/README.md
+++ b/neural_seq_qa/README.md
@@ -79,3 +79,46 @@ where
 * `MODEL_FILE`: a trained model produced by `train.py`.
 * `INPUT_DATA`: input data in the same format as the validation/test sets of the WebQA dataset.
 * `OUTPUT_FILE`: results in the format specified in the WebQA dataset for the evaluation scripts.
+
+# Pre-trained Models
+
+We provide two pre-trained models: one for the validation and test sets with annotated evidence, and one for those with retrieved evidence. Each model was selected by its performance on the corresponding version of the validation set, consistent with the paper.
+
+The models can be downloaded with
+```bash
+cd pre-trained-models && ./download-models.sh && cd ..
+```
+
+The evaluation result on the test set with annotated evidence can be reproduced with
+
+```bash
+PYTHONPATH=data/evaluation:$PYTHONPATH python infer.py \
+    pre-trained-models/params_pass_00010.tar.gz \
+    data/data/test.ann.json.gz \
+    test.ann.output.txt.gz
+
+PYTHONPATH=data/evaluation:$PYTHONPATH \
+    python data/evaluation/evaluate-tagging-result.py \
+    test.ann.output.txt.gz \
+    data/data/test.ann.json.gz \
+    --fuzzy --schema BIO2
+# The result should be
+# chunk_f1=0.739091 chunk_precision=0.686119 chunk_recall=0.800926 true_chunks=3024 result_chunks=3530 correct_chunks=2422
+```
+
+The evaluation result on the test set with retrieved evidence can be reproduced with
+
+```bash
+PYTHONPATH=data/evaluation:$PYTHONPATH python infer.py \
+    pre-trained-models/params_pass_00021.tar.gz \
+    data/data/test.ir.json.gz \
+    test.ir.output.txt.gz
+
+PYTHONPATH=data/evaluation:$PYTHONPATH \
+    python data/evaluation/evaluate-voting-result.py \
+    test.ir.output.txt.gz \
+    data/data/test.ir.json.gz \
+    --fuzzy --schema BIO2
+# The result should be
+# chunk_f1=0.749358 chunk_precision=0.727868 chunk_recall=0.772156 true_chunks=3024 result_chunks=3208 correct_chunks=2335
+```
diff --git a/neural_seq_qa/index.html b/neural_seq_qa/index.html
index a18ebe864b..fbe97eee10 100644
--- a/neural_seq_qa/index.html
+++ b/neural_seq_qa/index.html
@@ -122,6 +122,49 @@
 * `INPUT_DATA`: input data in the same format as the validation/test sets of the WebQA dataset.
 * `OUTPUT_FILE`: results in the format specified in the WebQA dataset for the evaluation scripts.
+
+# Pre-trained Models
+
+We provide two pre-trained models: one for the validation and test sets with annotated evidence, and one for those with retrieved evidence. Each model was selected by its performance on the corresponding version of the validation set, consistent with the paper.
+
+The models can be downloaded with
+```bash
+cd pre-trained-models && ./download-models.sh && cd ..
+```
+
+The evaluation result on the test set with annotated evidence can be reproduced with
+
+```bash
+PYTHONPATH=data/evaluation:$PYTHONPATH python infer.py \
+    pre-trained-models/params_pass_00010.tar.gz \
+    data/data/test.ann.json.gz \
+    test.ann.output.txt.gz
+
+PYTHONPATH=data/evaluation:$PYTHONPATH \
+    python data/evaluation/evaluate-tagging-result.py \
+    test.ann.output.txt.gz \
+    data/data/test.ann.json.gz \
+    --fuzzy --schema BIO2
+# The result should be
+# chunk_f1=0.739091 chunk_precision=0.686119 chunk_recall=0.800926 true_chunks=3024 result_chunks=3530 correct_chunks=2422
+```
+
+The evaluation result on the test set with retrieved evidence can be reproduced with
+
+```bash
+PYTHONPATH=data/evaluation:$PYTHONPATH python infer.py \
+    pre-trained-models/params_pass_00021.tar.gz \
+    data/data/test.ir.json.gz \
+    test.ir.output.txt.gz
+
+PYTHONPATH=data/evaluation:$PYTHONPATH \
+    python data/evaluation/evaluate-voting-result.py \
+    test.ir.output.txt.gz \
+    data/data/test.ir.json.gz \
+    --fuzzy --schema BIO2
+# The result should be
+# chunk_f1=0.749358 chunk_precision=0.727868 chunk_recall=0.772156 true_chunks=3024 result_chunks=3208 correct_chunks=2335
+```
+
diff --git a/neural_seq_qa/pre-trained-models/download-models.sh b/neural_seq_qa/pre-trained-models/download-models.sh
new file mode 100755
index 0000000000..6dc4ce6606
--- /dev/null
+++ b/neural_seq_qa/pre-trained-models/download-models.sh
@@ -0,0 +1,17 @@
+#!/bin/bash
+if [[ -f params_pass_00010.tar.gz ]] && [[ -f params_pass_00021.tar.gz ]]; then
+    echo "models already exist"
+    exit 0
+else
+    wget -c http://cloud.dlnel.org/filepub/?uuid=d9a00599-1f66-4549-867b-e958f96474ca \
+        -O neural_seq_qa.pre-trained-models.2017-10-27.tar.gz
+fi
+
+if md5sum -c neural_seq_qa.pre-trained-models.2017-10-27.tar.gz.md5; then
+    tar xf neural_seq_qa.pre-trained-models.2017-10-27.tar.gz
+    rm neural_seq_qa.pre-trained-models.2017-10-27.tar.gz
+else
+    echo "download data error!" >&2
+    exit 1
+fi
+
diff --git a/neural_seq_qa/pre-trained-models/neural_seq_qa.pre-trained-models.2017-10-27.tar.gz.md5 b/neural_seq_qa/pre-trained-models/neural_seq_qa.pre-trained-models.2017-10-27.tar.gz.md5
new file mode 100644
index 0000000000..209d317a35
--- /dev/null
+++ b/neural_seq_qa/pre-trained-models/neural_seq_qa.pre-trained-models.2017-10-27.tar.gz.md5
@@ -0,0 +1 @@
+77339985bab7ba173e2f368d9f9d684b  neural_seq_qa.pre-trained-models.2017-10-27.tar.gz
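The integrity check that `download-models.sh` performs can be sketched in isolation. This is a minimal illustration of the `md5sum` manifest pattern, not part of the patch; `demo.txt` and its contents are placeholders standing in for the real archive:

```shell
#!/bin/bash
# Sketch of verifying a file against an .md5 manifest (placeholder file names).
printf 'hello' > demo.txt              # stand-in for the downloaded archive
md5sum demo.txt > demo.txt.md5         # manifest line: "<hash>  demo.txt"
if md5sum -c demo.txt.md5 > /dev/null 2>&1; then
    echo "checksum OK"                 # file is intact; safe to unpack
else
    echo "checksum mismatch" >&2       # corrupt or partial download
fi
rm -f demo.txt demo.txt.md5            # clean up the placeholders
```

`md5sum -c` exits non-zero on a mismatch, so checking its exit status directly is simpler and more robust than capturing its output and grepping for `OK`.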