fix: document retrieval metrics for non-document_id document_relevance_criteria #3885
Conversation
@tstadel please rebase to
```diff
@@ -1690,6 +1690,7 @@ def _build_eval_dataframe(
     df_docs.map_rows = partial(df_docs.apply, axis=1)
     df_docs.rename(columns={"id": "document_id", "content": "context"}, inplace=True)
     df_docs["gold_document_ids"] = [gold_document_ids] * len(df_docs)
+    df_docs["gold_answers"] = [gold_answers] * len(df_docs)
```
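For orientation, here is a minimal, self-contained sketch of what this broadcasting does. The data and variable values are hypothetical, not taken from the Haystack test suite:

```python
import pandas as pd

# Hypothetical retrieval result and labels for a single query.
retrieved = [
    {"id": "d1", "content": "Paris is the capital of France."},
    {"id": "d2", "content": "Berlin is the capital of Germany."},
]
gold_document_ids = ["d1"]
gold_answers = ["Paris"]

df_docs = pd.DataFrame(retrieved)
df_docs.rename(columns={"id": "document_id", "content": "context"}, inplace=True)
# Every row carries the full per-query label lists, so each retrieved
# document can later be checked against all gold ids/answers.
df_docs["gold_document_ids"] = [gold_document_ids] * len(df_docs)
df_docs["gold_answers"] = [gold_answers] * len(df_docs)
```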
Added for easier analysis, e.g. if document_relevance_criterion="answer".
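As a hypothetical illustration of that kind of analysis (the matches_gold_answer column is made up, building on the sketch above, and is not part of the actual eval dataframe schema):

```python
# With gold_answers present on every row, a quick relevance check under the
# "answer" criterion becomes a one-liner over the dataframe.
df_docs["matches_gold_answer"] = df_docs.apply(
    lambda row: any(answer in row["context"] for answer in row["gold_answers"]),
    axis=1,
)
print(df_docs[["document_id", "matches_gold_answer"]])
```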
Just to clarify: this won't be a problem if there are no gold_answers in the eval set? For example, in the case where we use context for the relevance criterion, gold_answers may not be available.
Exactly. I've added tests for this case.
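The actual tests live in the PR; as a rough, self-contained sketch of the property being covered (all names hypothetical, not the Haystack API), a pytest-style check that context-based matching works without gold answers could look like:

```python
# Hypothetical sketch: relevance matching must not require gold_answers
# when the criterion is "context".
def is_relevant(doc_context, gold_contexts, gold_answers, criterion):
    if criterion == "context":
        return any(gold in doc_context for gold in gold_contexts)
    if criterion == "answer":
        return any(answer in doc_context for answer in gold_answers)
    raise ValueError(f"unknown criterion: {criterion}")


def test_context_criterion_without_gold_answers():
    assert is_relevant(
        doc_context="Berlin is the capital of Germany.",
        gold_contexts=["Berlin is the capital"],
        gold_answers=[],  # no gold answers in the eval set
        criterion="context",
    )
```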
Thanks @tstadel this looks good to me! I don't fully understand how the code is structured, but I appreciate that the tests for the metrics look correct now.
I agree, the current code could be better structured. However, since nothing else uses these parts besides calculate_metrics, refactoring it now wouldn't bring much value. Let's refactor in another PR, probably connecting it to a feature to support custom metrics.
fix: document retrieval metrics for non-document_id document_relevance_criteria (#3885)
* fix document retrieval metrics for all document_relevance_criteria
* fix tests
* fix eval_batch metrics
* small refactorings
* evaluate metrics on label level
* document retrieval tests added
* fix pylint
* fix test
* support file retrieval
* add comment about threshold
* rename test
Related Issues
Proposed Changes:
- fix document retrieval metrics when document_relevance_criteria does not contain document_id
- fix document retrieval metrics when document_relevance_criteria does not contain document_id and there are multiple labels (see the sketch below)
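A rough sketch (hypothetical function, not the Haystack implementation) of why multiple labels matter here: a retrieved document should count as relevant if it satisfies the criterion for any of the query's labels, not only the first one:

```python
# Hypothetical: single-hit recall where relevance is decided per criterion
# across ALL labels of a query.
def recall_single_hit(retrieved_contexts, labels, criterion="answer"):
    def matches(context, label):
        if criterion == "answer":
            return label["answer"] in context
        if criterion == "context":
            return label["context"] in context
        raise ValueError(f"unknown criterion: {criterion}")

    return float(
        any(matches(c, label) for c in retrieved_contexts for label in labels)
    )


labels = [
    {"answer": "Paris", "context": "Paris is the capital of France."},
    {"answer": "Berlin", "context": "Berlin is the capital of Germany."},
]
# Relevant via the second label only; must still count as a hit.
print(recall_single_hit(["Berlin is the capital of Germany."], labels))  # 1.0
```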
How did you test it?
Notes for the reviewer
Checklist
- The PR title follows the conventional commit format, using one of the prefixes: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test:.