Add SentenceTransformersRanker with pre-trained Cross-Encoder #1209
I think the separation makes sense (at least for now). We might combine them later and instead add a similarity_type argument to differentiate between the two approaches.
Please add some more documentation (see comments) and a basic test case for both rankers that ensures the expected scores / sorting of some dummy docs (FARM + sentencetransformers).
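As a rough illustration of the kind of test meant here, a minimal sketch in plain Python. The rerank helper and the token-overlap scorer are hypothetical stand-ins for a ranker and its model, not the FARMRanker or SentenceTransformersRanker API; a real test would call the rankers themselves on a few dummy docs.

```python
from typing import Callable, List


def rerank(query: str, docs: List[str], score_fn: Callable[[str, str], float]) -> List[str]:
    """Sort docs by descending relevance score for the query (hypothetical helper)."""
    return sorted(docs, key=lambda d: score_fn(query, d), reverse=True)


def dummy_overlap_score(query: str, doc: str) -> float:
    """Hypothetical stand-in for a model: counts lowercase tokens shared by query and doc."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def test_rerank_orders_dummy_docs():
    query = "capital of germany"
    docs = [
        "Paris is known for the Eiffel Tower",
        "Berlin is the capital of Germany",
        "The capital of France is Paris",
    ]
    ranked = rerank(query, docs, dummy_overlap_score)
    # The doc sharing the most query tokens should come first,
    # the one sharing none should come last.
    assert ranked[0] == "Berlin is the capital of Germany"
    assert ranked[-1] == "Paris is known for the Eiffel Tower"
```

Swapping dummy_overlap_score for a call into each ranker would give one test per implementation with the same expected ordering.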
@tholor Thank you for your feedback! I made the requested changes.
Looking good! Thx for the changes.
I added a sentence to the docstring: we should keep in mind that some users might not know what re-ranking is, so the docstring should explain the value / use case at least very briefly.
I also added the import to the init so that we can just write from haystack.ranker import SentenceTransformersRanker, similar to our other building blocks.
In contrast to FARMRanker, SentenceTransformersRanker uses the logit as the similarity score and not the classifier's probability of label "1".
See the example here: https://www.sbert.net/docs/pretrained-models/ce-msmarco.html#usage-with-transformer
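To make the logit versus probability distinction concrete, here is a small sketch in plain Python. It loads no model; the logit values are made up for illustration, and it assumes the cross-encoder emits a single logit per query-document pair while the classifier-style ranker emits a two-class logit pair.

```python
import math


def softmax_prob_label_1(logits):
    """Probability of label "1" from a two-class logit pair [logit_0, logit_1]."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return exps[1] / sum(exps)


def sigmoid(x):
    """Map a single raw logit to the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical single-logit scores per document (unbounded, can be negative).
raw_logits = {"doc_a": 8.2, "doc_b": -3.1, "doc_c": 0.4}

# Ranking by the raw logit and by its sigmoid gives the same order,
# because the sigmoid is monotonically increasing.
ranked_by_logit = sorted(raw_logits, key=raw_logits.get, reverse=True)
ranked_by_sigmoid = sorted(raw_logits, key=lambda d: sigmoid(raw_logits[d]), reverse=True)
```

The practical consequence is that the two rankers' scores are on different scales: a probability lies between 0 and 1, while a raw logit is unbounded, so scores from the two rankers should not be compared or thresholded in the same way.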
I tested with a subset of the nq_dev dataset. Here are the results of a pipeline with ElasticsearchRetriever and SentenceTransformersRanker with "cross-encoder/ms-marco-MiniLM-L-12-v2" as model:
Limitations: the documentation on the website has not been updated. At the moment it might be unclear or confusing for users whether to use FARMRanker or SentenceTransformersRanker.
closes #1129