-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Weaviate] Exit the while loop when we query less documents than available #2537
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good 🙂
I wonder if a piece is missing though. Although the original issue is solved (get_all_documents
work regardless of QUERY_MAXIMUM_RESULTS
), still it would be confusing for users not to be able to query more than 10k docs no matter what. We could:
- Add a very loud warning if
get_documents_count
is aboveQUERY_MAXIMUM_RESULTS
, or - Add a very loud warning if a user tries to write more than
QUERY_MAXIMUM_RESULTS
into Weaviate.
In both cases, the user is at least aware that this might be an issue: I have the impression most Haystack users are not aware of this specific Weaviate quirk and would benefit a hint.
I think this is a small change worth adding here. Let me know if you prefer a separate PR for it, in this case I will approve this one.
I went with the former because it looks less obtrusive. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good! I left a couple of small comments but nothing important 😊
fixes #2517
When a query returns less documents than available in the store (e.g. when there are more documents than
QUERY_MAXIMUM_RESULTS
(see Weaviate docs), thewhile
loop paginating the results never exits. This is because the offset increases as we fetch more documents, but afterQUERY_MAXIMUM_RESULTS
docs, the query starts returning an empty list, the offset stays the same and we get stuck.Proposed changes:
Status (please check what you already did):