[DOC] Faiss Scalar Quantization FP16 (SQfp16) #5038
Comments
Please add the v2.11 label.
Please change the label to v2.12.
@naveentatikonda - Is this still on track for 2.12? Thanks!
Yes @hdhalter, this feature should be ready for 2.12. We are just waiting on the Faiss maintainers to merge our changes. Thanks!
What do you want to do?
Tell us about your request. Provide a summary of the request and all versions that are affected.
In the k-NN plugin, we mainly support vectors of type float, where each dimension takes 32 bits. This becomes expensive for use cases that require large-scale ingestion, because constructing, loading, saving, and searching graphs (for the native engines nmslib and faiss) grows costlier as the data volume increases. Byte vectors are supported, but only with the Lucene engine, and they come with a considerable reduction in recall compared to 32-bit floats.
Adding support for Faiss SQFP16 helps reduce the memory and storage footprint without compromising recall. When a user provides 32-bit float vectors, the Faiss engine quantizes them to FP16 using its scalar quantizer (users don’t need to do any quantization on their end), stores them, and decodes them back to FP32 when returning results during search operations.
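To make the quantize, store, and decode round trip concrete, here is a minimal sketch in plain Python. It is not the Faiss scalar quantizer or the k-NN plugin code; it only uses the standard library's IEEE 754 half-precision support to illustrate the storage saving and the rounding error involved. The helper names (`encode_fp16`, `decode_fp16`, `encode_fp32`), the 128-dimension vector, and the [-1, 1] value range are assumptions chosen for illustration.

```python
import random
import struct


def encode_fp32(vector):
    """Pack floats as 32-bit values (4 bytes per dimension)."""
    return struct.pack(f"<{len(vector)}f", *vector)


def encode_fp16(vector):
    """Pack floats as IEEE 754 half-precision values (2 bytes per dimension).

    Values outside the FP16 range (roughly +/-65504) raise OverflowError here.
    """
    return struct.pack(f"<{len(vector)}e", *vector)


def decode_fp16(blob):
    """Unpack half-precision bytes back into Python floats."""
    return list(struct.unpack(f"<{len(blob) // 2}e", blob))


# Hypothetical 128-dimensional vector with components in [-1, 1].
random.seed(0)
vector = [random.uniform(-1.0, 1.0) for _ in range(128)]

fp32_blob = encode_fp32(vector)
fp16_blob = encode_fp16(vector)
decoded = decode_fp16(fp16_blob)

# Largest per-dimension difference introduced by the FP16 round trip.
max_abs_error = max(abs(a - b) for a, b in zip(vector, decoded))

print("fp32 bytes:", len(fp32_blob))
print("fp16 bytes:", len(fp16_blob))
print("max abs error:", max_abs_error)
```

The encoded vector takes half the bytes of its 32-bit form, and for components in this range the per-dimension rounding error stays small. How that translates into recall depends on the data set and should be measured rather than assumed from this sketch.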
This feature is expected to launch in OpenSearch 2.12 and will be supported in 2.12 and later versions.
What other resources are available? Provide links to related issues, POCs, steps for testing, etc.
opensearch-project/k-NN#1138
opensearch-project/k-NN#830