[Bug] Get "can't connect to retriever" error when concurrency exceeds 32 #1556
Comments
What GenAIExamples/ChatQnA/benchmark/performance/kubernetes/intel/gaudi/benchmark.yaml Line 37 in 0c0edff
@leslieluyu
I just found that the issue was mainly caused by LOGFLAG=True, which puts the retriever under heavy load.
Could you share some numbers so we can get an idea of how LOGFLAG affects performance?
The regression in 1.2 should be root-caused and analyzed.
@leslieluyu
Compared to the LLM and reranking, the retriever consumes little compute and memory (both capacity and bandwidth). For the performance regression, we need to double-check the retriever log. If requests are blocked by the retriever, the likely causes are sync vs. async I/O, timeouts, etc. @leslieluyu
Would you please share the retriever container log? If it is not easy to separate the container logs, you can post all the logs here.
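To illustrate the sync vs. async I/O hypothesis, here is a minimal sketch. It is not taken from the retriever code: `blocking_log`, `handler_blocking`, and `handler_offloaded` are hypothetical names, and the 50 ms sleep is only a stand-in for synchronous logging or any other blocking call. It shows how a blocking call inside an async handler serializes concurrent requests, while offloading it to a thread does not.

```python
import asyncio
import time


def blocking_log(message: str) -> None:
    # Hypothetical stand-in for synchronous logging or other blocking I/O.
    time.sleep(0.05)


async def handler_blocking(i: int) -> int:
    # The blocking call runs on the event loop thread, so every other
    # in-flight request has to wait until it returns.
    blocking_log(f"request {i}")
    return i


async def handler_offloaded(i: int) -> int:
    # The same call, moved to a worker thread so the event loop stays free.
    await asyncio.to_thread(blocking_log, f"request {i}")
    return i


async def run(handler, concurrency: int) -> float:
    # Launch `concurrency` handlers at once and return the wall-clock time.
    start = time.perf_counter()
    await asyncio.gather(*(handler(i) for i in range(concurrency)))
    return time.perf_counter() - start


def compare(concurrency: int = 8) -> tuple[float, float]:
    blocked = asyncio.run(run(handler_blocking, concurrency))
    offloaded = asyncio.run(run(handler_offloaded, concurrency))
    return blocked, offloaded


if __name__ == "__main__":
    blocked, offloaded = compare(8)
    print(f"blocking: {blocked:.2f}s, offloaded: {offloaded:.2f}s")
```

In the blocking variant the total time grows roughly linearly with concurrency, so client-side timeouts become more likely as load rises. Whether this is what actually happens in the retriever can only be confirmed from its logs.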
Priority
P2-High
OS type
Ubuntu
Hardware type
Gaudi2
Installation method
Deploy method
Running nodes
Single Node
What's the version?
chatqna v1.2 norerank
chatqna-chatqna-ui-695995789c-sz67q opea/chatqna-ui:1.2
chatqna-data-prep-67f484b58f-xwvct opea/dataprep:1.2
chatqna-db8987c4c-slm6z opea/chatqna-without-rerank:1.2
chatqna-nginx-6d9df4b75b-swts2 opea/nginx:latest
chatqna-redis-vector-db-66c94f7fc5-csfx2 redis/redis-stack:7.2.0-v9
chatqna-retriever-usvc-5b64ff97c8-4fkd9 opea/retriever:1.2
chatqna-tei-7fc4845868-lr2wx ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
chatqna-tgi-f5fc79849-bhrsk ghcr.io/huggingface/tgi-gaudi:2.3.1
chatqna-tgi-f5fc79849-fv48f ghcr.io/huggingface/tgi-gaudi:2.3.1
chatqna-tgi-f5fc79849-jdmwj ghcr.io/huggingface/tgi-gaudi:2.3.1
chatqna-tgi-f5fc79849-jwsxb ghcr.io/huggingface/tgi-gaudi:2.3.1
chatqna-tgi-f5fc79849-nhkdj ghcr.io/huggingface/tgi-gaudi:2.3.1
chatqna-tgi-f5fc79849-q5glp ghcr.io/huggingface/tgi-gaudi:2.3.1
chatqna-tgi-f5fc79849-td5lk ghcr.io/huggingface/tgi-gaudi:2.3.1
chatqna-tgi-f5fc79849-zxtr9 ghcr.io/huggingface/tgi-gaudi:2.3.1
Description
An error occurs under heavy load.
When using the benchmark to collect performance data, the error message "can't connect to retriever" (see message below) appears when concurrency exceeds 32.
There are no errors when concurrency is 16 or below (1, 2, 4, 8, 16).
This only occurs in version 1.2; it did not occur in previous versions (v1.1, v1.0, v0.8, etc.).
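For anyone trying to narrow down the threshold, a concurrency sweep along these lines may help. This is only a sketch and not the benchmark tool used above: `sweep` takes any coroutine function that sends one request, and `FakeRetriever` is a hypothetical stand-in whose in-flight limit of 32 is purely illustrative. In a real run, `send` would call the ChatQnA endpoint instead.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Sequence


async def sweep(
    send: Callable[[], Awaitable[None]],
    levels: Sequence[int] = (1, 2, 4, 8, 16, 32, 64),
) -> Dict[int, int]:
    # For each concurrency level, fire that many requests at once and
    # count how many raised an exception.
    errors: Dict[int, int] = {}
    for level in levels:
        results = await asyncio.gather(
            *(send() for _ in range(level)), return_exceptions=True
        )
        errors[level] = sum(isinstance(r, Exception) for r in results)
    return errors


class FakeRetriever:
    # Hypothetical service that refuses requests beyond `limit` in flight.
    def __init__(self, limit: int = 32) -> None:
        self.limit = limit
        self.in_flight = 0

    async def send(self) -> None:
        self.in_flight += 1
        try:
            if self.in_flight > self.limit:
                raise ConnectionError("can't connect to retriever")
            await asyncio.sleep(0.01)  # simulated request latency
        finally:
            self.in_flight -= 1


if __name__ == "__main__":
    fake = FakeRetriever(limit=32)
    print(asyncio.run(sweep(fake.send)))
```

Reporting the per-level error counts with LOGFLAG enabled and disabled would make the comparison with earlier versions easier.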
Reproduce steps
Raw log
Attachments
No response