Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Elasticsearch compatibility: FAQ-style QA search_phase_execution_exception, script_score script returned an invalid score #483

Closed
aaronbriel opened this issue Oct 13, 2020 · 4 comments · Fixed by #517
Assignees
Labels
type:bug Something isn't working
Milestone

Comments

@aaronbriel
Copy link

Describe the bug
When I attempt to call get_answers_via_similar_questions using FAQ-style QA an error is thrown. This occurs when I use two separate indices, one for QA-style and another for FAQ-style. I want to leverage a QA instance and an FAQ with their own separate indices for experimental purposes.

Error message

10/13/2020 06:43:06 - WARNING - elasticsearch -   POST http://localhost:9200/faq/_search [status:400 request:0.130s]
Traceback (most recent call last):
  File "chatbots/haystack/faqbot.py", line 214, in <module>
    main()
  File "chatbots/haystack/faqbot.py", line 186, in main
    faqbot.interact(question)
  File "chatbots/haystack/faqbot.py", line 57, in interact
    prediction = finder.get_answers_via_similar_questions(question=question, top_k_retriever=1)
  File "/Users/aaronbriel/chatbots/haystack/.venv/src/farm-haystack/haystack/finder.py", line 95, in get_answers_via_similar_questions
    documents = self.retriever.retrieve(question, top_k=top_k_retriever, filters=filters, index=index)
  File "/Users/aaronbriel/chatbots/haystack/.venv/src/farm-haystack/haystack/retriever/dense.py", line 327, in retrieve
    top_k=top_k, index=index)
  File "/Users/aaronbriel/chatbots/haystack/.venv/src/farm-haystack/haystack/document_store/elasticsearch.py", line 465, in query_by_embedding
    result = self.client.search(index=index, body=body, request_timeout=300)["hits"]["hits"]
  File "/Users/aaronbriel/chatbots/haystack/.venv/lib/python3.7/site-packages/elasticsearch/client/utils.py", line 152, in _wrapped
    return func(*args, params=params, headers=headers, **kwargs)
  File "/Users/aaronbriel/chatbots/haystack/.venv/lib/python3.7/site-packages/elasticsearch/client/__init__.py", line 1617, in search
    body=body,
  File "/Users/aaronbriel/chatbots/haystack/.venv/lib/python3.7/site-packages/elasticsearch/transport.py", line 392, in perform_request
    raise e
  File "/Users/aaronbriel/chatbots/haystack/.venv/lib/python3.7/site-packages/elasticsearch/transport.py", line 365, in perform_request
    timeout=timeout,
  File "/Users/aaronbriel/chatbots/haystack/.venv/lib/python3.7/site-packages/elasticsearch/connection/http_urllib3.py", line 269, in perform_request
    self._raise_error(response.status, raw_data)
  File "/Users/aaronbriel/chatbots/haystack/.venv/lib/python3.7/site-packages/elasticsearch/connection/base.py", line 301, in _raise_error
    status_code, error_message, additional_info
elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'script_score script returned an invalid score [-3.4487913] for doc [41]. Must be a non-negative score!')
make: *** [ask_faq] Error 1

Expected behavior
I expected the request to return results.

Additional context
I'm using code very similar to Tutorial4_FAQ_style_QA.ipynb and Tutorial1_Basic_QA_Pipeline.ipynb, except for one difference. I first store data in elasticsearch using 2 indices (using the same storage code for each tutorial above), once for the QA-style and then for the FAQ-style. Each has their own unique elasticsearch index.

When I run a query to the QA get_answers method the expected result is returned. However, when I run the query to the FAQ get_answers_via_similar_questions I see the error. Looking at #408 I checked the mapping of the FAQ index which seems correct:

{
   "faq":{
      "mappings":{
         "dynamic_templates":[
            {
               "strings":{
                  "path_match":"*",
                  "match_mapping_type":"string",
                  "mapping":{
                     "type":"keyword"
                  }
               }
            }
         ],
         "properties":{
            "answer_html":{
               "type":"keyword"
            },
            "category":{
               "type":"keyword"
            },
            "city":{
               "type":"keyword"
            },
            "country":{
               "type":"keyword"
            },
            "lang":{
               "type":"keyword"
            },
            "last_update":{
               "type":"keyword"
            },
            "link":{
               "type":"keyword"
            },
            "name":{
               "type":"keyword"
            },
            "question":{
               "type":"keyword"
            },
            "question_emb":{
               "type":"dense_vector",
               "dims":768
            },
            "region":{
               "type":"keyword"
            },
            "source":{
               "type":"keyword"
            },
            "text":{
               "type":"text"
            }
         }
      }
   }
}

To Reproduce

  1. Instantiate QA ElasticsearchDocumentStore, ElasticsearchRetriever, FarmReader and store SQUAD style json using convert_files_to_dicts and document_store.write_documents(dicts) following Tutorial1_Basic_QA_Pipeline.ipynb but with custom index
  2. Instantiate FAQ ElasticsearchDocumentStore and EmbeddingRetriever and store csv (faq_covidbert.csv) using convert_files_to_dicts and document_store.write_documents(dicts) following Tutorial4_FAQ_style_QA.ipynb but with custom index
  3. Call get_answers_via_similar_questions with top_k_retriever=1

System:

  • OS:
  • GPU/CPU:
  • Haystack version (commit or version number): latest master branch
  • DocumentStore: ElasticsearchDocumentStore
  • Reader: None
  • Retriever: EmbeddingRetriever
@aaronbriel aaronbriel added the type:bug Something isn't working label Oct 13, 2020
@aaronbriel
Copy link
Author

I've discovered that the issue is due to the version of elasticsearch. In my docker-compose I was referencing version 7.9.2. When I reverted this to 7.6.2 the issue was resolved. So it looks like get_answers is compatible with 7.9.2 but get_answers_via_similar_questions is not.

@aaronbriel
Copy link
Author

I will close this - I'm guessing you are aware of the compatibility issue...

@lalitpagaria
Copy link
Contributor

@tholor @tanaysoni better to address this compatibility issue

@tholor
Copy link
Member

tholor commented Oct 19, 2020

Yes, let's fix it

@tholor tholor reopened this Oct 19, 2020
@tholor tholor changed the title FAQ-style QA search_phase_execution_exception, script_score script returned an invalid score Elasticsearch compatibility: FAQ-style QA search_phase_execution_exception, script_score script returned an invalid score Oct 21, 2020
@tholor tholor added this to the #3 milestone Oct 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants