Elasticsearch compatibility: FAQ-style QA search_phase_execution_exception, script_score script returned an invalid score #483

aaronbriel · 2020-10-13T15:20:04Z

Describe the bug
When I attempt to call get_answers_via_similar_questions using FAQ-style QA an error is thrown. This occurs when I use two separate indices, one for QA-style and another for FAQ-style. I want to leverage a QA instance and an FAQ with their own separate indices for experimental purposes.

Error message

10/13/2020 06:43:06 - WARNING - elasticsearch -   POST http://localhost:9200/faq/_search [status:400 request:0.130s]
Traceback (most recent call last):
  File "chatbots/haystack/faqbot.py", line 214, in <module>
    main()
  File "chatbots/haystack/faqbot.py", line 186, in main
    faqbot.interact(question)
  File "chatbots/haystack/faqbot.py", line 57, in interact
    prediction = finder.get_answers_via_similar_questions(question=question, top_k_retriever=1)
  File "/Users/aaronbriel/chatbots/haystack/.venv/src/farm-haystack/haystack/finder.py", line 95, in get_answers_via_similar_questions
    documents = self.retriever.retrieve(question, top_k=top_k_retriever, filters=filters, index=index)
  File "/Users/aaronbriel/chatbots/haystack/.venv/src/farm-haystack/haystack/retriever/dense.py", line 327, in retrieve
    top_k=top_k, index=index)
  File "/Users/aaronbriel/chatbots/haystack/.venv/src/farm-haystack/haystack/document_store/elasticsearch.py", line 465, in query_by_embedding
    result = self.client.search(index=index, body=body, request_timeout=300)["hits"]["hits"]
  File "/Users/aaronbriel/chatbots/haystack/.venv/lib/python3.7/site-packages/elasticsearch/client/utils.py", line 152, in _wrapped
    return func(*args, params=params, headers=headers, **kwargs)
  File "/Users/aaronbriel/chatbots/haystack/.venv/lib/python3.7/site-packages/elasticsearch/client/__init__.py", line 1617, in search
    body=body,
  File "/Users/aaronbriel/chatbots/haystack/.venv/lib/python3.7/site-packages/elasticsearch/transport.py", line 392, in perform_request
    raise e
  File "/Users/aaronbriel/chatbots/haystack/.venv/lib/python3.7/site-packages/elasticsearch/transport.py", line 365, in perform_request
    timeout=timeout,
  File "/Users/aaronbriel/chatbots/haystack/.venv/lib/python3.7/site-packages/elasticsearch/connection/http_urllib3.py", line 269, in perform_request
    self._raise_error(response.status, raw_data)
  File "/Users/aaronbriel/chatbots/haystack/.venv/lib/python3.7/site-packages/elasticsearch/connection/base.py", line 301, in _raise_error
    status_code, error_message, additional_info
elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'script_score script returned an invalid score [-3.4487913] for doc [41]. Must be a non-negative score!')
make: *** [ask_faq] Error 1

Expected behavior
I expected the request to return results.

Additional context
I'm using code very similar to Tutorial4_FAQ_style_QA.ipynb and Tutorial1_Basic_QA_Pipeline.ipynb, except for one difference. I first store data in elasticsearch using 2 indices (using the same storage code for each tutorial above), once for the QA-style and then for the FAQ-style. Each has their own unique elasticsearch index.

When I run a query to the QA get_answers method the expected result is returned. However, when I run the query to the FAQ get_answers_via_similar_questions I see the error. Looking at #408 I checked the mapping of the FAQ index which seems correct:

{
   "faq":{
      "mappings":{
         "dynamic_templates":[
            {
               "strings":{
                  "path_match":"*",
                  "match_mapping_type":"string",
                  "mapping":{
                     "type":"keyword"
                  }
               }
            }
         ],
         "properties":{
            "answer_html":{
               "type":"keyword"
            },
            "category":{
               "type":"keyword"
            },
            "city":{
               "type":"keyword"
            },
            "country":{
               "type":"keyword"
            },
            "lang":{
               "type":"keyword"
            },
            "last_update":{
               "type":"keyword"
            },
            "link":{
               "type":"keyword"
            },
            "name":{
               "type":"keyword"
            },
            "question":{
               "type":"keyword"
            },
            "question_emb":{
               "type":"dense_vector",
               "dims":768
            },
            "region":{
               "type":"keyword"
            },
            "source":{
               "type":"keyword"
            },
            "text":{
               "type":"text"
            }
         }
      }
   }
}

To Reproduce

Instantiate QA ElasticsearchDocumentStore, ElasticsearchRetriever, FarmReader and store SQUAD style json using convert_files_to_dicts and document_store.write_documents(dicts) following Tutorial1_Basic_QA_Pipeline.ipynb but with custom index
Instantiate FAQ ElasticsearchDocumentStore and EmbeddingRetriever and store csv (faq_covidbert.csv) using convert_files_to_dicts and document_store.write_documents(dicts) following Tutorial4_FAQ_style_QA.ipynb but with custom index
Call get_answers_via_similar_questions with top_k_retriever=1

System:

OS:
GPU/CPU:
Haystack version (commit or version number): latest master branch
DocumentStore: ElasticsearchDocumentStore
Reader: None
Retriever: EmbeddingRetriever

The text was updated successfully, but these errors were encountered:

aaronbriel · 2020-10-13T16:33:47Z

I've discovered that the issue is due to the version of elasticsearch. In my docker-compose I was referencing version 7.9.2. When I reverted this to 7.6.2 the issue was resolved. So it looks like get_answers is compatible with 7.9.2 but get_answers_via_similar_questions is not.

aaronbriel · 2020-10-13T16:34:35Z

I will close this - I'm guessing you are aware of the compatibility issue...

lalitpagaria · 2020-10-19T16:08:31Z

@tholor @tanaysoni better to address this compatibility issue

tholor · 2020-10-19T16:20:18Z

Yes, let's fix it

aaronbriel added the type:bug Something isn't working label Oct 13, 2020

aaronbriel closed this as completed Oct 13, 2020

tholor reopened this Oct 19, 2020

tholor assigned tanaysoni Oct 20, 2020

tholor changed the title ~~FAQ-style QA search_phase_execution_exception, script_score script returned an invalid score~~ Elasticsearch compatibility: FAQ-style QA search_phase_execution_exception, script_score script returned an invalid score Oct 21, 2020

tholor added this to the #3 milestone Oct 21, 2020

tanaysoni mentioned this issue Oct 23, 2020

Fix scoring in Elasticsearch for dot product #517

Merged

tanaysoni closed this as completed in #517 Oct 23, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Elasticsearch compatibility: FAQ-style QA search_phase_execution_exception, script_score script returned an invalid score #483

Elasticsearch compatibility: FAQ-style QA search_phase_execution_exception, script_score script returned an invalid score #483

aaronbriel commented Oct 13, 2020

aaronbriel commented Oct 13, 2020

aaronbriel commented Oct 13, 2020

lalitpagaria commented Oct 19, 2020

tholor commented Oct 19, 2020

Elasticsearch compatibility: FAQ-style QA search_phase_execution_exception, script_score script returned an invalid score #483

Elasticsearch compatibility: FAQ-style QA search_phase_execution_exception, script_score script returned an invalid score #483

Comments

aaronbriel commented Oct 13, 2020

aaronbriel commented Oct 13, 2020

aaronbriel commented Oct 13, 2020

lalitpagaria commented Oct 19, 2020

tholor commented Oct 19, 2020