Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ESQL: Ask for correct field names if only LOOKUP/ENRICH fields are kept #120372

Draft
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

ivancea
Copy link
Contributor

@ivancea ivancea commented Jan 17, 2025

Fixes #120272

This is just a draft as the solution is quite bruteforce-y.

Some context on the bug

The issue occurs as follows, top-down (Read linked issue for context on the bug):

  • A query like FROM index | EVAL x = 1 | LOOKUP l ON x | KEEP lookup_field returns no rows. Exactly the same with ENRICH
  • It returns no rows because the FROM relationship is replaced by an EMPTY LocalRelationship
  • The EMPTY rel comes from the EsIndex having no concrete indices
  • It has no concrete indices because field-caps returns no results for that index+fields
  • The fields come from fieldNames(), which must return an existing field, or _index if there's no field to query
  • There's a logic in fieldNames() that says "If no fields, then use _index". It's not working, because the fields requested from LOOKUP appear in the list of fields

Potential solution

Givens:

  • We don't know which fields every lookup returns, so we can't substract them from the list
  • We would have to first make the field-caps for lookup
  • But the field-caps requets for lookup depends on the results of fieldNames()

Which means we will probably have to reorganize the EsqlSession.analyzedPlan() code to correctly trace every field.

Current solution

This PR's solution is a simple, lazy one. It works well apparently, and I suppose it wasn't done before to try to keep the fieldNames "exact". But since enrich&lookup, it wasn't exact anymore, and we were asking for enrich/lookup index fields indistinctly.

Consider this a temporary solution until analyzedPlan() and fieldNames() get a refactor

@ivancea ivancea added >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) auto-backport Automatically create backport pull requests when merged :Analytics/ES|QL AKA ESQL v9.0.0 v8.18.0 labels Jan 17, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @ivancea, I've created a changelog YAML for you.

Copy link
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that making fieldNames more accurate requires quite some work and would also increase the blast radius of the fix. That's because I believe fieldNames would have to go down the parsed plan, and whenever we encounter a LOOKUP we'd need to perform a field caps call before moving further down.

We had discussions about the required precision of fieldNames not too long ago between @astefan , @craigtaverner and myself. At the time, we agreed that asking for too many fields was generally okay. (Although it leads to larger than required field caps requests/responses.)

In that light, I believe the proposed fix here is very nice and contained. But I believe @astefan should have a good hard look at this.

Copy link
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we decide to go forward with this approach, it'll require a couple nice unit tests for the change to fieldNames - and we should try and be as creative as possible with these.

The two test classes are xpack/esql/session/IndexResolverFieldNamesTests.java and /xpack/esql/qa/rest/RequestIndexFilteringTestCase.java - you can also compare with the tests introduced in Craig's PR here.

@alex-spies alex-spies requested a review from astefan January 17, 2025 13:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.18.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ESQL: No rows with JOIN/ENRICH + EVAL + KEEP
3 participants