Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pkg/ottl] Update statements to reflect their context #29017

Open
Tracked by #28892
TylerHelmuth opened this issue Nov 7, 2023 · 9 comments
Open
Tracked by #28892

[pkg/ottl] Update statements to reflect their context #29017

TylerHelmuth opened this issue Nov 7, 2023 · 9 comments
Assignees
Labels
never stale Issues marked with this label will be never staled and automatically removed pkg/ottl priority:p2 Medium

Comments

@TylerHelmuth
Copy link
Member

Component(s)

pkg/ottl

Is your feature request related to a problem? Please describe.

Currently OTTL statements are independent of any context. Users must tell OTTL the context in which the statement is should be run.

This results in OTTL statements not being directly portable between components. Unless the config implementing OTTL is the same between components, the OTTL statement may not be able to be copied from one component to another.

If the OTTL statement expresses the context itself then there is no ambiguity and the statement would mean the same thing in any component.

Describe the solution you'd like

Update OTTL statements to express the context, probably via the path names. When parsing the statement, infer the context to be the lowest context defined in the statement.

The Parser needs updated to be told which contexts to consider. To start, one implementation would be to try parsing the statement in each context, starting with the "highest" context. The first context to pass parsing is used.

Describe alternatives you've considered

No response

Additional context

No response

@TylerHelmuth TylerHelmuth added enhancement New feature or request needs triage New item requiring triage priority:p2 Medium pkg/ottl and removed enhancement New feature or request needs triage New item requiring triage labels Nov 7, 2023
Copy link
Contributor

github-actions bot commented Jan 8, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Jan 8, 2024
@TylerHelmuth TylerHelmuth added never stale Issues marked with this label will be never staled and automatically removed and removed Stale labels Jan 8, 2024
@jsuereth
Copy link
Contributor

For this issue, I was thinking of proposing the following syntax:

on <context>
(where <expr>) ?
(yield <expr> | drop)

You could then allow statements to either be what they are today, or this new syntax which carries the context.

Example:

on metric
where metric.attributes["environment"] == "qa"
drop

Note: the yield syntax is part of a larger proposal. For now you could ignore that instead just have the last part of the grammar be existing statements like Route(), Drop(), set(...).

@TylerHelmuth
Copy link
Member Author

TylerHelmuth commented Jan 26, 2024

@jsuereth for context, during a working session at Kubecon we imagine the solution would be to include the context in all paths. So something like:

transform:
  error_mode: ignore
  trace_statements:
    - context: resource
      statements:
        - keep_keys(attributes, ["service.name", "service.namespace", "cloud.region", "process.command_line"])
    - context: span
      statements:
        - set(status.code, 1) where attributes["http.path"] == "/health"

becomes

transform:
  error_mode: ignore
  traces_statements:
      - keep_keys(resource.attributes, ["service.name", "service.namespace", "cloud.region", "process.command_line"])
      - set(span.status.code, 1) where span.attributes["http.path"] == "/health"

And the parser could infer the context based on the "lowest" specified context in the statement. A big benefit to this solution is the ability to build the new statements from existing config. I believe there is a way to implement this solution without incurring a breaking change, leaving both configurations available until OTTL goes stable.

@edmocosta
Copy link
Contributor

Hi @TylerHelmuth!

I'm not sure if you already have an implementation in mind, but I did prototype a solution for this issue locally, and it would be nice to discuss it with you. The main changes I'm proposing would be:

Transform Processor Configuration:

For supporting both configuration styles, we could implement a custom Unmarshal function for the transformprocessor.Config that would work for flat, structured and mixed configuration styles. A few details:

  • No transformprocessor.Config structure changes, statements would continue to use the same struct, but letting Context and Conditions empty when configuring using the flat format.
  • For flat configurations, each statement line becomes an individual common.ContextStatements object, so we can infer their context's name later on the parser, without the need of grouping them by context, for example.
  • Only the confmap.Conf would be manipulated (simulatingContextStatements and merging the config), the parsing itself would continue to be done by mapstructure.

Example of mixed configuration that would also work with the spiked Unmarshal function:

trace_statements:
  - set(span.status.code, 1) where span.attributes["http.path"] == "/health"
  - set(span.status.code, 1) where span.attributes["http.path"] == "/health"
  - context: spanevent
     statements:
       - set(attributes["domain"], "test.com")

OTTL contexts:

Given the context name would become part of the path, all OTTL contexts parsePath functions would need to be changed to ignore it, for example, the ottllog parsePath should contain an extra condition comparing the first path.Name() with the context name, skipping it to path.Next() if it matches.

An alternative to skipping the path's context would be changing the grammar to extract the context name to another field, which demonstrated to be much more complex giving that valid contexts names vary depending on the parent's context. In addition to that, It wouldn't be useful to infer the context, as the transform processor needs to know it beforehand to then execute the proper parser.

Expressing that on participle is also challenging. If we had a different syntax in mind, e.g. the suggested on metric .., it probably would be easier.

That extra context condition wouldn't affect other processor's OTTL usages (routingprocessor, filterprocessor, etc). I couldn't find any context/field name conflicts at this time (e.g. log context supporting log field), so it wouldn't skip anything for existing usages.

Transform processor:

*.ParseContextStatements functions should be changed to infer the context when necessary (.Context == ""),
It could be done by trying to parse it in each context or be using some shared inferrer implementation.

Considering we know at parsing time which contexts names are valid, and that each flat configuration statement line becomes a separate common.ContextStatements object, this inferrer implementation could just regex the valid contexts names from the statements, picking the highest one. For example:

For the given configuration:

trace_statements:
 - keep_keys(resource.attributes, ["service.name", "service.namespace", "cloud.region", "process.command_line"])
 - set(span.status.code, 1) where spanevent.attributes["http.path"] == "/health"

It would use pre-compiled regexes (per statement type) to find all valid context names on the statement. For this trace_statements example, it would use something like:

(?:[^"])(?P<context>span|spanevent|resource|scope)(?:\.(?P<field>\w+))

Given the above regex captured context names, it would infer the resource context for the first statement, and spanevent for the second one, as spanevent > span.

Trying parsing the statement in each context would probably be easier & safer, but in terms of performance, I think the regex-based infer mechanism - although not optimal - would be faster depending on the statement. For example, parsing a statement with a lower context set(span.status.code, 1) where span.attributes["http.path"] == "/health" would perform as:

BenchmarkParseContextStatements-10                      16578	    72180 ns/op	   55083 B/op	     729 allocs/op
BenchmarkParseContextStatementsRegexInferred-10         16192	    73589 ns/op	   55571 B/op	     736 allocs/op
BenchmarkParseContextStatementsTwoCtxAttempts-10         8635	   135392 ns/op	   93686 B/op	    1292 allocs/op

As you can see on the benchmark, the regex has an overhead, but it's still much smaller than trying to parse the statement in another context (spanevent, then span), unless it succeeds at the first attempt, but I'm not sure how common that would be. WDYT?


As I mentioned, I've ran this proposal code locally and it seems to be working fine so far, I'm still testing it, and I'm probably missing some points. It would be great if you could give it a thought and share your opinion. I'd be happy to implement any other idea you might have in mind.

Thank you!

@TylerHelmuth
Copy link
Member Author

@edmocosta thanks for the excellent investigation into this issue.

An alternative to skipping the path's context would be changing the grammar to extract the context name to another field

The original discussion we had at Kubecon NA 2023 was that we do want to build this concept into the grammar. We liked this idea because it makes OTTL syntax consistent everywhere. One of the problems today is that an OTTL statement (or condition) cannot be copy and pasted from 1 component to another unless you're use the context is the same. There is also nothing enforcing to components express the desired context in the same way. Requiring the context as part of the grammar would solve that problem.

Another reason we liked this idea is it made paths more readable. When the current context is defined somewhere else in the config (or even the statement), then a reader must remember what the current context is when reading a path that has not prefix. But if the context must always prefix a path, then there is no ambiguity.

As for the technical implementation I believe it is possible to add all this business logic somewhere in the parser/ottlcontext and not the grammar itself. I believe our idea was something like "give a path has been parsed, the first part MUST be the context. If a path is only 1 part, it is invalid", or something like that.

An actual change to the grammar is probably a better solution bc we could enforce all paths have a context and capture it in its own field

type path struct {
        Context string `parser:"@Lowercase '.'"`
	Fields []field `parser:"@@ ( '.' @@ )*"`
}

Or something like that.

In addition to that, It wouldn't be useful to infer the context, as the transform processor needs to know it beforehand to then execute the proper parser.

I believe this issue could be solved by adding some more features to OTTL's package. At Kubecon we discussed a brute force solution (because it happens at startup) where the parser infers the context to be the "lowest" context defined in the statement. In addition, the parser would need to be told which contexts to consider (and their order), and it would try parsing at the "highest" context, returning the parsed statements for the fist context that was able to successfully parse all the statements. It would probably look pretty similar to the transformprocessor's existing ParserCollection concept.

Using regex to identify the concept also seems reasonable, but I believe it should be a OTTL concern, not a component concern. Performance will end up being a secondary concern since this is all happening at startup. I like the concept of not needing to brute force to check which context should be used.

Example of mixed configuration that would also work with the spiked Unmarshal function:

I really like the sound of this idea, at least for some amount of transition time. Our goal with this change was to try and not break existing OTTL users - we figured we'd be able to build the new statements using the existing config structs. It sounds like you've confirmed this is possible.

As for breaking changes in OTTL's public API, we can do those all we want.

If you're interested in continuing this work, I am very interested in seeing what it is like to implement a solution via the grammar instead of via the transformprocessor.

@edmocosta
Copy link
Contributor

Hi @TylerHelmuth!

As for the technical implementation I believe it is possible to add all this business logic somewhere in the parser/ottlcontext and not the grammar itself. I believe our idea was something like "give a path has been parsed, the first part MUST be the context. If a path is only 1 part, it is invalid", or something like that.

An actual change to the grammar is probably a better solution bc we could enforce all paths have a context and capture it in its own field

Yes, I completely agree with that idea, it's simpler to read and to understand and I think that's the way to go.
Regarding the technical details, my only concern on that is backward compatibility, if we capture/enforce the first path's segment to be the context name, it would probably break existing configurations, as paths like attributes["foo"] and status.code would be considered invalid, failing on the validation, and/or being parsed as {Context: "status", Name: "code", Next: <nil>} (causing invalid path segment errors on the context's parsePath functions).

That's definitely not a blocker for implementing the proposed solution in a BC way, during the transition time, we could maybe handle it on the parser, and instead of using the valid contexts list to validate the path prefix, we could use it to temporary keep BC, changing the parser newPath method adding something like:

fields := path.Fields
...
if path.Context != "" {
  if _, ok := p.validContexts[path.Context]; !ok {
    // if the context is not recognised, it's likely to be part of the field name (old structure without context).
    // when the transition is complete, we can just return an error here.
    fields = append([]field{{Name: path.Context}}, fields...)

    // maybe log some warn/deprecation message about paths without context?
  }
}

Pros of changing this function instead of adding the logic into the ottlcontexts are backward compatibility from the source (parser), no need to change the OTTL context's parsePath functions, and easily switch it to require/validate the path context name when the transition time is considered complete.

On the other other hand, the components consuming those paths would still need to handle both versions, with and without the context prefix. No validations would happen at the transition time.

Do you have any thoughts on this and how it should be handled?

It would probably look pretty similar to the transformprocessor's existing ParserCollection concept.

I see, that makes sense! As you mentioned, I think we could come up with a very similar ParserCollection implementation in the ottl package (ottl/parsercollection perhaps), but a bit more generic, supporting parser collections for logs, metrics and traces contexts, and handling the context inferring when necessary. It could also be used to replace all the existing transformprocessor's ParserCollections.

I'm not sure yet if we can implement it totally generic, without any dependency on the ottlcontexts for example (probably not). Either way, IMO, having the ottlcontext as a dependency wouldn't be an issue considering the parser collection's purpose. I'll investigate it further during the implementation.

Using regex to identify the concept also seems reasonable, but I believe it should be a OTTL concern, not a component concern. Performance will end up being a secondary concern since this is all happening at startup. I like the concept of not needing to brute force to check which context should be used.

Agreed.

I really like the sound of this idea, at least for some amount of transition time. Our goal with this change was to try and not break existing OTTL users - we figured we'd be able to build the new statements using the existing config structs. It sounds like you've confirmed this is possible.

Correct! that's possible.


Please let me know your thoughts!

Thanks!

@TylerHelmuth
Copy link
Member Author

it would probably break existing configurations, as paths like attributes["foo"] and status.code would be considered invalid, failing on the validation, and/or being parsed as {Context: "status", Name: "code", Next: } (causing invalid path segment errors on the context's parsePath functions).

I agree that a change in the grammar would be a breaking change for parsing a statement like set(attributes["foo"], "bar"). I think a way around this is for component that are using OTTL to modify the user's configured string using the configured context. So for the transform processor, if a user configured

transform:
  trace_statements:
    - context: span
      statements:
        - set(attributes["test"], "pass")

Then the string passed into the parser would be changed to set(span.attributes["test"], "pass"). I believe another change we'd need is to create some sort of wrapper that would obfuscate the Parser types from callers. So instead of using the context value to determine that ottlspan.NewParser needs to be invoked, we'd pass the statements to the wrapper and it would figure out which parser needs invoked.

If we can modify the input string using the configured context then I believe we can do this change with no breaking changes to user config. There will definitely be breaking changes to OTTL APIs in places, but we aren't worried about breaking those.

With this strategy components can then expose support for both the old style and the new style allowing a transition period. The component could even print out the modified statements as a way to help users transition.

An additional idea to play around with would be making the context selector in the grammar optional. This may allow both old and new syntax to be parsed by the same grammar, allowing us to make a bunch of additions/changes to pkg/ottl's API without also needing to update all the components in the same PR. The more work we can do in small batches the better (but that typically has been difficult when breaking OTTL's grammar/API).

Pros of changing this function instead of adding the logic into the ottlcontexts are backward compatibility from the source (parser), no need to change the OTTL context's parsePath functions, and easily switch it to require/validate the path context name when the transition time is considered complete.

I would definitely love to avoid having to change parsePath functions. I think adding the context selector to the grammar allows us to keep that business logic in the parser wrapper helper thing and the contexts can continue to focus only on the path.

@edmocosta
Copy link
Contributor

Hi @TylerHelmuth!

Thank you again for your patience and support, I'm still getting up to speed with the OTel codebase/way of working/standards/etc, so apologise for so many questions and perhaps suggestions that might not make much sense :)

An additional idea to play around with would be making the context selector in the grammar optional. This may allow both old and new syntax to be parsed by the same grammar, allowing us to make a bunch of additions/changes to pkg/ottl's API without also needing to update all the components in the same PR.

I forgot to mention it, but that was the idea behind my previous suggestion. If we make the Context grammar selector optional by adding it as following:

// path represents a telemetry path mathExpression.
type path struct {
	Context string  `parser:"(@Lowercase '.')?"`
	Fields  []field `parser:"@@ ( '.' @@ )*"`
}

We would be able to make it non-breaking to all components - without modifying them - by changing the ottl.Parser[k].newPath function as following:

func (p *Parser[K]) newPath(path *path) (*basePath[K], error) {
	fields := path.Fields
	...
	if path.Context != "" {
	   if p.validContexts == nil || len(p.validPathContexts) == 0 {
              fields = append([]field{{Name: path.Context}}, fields...)
	   } else {
	      if _, ok := p.validContexts[path.Context]; !ok {
		return nil, fmt.Errorf("invalid path context %s.%s: %[1]s", path.Context, originalText)
	     }
	   }
        }
        ...
}

Given that all existing ottl.Parser[K] usages are not setting the parser option WithValidContexts (to-be-added), it would automatically fallback to the previous behaviour, with the grammar's extracted Context value as the first path segment.

I might be wrong, but I think this should be enough to keep the parser and grammar backward compatible, and we could introduce these changes in an individual PR.


The next step would be building the statements rewrite utility, so components would be able to change their config text adding the missing path's context. I'm still unsure how this utility would be implemented: using regex then appending the context perhaps, or maybe parsing the statements text into the grammar AST, changing the paths, and then parsing it back to string (somehow). Any ideas on this?

Alternatively to the rewrite utility, when I mentioned

No validations would happen at the transition time.

I meant (coded like this to compare to the above version):

func (p *Parser[K]) newPath(path *path) (*basePath[K], error) {
	fields := path.Fields
	...
	if path.Context != "" {
	   if p.validContexts == nil || len(p.validContexts) == 0 {
              fields = append([]field{{Name: path.Context}}, fields...)
	   } else {
	      if _, ok := p.validContexts[path.Context]; !ok {
-		  return nil, fmt.Errorf("invalid path context %s.%s: %[1]s", path.Context, originalText)
+                 fields = append([]field{{Name: path.Context}}, fields...)
	     }
	   }
        }
        ...
}

This change would make the parser support both versions, with and without the path's context prefix. When we finish migrating everything, we could change it back to the validation mode.

Please note that the p.validContexts would normally hold a single context name, for example, the ottllog Parser would set it to log, ottlspan to span, and so on.


Finally, we would need to add the parser wrapper implementation, that would infer the context from the statements, pick the right parser, and execute the parse statements operation.

At this point, we could introduce an extra parser config for complementing the WithValidContexts option, enabling the validation mode on demand, and instead of falling back to the previous behaviour as suggested above, enforcing all paths to have a valid context. For example:

p, err := ottl.NewParser[ottllog.TransformContext](
   	functions,
   	pep.parsePath,
   	telemetrySettings,
   	ottl.WithEnumParser[ottllog.TransformContext](parseEnum),
   	ottl.WithValidContexts[ottllog.TransformContext]([]string{"log"})
   	ottl.RequirePathsContext[ottllog.TransformContext](),
   )

Last change would be on the transformprocessor, adapating the configuration as discussed, and using the new parsers wrapper and the statement rewrite utilities.

@TylerHelmuth
Copy link
Member Author

I might be wrong, but I think this should be enough to keep the parser and grammar backward compatible, and we could introduce these changes in an individual PR.

I am definitely interested in trying this approach.

The next step would be building the statements rewrite utility, so components would be able to change their config text adding the missing path's context. I'm still unsure how this utility would be implemented: using regex then appending the context perhaps, or maybe parsing the statements text into the grammar AST, changing the paths, and then parsing it back to string (somehow). Any ideas on this?

I think it is solvable using regex and some assumptions. Like if we know the context configured is span, then any path that doesn't start with reasource or instrumentation_scope needs span appended to it. I think a regex to identify a path in a statement should be doable.

This change would make the parser support both versions, with and without the path's context prefix. When we finish migrating everything, we could change it back to the validation mode.

I think this would also work, but I'd really like to be able to print out for the user what the new statement syntax should be to help them transition. Maybe the ottlcontext could do that instead of the transformprocessor manipulating the input and printing it?

Finally, we would need to add the parser wrapper implementation, that would infer the context from the statements, pick the right parser, and execute the parse statements operation.

This part is still the most hazy aspect of this change in my mind, but if you have and idea I'm looking forward to see it implemented.

@edmocosta at this point I'm interested in seeing an implementation - it doesn't need to pass tests and can include as many breaking changes as you want.

TylerHelmuth pushed a commit that referenced this issue Sep 6, 2024
…a path names (#34875)

**Description:** <Describe what has changed.>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->

This PR is part of the
#29017,
and introduces the necessary OTTL Grammar/Parser changes to make it
possible for statements express their context via path names.

This first iteration changes the grammar to parse the first `path`
segment as the `Context` name, for example, the path `foo.bar.field`
would be parsed as the following AST example: `{Context: "foo", Fields:
[{Name: "bar"}, {Name: "field"}]}`, instead of `{Fields: [{Name:
"foo"},{Name: "bar"}, {Name: "field"}]}`.

To make it non-breaking to all components during the transition time and
development, this PR also changes the `ottl.Parser[k].newPath` function
to, by default, fall this new behavior back to the previous one, using
the grammar's parsed `Context` value as the first path segment
(`Fields[0]`).

Two new `Parser[K]` options were added, `WithPathContextNames` and
`WithPathContextNameValidation`, The first option will be used to enable
the statements context support, parsing the first path segment as
Context when the value maches any configured context names, or falling
it back otherwise. The second option will be used to enable the
validation and turning off the backward compatible mode, raising an
error when paths have no context prefix, or when it's invalid.

For more details, please check the
#29017
discussion out.

**Link to tracking Issue:**
#29017

**Testing:** 
- Unit tests

**Documentation:**
- No documentation changes were made at this point.
f7o pushed a commit to f7o/opentelemetry-collector-contrib that referenced this issue Sep 12, 2024
…a path names (open-telemetry#34875)

**Description:** <Describe what has changed.>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->

This PR is part of the
open-telemetry#29017,
and introduces the necessary OTTL Grammar/Parser changes to make it
possible for statements express their context via path names.

This first iteration changes the grammar to parse the first `path`
segment as the `Context` name, for example, the path `foo.bar.field`
would be parsed as the following AST example: `{Context: "foo", Fields:
[{Name: "bar"}, {Name: "field"}]}`, instead of `{Fields: [{Name:
"foo"},{Name: "bar"}, {Name: "field"}]}`.

To make it non-breaking to all components during the transition time and
development, this PR also changes the `ottl.Parser[k].newPath` function
to, by default, fall this new behavior back to the previous one, using
the grammar's parsed `Context` value as the first path segment
(`Fields[0]`).

Two new `Parser[K]` options were added, `WithPathContextNames` and
`WithPathContextNameValidation`, The first option will be used to enable
the statements context support, parsing the first path segment as
Context when the value maches any configured context names, or falling
it back otherwise. The second option will be used to enable the
validation and turning off the backward compatible mode, raising an
error when paths have no context prefix, or when it's invalid.

For more details, please check the
open-telemetry#29017
discussion out.

**Link to tracking Issue:**
open-telemetry#29017

**Testing:** 
- Unit tests

**Documentation:**
- No documentation changes were made at this point.
jriguera pushed a commit to springernature/opentelemetry-collector-contrib that referenced this issue Oct 4, 2024
…a path names (open-telemetry#34875)

**Description:** <Describe what has changed.>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->

This PR is part of the
open-telemetry#29017,
and introduces the necessary OTTL Grammar/Parser changes to make it
possible for statements express their context via path names.

This first iteration changes the grammar to parse the first `path`
segment as the `Context` name, for example, the path `foo.bar.field`
would be parsed as the following AST example: `{Context: "foo", Fields:
[{Name: "bar"}, {Name: "field"}]}`, instead of `{Fields: [{Name:
"foo"},{Name: "bar"}, {Name: "field"}]}`.

To make it non-breaking to all components during the transition time and
development, this PR also changes the `ottl.Parser[k].newPath` function
to, by default, fall this new behavior back to the previous one, using
the grammar's parsed `Context` value as the first path segment
(`Fields[0]`).

Two new `Parser[K]` options were added, `WithPathContextNames` and
`WithPathContextNameValidation`, The first option will be used to enable
the statements context support, parsing the first path segment as
Context when the value maches any configured context names, or falling
it back otherwise. The second option will be used to enable the
validation and turning off the backward compatible mode, raising an
error when paths have no context prefix, or when it's invalid.

For more details, please check the
open-telemetry#29017
discussion out.

**Link to tracking Issue:**
open-telemetry#29017

**Testing:** 
- Unit tests

**Documentation:**
- No documentation changes were made at this point.
evan-bradley pushed a commit that referenced this issue Oct 7, 2024
**Description:** <Describe what has changed.>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->

This PR is part of
#29017,
and adds a grammar utility to allow extracting `ottl.path` from parsed
statements and expressions. The new functions will power the context
inferrer utility (see
[draft](#35050)),
which implementation will basically gather all declared statement's
`Path.Context`, and choose the highest priority one.

For the statements rewriter utility purpose, it changes the `grammar.go`
file including a new field `Pos lexer.Position` into the `ottl.path`
struct. This value is automatically set by the `participle` lexer and
holds the path's offsets, being useful for identifying their positions
in the raw statement, without using regexes an being coupled to the
grammar's definition.

**Additional changes**:

This proposal uses the visitor pattern to gather all path's from the
parsed statements. Considering the grammar's custom error mechanism
(`checkForCustomError`) also works with a similar pattern, I've added
the visitor implementation as part of the grammar objects, and
refactored all `checkForCustomError` functions to be part of a validator
visitor. Differently of the current implementation, it joins and
displays all errors at once instead of one by one, IMO, it's specially
useful for statements with more than one error, for example: `set(name,
int(2), float(1))`.

*&ast;* We can change it back do be error-by-error if necessary.
*&ast;* If modifying the custom validator is not desired, I can roll
those changes back keeping only the necessary code to extract the path's
+ `Pos` field.

**Link to tracking Issue:**
#29017

**Testing:** Unit tests were added

**Documentation:** No changes
shalper2 pushed a commit to shalper2/opentelemetry-collector-contrib that referenced this issue Oct 7, 2024
…-telemetry#35174)

**Description:** <Describe what has changed.>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->

This PR is part of
open-telemetry#29017,
and adds a grammar utility to allow extracting `ottl.path` from parsed
statements and expressions. The new functions will power the context
inferrer utility (see
[draft](open-telemetry#35050)),
which implementation will basically gather all declared statement's
`Path.Context`, and choose the highest priority one.

For the statements rewriter utility purpose, it changes the `grammar.go`
file including a new field `Pos lexer.Position` into the `ottl.path`
struct. This value is automatically set by the `participle` lexer and
holds the path's offsets, being useful for identifying their positions
in the raw statement, without using regexes an being coupled to the
grammar's definition.

**Additional changes**:

This proposal uses the visitor pattern to gather all path's from the
parsed statements. Considering the grammar's custom error mechanism
(`checkForCustomError`) also works with a similar pattern, I've added
the visitor implementation as part of the grammar objects, and
refactored all `checkForCustomError` functions to be part of a validator
visitor. Differently of the current implementation, it joins and
displays all errors at once instead of one by one, IMO, it's specially
useful for statements with more than one error, for example: `set(name,
int(2), float(1))`.

*&ast;* We can change it back do be error-by-error if necessary.
*&ast;* If modifying the custom validator is not desired, I can roll
those changes back keeping only the necessary code to extract the path's
+ `Pos` field.

**Link to tracking Issue:**
open-telemetry#29017

**Testing:** Unit tests were added

**Documentation:** No changes
ghost pushed a commit to sematext/opentelemetry-collector-contrib that referenced this issue Oct 9, 2024
…-telemetry#35174)

**Description:** <Describe what has changed.>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->

This PR is part of
open-telemetry#29017,
and adds a grammar utility to allow extracting `ottl.path` from parsed
statements and expressions. The new functions will power the context
inferrer utility (see
[draft](open-telemetry#35050)),
which implementation will basically gather all declared statement's
`Path.Context`, and choose the highest priority one.

For the statements rewriter utility purpose, it changes the `grammar.go`
file including a new field `Pos lexer.Position` into the `ottl.path`
struct. This value is automatically set by the `participle` lexer and
holds the path's offsets, being useful for identifying their positions
in the raw statement, without using regexes an being coupled to the
grammar's definition.

**Additional changes**:

This proposal uses the visitor pattern to gather all path's from the
parsed statements. Considering the grammar's custom error mechanism
(`checkForCustomError`) also works with a similar pattern, I've added
the visitor implementation as part of the grammar objects, and
refactored all `checkForCustomError` functions to be part of a validator
visitor. Differently of the current implementation, it joins and
displays all errors at once instead of one by one, IMO, it's specially
useful for statements with more than one error, for example: `set(name,
int(2), float(1))`.

*&ast;* We can change it back do be error-by-error if necessary.
*&ast;* If modifying the custom validator is not desired, I can roll
those changes back keeping only the necessary code to extract the path's
+ `Pos` field.

**Link to tracking Issue:**
open-telemetry#29017

**Testing:** Unit tests were added

**Documentation:** No changes
TylerHelmuth pushed a commit that referenced this issue Oct 31, 2024
… paths context (#35716)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description

This PR is part of
#29017,
and adds the `ottl.Parser[K].AppendStatementPathsContext` function,
allowing components to rewrite statements appending missing `ottl.path`
context names.

For examples, the following context-less statement:

```
set(value, 1) where name == attributes["foo.name"]
```

Would be rewritten using the `span` context as:

```

set(span.value, 1) where span.name == span.attributes["foo.name"]
```

**Why do we need to rewrite statements?**

This utility will be used during the transition from structured OTTL
statements to flat statements.
Components such as the `transformprocessor` will leverage it to support
both configuration styles, without forcing
users to adapt/rewrite their existing config files. 

Once the component turns on the `ottl.Parser[K]` path's context
validation, new configuration style usages will be validated, requiring
all paths to have a context prefix, and old configuration styles will
automatically rewrite the statements using this function.

For more details, please have a look at the complete
[draft](#35050)
implementation.

<!-- Issue number (e.g. #1234) or full URL to issue, if applicable. -->
#### Link to tracking issue

#29017

<!--Describe what testing was performed and which tests were added.-->
#### Testing
Unit tests

<!--Describe the documentation added.-->
#### Documentation
No changes

<!--Please delete paragraphs that you did not use before submitting.-->
ArthurSens pushed a commit to ArthurSens/opentelemetry-collector-contrib that referenced this issue Nov 4, 2024
… paths context (open-telemetry#35716)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description

This PR is part of
open-telemetry#29017,
and adds the `ottl.Parser[K].AppendStatementPathsContext` function,
allowing components to rewrite statements appending missing `ottl.path`
context names.

For examples, the following context-less statement:

```
set(value, 1) where name == attributes["foo.name"]
```

Would be rewritten using the `span` context as:

```

set(span.value, 1) where span.name == span.attributes["foo.name"]
```

**Why do we need to rewrite statements?**

This utility will be used during the transition from structured OTTL
statements to flat statements.
Components such as the `transformprocessor` will leverage it to support
both configuration styles, without forcing
users to adapt/rewrite their existing config files. 

Once the component turns on the `ottl.Parser[K]` path's context
validation, new configuration style usages will be validated, requiring
all paths to have a context prefix, and old configuration styles will
automatically rewrite the statements using this function.

For more details, please have a look at the complete
[draft](open-telemetry#35050)
implementation.

<!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. -->
#### Link to tracking issue

open-telemetry#29017

<!--Describe what testing was performed and which tests were added.-->
#### Testing
Unit tests

<!--Describe the documentation added.-->
#### Documentation
No changes

<!--Please delete paragraphs that you did not use before submitting.-->
sbylica-splunk pushed a commit to sbylica-splunk/opentelemetry-collector-contrib that referenced this issue Dec 17, 2024
… paths context (open-telemetry#35716)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description

This PR is part of
open-telemetry#29017,
and adds the `ottl.Parser[K].AppendStatementPathsContext` function,
allowing components to rewrite statements appending missing `ottl.path`
context names.

For examples, the following context-less statement:

```
set(value, 1) where name == attributes["foo.name"]
```

Would be rewritten using the `span` context as:

```

set(span.value, 1) where span.name == span.attributes["foo.name"]
```

**Why do we need to rewrite statements?**

This utility will be used during the transition from structured OTTL
statements to flat statements.
Components such as the `transformprocessor` will leverage it to support
both configuration styles, without forcing
users to adapt/rewrite their existing config files. 

Once the component turns on the `ottl.Parser[K]` path's context
validation, new configuration style usages will be validated, requiring
all paths to have a context prefix, and old configuration styles will
automatically rewrite the statements using this function.

For more details, please have a look at the complete
[draft](open-telemetry#35050)
implementation.

<!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. -->
#### Link to tracking issue

open-telemetry#29017

<!--Describe what testing was performed and which tests were added.-->
#### Testing
Unit tests

<!--Describe the documentation added.-->
#### Documentation
No changes

<!--Please delete paragraphs that you did not use before submitting.-->
TylerHelmuth pushed a commit that referenced this issue Dec 19, 2024
#36869)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description

This PR is part of
#29017,
and a spin-off from
#36820.
It changes the existing context inferrer logic to also take into
consideration the functions and enums used on the statements.
(#36820 (comment))

New logic:
- Find all `path`, function names(`editor`, `converter`), and
`enumSymbol` on the statements
- Pick the highest priority context (same existing logic) based on the
`path.Context` values
- If the chosen context does not support all used functions and enums,
it goes through it's lower contexts (wide scope contexts that does
support the chosen context as a path context) testing them and choosing
the first one that supports them.
- If no context that can handle the functions and enums is found, the
inference fail and an empty value is returned.


The parser collection was adapted to support the new context inferrer
configuration requirements.

**Other important changes:**

Currently, it's possible to have paths to contexts root objects, for
example: `set(attributes["body"], resource)`. Given `resource` has no
dot separators on the path, the grammar extracts it into the
`path.Fields` slice, letting the `path.Context` value empty. Why? This
grammar behaviour is still necessary to keep backward compatibility with
paths without context, otherwise it would start requiring contexts for
all paths independently of the parser configuration.

Previous PRs didn't take this edge case into consideration, and a few
places needed to be changed to address it:
 - Context inferrer (`getContextCandidate`)
 - Parser `prependContextToStatementPaths` function.
- Reusable OTTL contexts (`contexts/internal`) (not part of this PR, it
will be fixed by
#36820)

When/If we reach the point to deprecate paths _without_ context, all
those conditions can be removed, and the grammar changed to require and
extract the `path` context properly.
 
<!-- Issue number (e.g. #1234) or full URL to issue, if applicable. -->
#### Link to tracking issue

#29017

<!--Describe what testing was performed and which tests were added.-->
#### Testing
Unit tests

<!--Describe the documentation added.-->
#### Documentation
No changes

<!--Please delete paragraphs that you did not use before submitting.-->
AkhigbeEromo pushed a commit to sematext/opentelemetry-collector-contrib that referenced this issue Jan 13, 2025
open-telemetry#36869)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description

This PR is part of
open-telemetry#29017,
and a spin-off from
open-telemetry#36820.
It changes the existing context inferrer logic to also take into
consideration the functions and enums used on the statements.
(open-telemetry#36820 (comment))

New logic:
- Find all `path`, function names(`editor`, `converter`), and
`enumSymbol` on the statements
- Pick the highest priority context (same existing logic) based on the
`path.Context` values
- If the chosen context does not support all used functions and enums,
it goes through it's lower contexts (wide scope contexts that does
support the chosen context as a path context) testing them and choosing
the first one that supports them.
- If no context that can handle the functions and enums is found, the
inference fail and an empty value is returned.


The parser collection was adapted to support the new context inferrer
configuration requirements.

**Other important changes:**

Currently, it's possible to have paths to contexts root objects, for
example: `set(attributes["body"], resource)`. Given `resource` has no
dot separators on the path, the grammar extracts it into the
`path.Fields` slice, letting the `path.Context` value empty. Why? This
grammar behaviour is still necessary to keep backward compatibility with
paths without context, otherwise it would start requiring contexts for
all paths independently of the parser configuration.

Previous PRs didn't take this edge case into consideration, and a few
places needed to be changed to address it:
 - Context inferrer (`getContextCandidate`)
 - Parser `prependContextToStatementPaths` function.
- Reusable OTTL contexts (`contexts/internal`) (not part of this PR, it
will be fixed by
open-telemetry#36820)

When/If we reach the point to deprecate paths _without_ context, all
those conditions can be removed, and the grammar changed to require and
extract the `path` context properly.
 
<!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. -->
#### Link to tracking issue

open-telemetry#29017

<!--Describe what testing was performed and which tests were added.-->
#### Testing
Unit tests

<!--Describe the documentation added.-->
#### Documentation
No changes

<!--Please delete paragraphs that you did not use before submitting.-->
TylerHelmuth pushed a commit that referenced this issue Jan 16, 2025
…37271)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
The `transformprocessor` is currently using the
`internal/filter/filterottl` to create `conditions` expressions. In
order to support path's with context prefix, it needs to configure the
underline `ottl.Parser` using their `EnablePathContextNames` options.
This PR changes this internal API adding new functions that accepts the
parsers options as argument.


<!-- Issue number (e.g. #1234) or full URL to issue, if applicable. -->
#### Link to tracking issue
Related to
#29017

<!--Describe what testing was performed and which tests were added.-->
#### Testing
Unit tests

<!--Please delete paragraphs that you did not use before submitting.-->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
never stale Issues marked with this label will be never staled and automatically removed pkg/ottl priority:p2 Medium
Projects
None yet
Development

No branches or pull requests

3 participants