-
Notifications
You must be signed in to change notification settings - Fork 896
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Logs: Allow the registered processor to be an independent pipeline #4010
Comments
CC @open-telemetry/specs-logs-approvers |
I think it is designed in the same way in Rust Logs SDK (alpha status): /~https://github.com/open-telemetry/opentelemetry-rust/blob/main/opentelemetry-sdk/src/logs/log_processor.rs @open-telemetry/rust-maintainers, can you please confirm? |
In the current otel-rust implementation, LogRecord copy/clone is passed to each processor, and they can modify it individually without affecting the other processor. However, there are some discussions to change it to the mutable reference to enable one processor result to be passed to another. |
And to add, the |
This is also one of the reasons behind the current OTel Go LogProcessor design. I believe for other languages that are more "reference type oriented" this proposal may seem strange where it is not common to have so much care around memory allocations. However, I find it very important at least for C++, Go, Rust. My proposal is to update the specification to allow such SDK design/implementation for (at least) performance reasons. |
We can also say that if the "log record processor" is designed in such way that it accepts log records by value then it should be named "log record handler" (instead of "processor") to make the processing difference more visible. |
I wanted to add that I just don't like how Processor is seemingly different for Logs than Spans. I'd much rather see Log Processors match Span Processors. I also noticed Log Processors have similar language to what was once in Span Processor's spec but was removed, "The SDK MUST allow users to implement and configure custom processors and decorate built-in processors for advanced scenarios such as enriching with attributes." -- Span Processors also used to talk about "decorators" but it was removed because it was never fleshed out. Defining decorators, or whatever, and how individual pipelines are built for spans and logs that can mutate the data as it flows through the individual pipeline makes much more sense to me than to have Log Processors get a ReadWriteRecord. If nothing else maybe that "decorator" wording should be removed as it sounds like you are to use Processors themselves to "decorate" logs? Does anyone implement "decorators" on logs? I'd also expect the diagram for Log Processors to be different from Span Processors to make clear that log records may change. Right now it looks as if there are 2 Processors (simple and batch) each receiving the exact same LogRecord that they then pass to an exporter. I know it is be showing that the processor could be one or the other and not that they both are in use there, but I think it could be misinterpreted. |
This was discussed in spec meeting and also in the TC meeting. I believe the spec is not clear enough and can be interpreted in different ways leading to very different implementations (e.g. chained processors that can mutate data and see mutations of previous processor or parallel processors that don't see mutations of other processors). We likely need to fix the spec. It is in a stable doc but is vague enough that I think it is a bug and should be fixed. Since the spec is vague I am also worried that existing implementations in different language SDKs may have interpreted it differently and have made different choices. IMO, we need to start with auditing existing implementations and see where we stand currently and then decide how exactly we want to fix the spec (one of the options is that both "chained-allow-mutation" and "fanout-copy-to-parallel" should be possible like they are in the Collector pipelines). CC @jmacd @jack-berg if you want to add something. |
Thank you @tigrannajaryan, I think this is a good summary. I have been working on problems associated with Samplers and have identified a number of connected issues. There is a strong desire to improve the ways that we compose SDKs and we recognize the need to improve our specification for how (a) components are plugins for modifying telemetry as-it-happens, such that all observers see their side-effects, (b) components are composed into independent pipelines capable of filtering and mutating data in pipeline-specific ways, (c) SDKs are permitted to offer memory-optimized SDK-APIs that recognize both of these factors (e.g., use of structs vs references). I volunteered to work on this topic and will return with a proposal for fixing this bug in the spec. Thanks for bringing this up in the first place, @pellared. |
Notice that the implementations which use a "log record references" (like Java, JavaScript, Python, .NET) can also achieve independent processor pipeline by creating a processor decorator which makes a deep-copy of the record and pass it to the decorated processor. My proposal is to leave the implementation choice to the language SIG. |
During the 05/14/2024 Spec SIG meeting @jmacd summarized:
Should I create a PR to include this in the Logs SDK spec? I think this is inline with #4010 (comment) |
Before proposing anything to the spec I plan to also explore the consequences of defining the OnEmit(ctx context.Context, record Record) (Record, error) instead of OnEmit(ctx context.Context, record Record) error The the SDK would pass the record returned by a processor to the subsequent registered processor. |
Assigning myself per conversation with @jmacd. |
Personally, I find OnEmit(ctx context.Context, record Record) (Record, error) as a bad design as the user have two ways of returning a mutated log record
The one bootstrapping the logger provider would have to know how it working to make sure it is properly composed to achieved the desired log processing pipeline. Additionally, I feel it is strange that "sink processors" such as BatchProcessor nad SimpleProcessor would have to return anything. I feel that just relying on decorators for processing is more consistent. If the users would like to mutate the record for many processors they could simply use/create a FanOutProcessor: type FanoutProcessor struct {
processors []log.Processor
}
// Fanout distributes records to multiple log processors.
func Fanout(processors...log.Processor) *FanoutProcessor {
return &FanoutProcessor{
processors : processors,
}
}
func (p *FanoutProcessor) OnEmit(ctx context.Context, record Record) error {
for _, processor := range p.processors {
processor.OnEmit(ctx, record)
}
} Then the usage could look like following (mutate for all exporters): log.NewLoggerProvider(
log.WithProcessor(mutatingProcessor{ Fanout( // both exporters have mutated log records
log.NewSimpleProcessor(stdoutExporter),
log.NewBatchProcessor(otlpExporter),
}),
) Other users could still use the same log.NewLoggerProvider(
log.WithProcessor(mutatingProcessor{ log.NewSimpleProcessor(stdoutExporter) }), // STDOUT exporter with simple processor, the log records' body fields are redacted
log.WithProcessor(log.NewBatchProcessor(otlpExporter)), // OTLP exporter with a batch processor
) When following this design, the there is only one way of "processing/mutating" and the one who composes the log provider decides how the log processing pipeline should look like. I want to also say that I find such design idiomatic as it follows the design of https://pkg.go.dev/log/slog#Handler and creating pipelines is very similar. What is more, the official Go Wiki has /~https://github.com/samber/slog-multi in its Resources for slog which uses the same patterns for composing log pipelines/handlers. This is also the way how stable OTel C++ and alpha OTel Rust are designed. |
@open-telemetry/technical-committee, is this still
triage:deciding:community-feedback
EDIT: I created #4062 |
## Why Fixes #4010 (read the issue for more details) The SDK SHOULD offer a way to make independent log processing. I find it as a missing feature that should be covered by by the specification. ## What This PR tries to achieve it in the following way: - SDK SHOULD provide an isolating processor allowing independent log processing pipelines. Source: #4010 (comment) This approach should be solving the issue at least for Java, .NET, Python, JavaScript, Go. This proposal should not make any existing log SDKs not compliant with the specification. I think that the same approach could be also adopted in tracing for span processors (but I have not yet investigated it in depth). ## Remarks I created a separate issue for specifying whether (by default) a change to a record made in a processor is propagated to next registered processor. See #4065
## Why Fixes open-telemetry#4010 (read the issue for more details) The SDK SHOULD offer a way to make independent log processing. I find it as a missing feature that should be covered by by the specification. ## What This PR tries to achieve it in the following way: - SDK SHOULD provide an isolating processor allowing independent log processing pipelines. Source: open-telemetry#4010 (comment) This approach should be solving the issue at least for Java, .NET, Python, JavaScript, Go. This proposal should not make any existing log SDKs not compliant with the specification. I think that the same approach could be also adopted in tracing for span processors (but I have not yet investigated it in depth). ## Remarks I created a separate issue for specifying whether (by default) a change to a record made in a processor is propagated to next registered processor. See open-telemetry#4065
What are you trying to achieve?
For Go Logs SDK, we want to have each registered processor to work as a separate, independent log record processing pipeline.
The records passed to
OnEmit
are passed by value (not by pointer). Therefore, any modification made to the record passed to a processor'sOnEmit
method will not be applied to the subsequent registered processor.Users can use processor decorators to pass a modified log record to a wrapped processor, e.g.:
Thanks to it, the users can register two pipelines which do not interfere with each other. The user can have one pipeline which redacts secrets and exports (without batching) to stdout and then a second pipeline which exports via OTLP with batching (and without redaction):
What is more the Processor could be used then also to use it for sampling decisions. E.g. this is how a "minimum severity level processor" could look like: /~https://github.com/open-telemetry/opentelemetry-go/blob/65e032dd16b20e376eae5a7a5956fa811783389b/sdk/log/min_sev.go
By having the Processors independent by default the user could still use different composition patterns (fan-in, fan-out, middleware) to get the desired processing pipeline similarly to what can be achieved e.g. using
slog.Handler
(example helpers: /~https://github.com/samber/slog-multi).At last, this design usually reduces the number of heap allocations (by 1 per each emit). If we pass a pointer to Record to allow mutations, then Go will allocate it on the heap (I was making a lot of experiments on the Processor interface and Record struct when working on open-telemetry/opentelemetry-go#4955).
Additional context.
The specification says that processors are part of the pipeline:
"part of a pipeline" is unclear.
The specification requires that log processors be provided a
LogRecordProcessor
:Does it mean that a modification done in one registered exporter MUST be applied (and visible) to the next registered processor?
If so then the OTel Go Logs SDK design would be against the specification. But then, how the user can have different processing for different exporters?
Or does it mean that we only mean to be able to modify the record locally so that e.g. we can pass it to a decorated processor?
If so then should we clarify the specification?
The proposed design is more usable, safer, more performant (for Go) compared to a design where we a log record is passed by a pointer instead passing by value. I believe that the proposed design is "in the spirit" of OpenTelemetry even though it would differ from the "tracing processors" design.
It is worth mentioning that this is the way how stable OTel C++ and alpha OTel Rust designed the Log Processors for performance reasons (to avoid unnecessary allocations).
Side note: This design would also allow more flexibility on handling
Logger.Enabled
if it would be supported in SDK by addingEnabled
method toProcessor
(related to #3917).Related issue: open-telemetry/opentelemetry-go#5219
Proposed solution
My proposal is to update the specification to allow such SDK design/implementation for (at least) performance reasons.
Alternative solution.
Even currently it should be possible to have different processing for different exporters and a processor for sampling. Nothing prevents users to create a processor decorator and "clone" records before calling wrapped processor (to mutate only for a "single" pipeline) or not call a wrapped processor (to achieve sampling).
I consider it more like a hack and for Go such design would be less performant as it would always cause the log record to be allocated on the heap.
I believe for other languages that are more "reference type oriented" it is not common to have so much care around memory allocations. However, I find it very important at least for C++, Go, Rust.
The text was updated successfully, but these errors were encountered: