-
Notifications
You must be signed in to change notification settings - Fork 896
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add OTEL_TRACE_SAMPLER_TRACEIDRATIO_RATIO env variable definition #1190
Add OTEL_TRACE_SAMPLER_TRACEIDRATIO_RATIO env variable definition #1190
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - need typo (i think) fix. Rename OTEL_TRACE_SAMPLING_PROBABILITY to OTEL_TRACE_SAMPLER_TRACEIDRATIO_RATIO
The following from the spec makes this clear already right? SDK's may chose to support ENV var based option. Its not a MUST have. If they support it, then use the names from the spec. (I am happy with the wordings :) ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but a few typos & nits that should be fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
@@ -9,7 +9,8 @@ The goal of this specification is to unify the environment variable names betwee | |||
| OTEL_RESOURCE_ATTRIBUTES | Key-value pairs to be used as resource attributes | | See [Resource SDK](./resource/sdk.md#specifying-resource-information-via-an-environment-variable) for more details. | | |||
| OTEL_LOG_LEVEL | Log level used by the SDK logger | "info" | | | |||
| OTEL_PROPAGATORS | Propagators to be used as a comma separated list | "tracecontext,baggage" | Values MUST be deduplicated in order to register a `Propagator` only once. Unrecognized values MUST generate a warning and be gracefully ignored. | | |||
| OTEL_TRACE_SAMPLER | Sampler to be used for traces | "parentbased_always_on" | See [Sampling](./trace/sdk.md#sampling) | | |||
| OTEL_TRACE_SAMPLER | Sampler to be used for traces | "parentbased_always_on" | See [Sampling](./trace/sdk.md#sampling) | | |||
| OTEL_TRACE_SAMPLER_TRACEIDRATIO_RATIO | Sampling probability, a number in the [0..1] range, e.g. 0.25 | 0.01 | The specified value will only be used if OTEL_TRACER_SAMPLER is set to "traceidratio" or "parentbased_traceidratio" | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand how we got here, but isn't it way too long and obtuse? This doesn't feel very intuitive for users IMO - we even describe in the next column as Sampling probability. It seems like we want this to be called sampling probability and are trying too hard not to in some sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This a tricky one as, I suspect, there's no perfect solution. The previous name didn't suit anybody, but I think we are open to good alternatives.
(An alternative is to provide a general-purpose OTEL_TRACE_SAMPLER_PROBABILITY
which would work on any probability-based Sampler
, but that can get easily tricky IMHO).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think random_sample_rate / probability type of names could make sense, key word being random. Users don't really care whether the random is based on an inline computed random number or a random number that happens to be in the the trace id, so I would focus on that in the name. I don't think OTel needs to officially support more than one random sampler ever does it? Maybe this does bring up that the current traceidratio sampler is too specifically named and could just be "random". Seems like a nice change for users to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original name of TraceIdRatioBased was Probability, and it was deliberately renamed to say what it does, see #611. It is not really random. If you have many spans in the same trace, all will have the same sampling decision. Thus, TraceIdRatioBased will not help you against bad actors that send you requests with a known-sampled trace ID (related #1188).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see - well the main point that came up for me here is this name has trace appear twice, ratio appear twice, so lots of stuttering, and doesn't seem to indicate that it is picking a random sampling rate, one of the most common settings a tracing user applies in my experience, after service name. It just seems confusing.
(An alternative is to provide a general-purpose OTEL_TRACE_SAMPLER_PROBABILITY which would work on any probability-based Sampler, but that can get easily tricky IMHO).
Perhaps this is still an ok approach to make clear what the variable does? We don't actually automatically enable the sampler without the other env var too so it seems ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TC discussed this on the call today and the recommendation is to follow the pattern used in Jaeger SDK, which has two properties, SAMPLER_TYPE and SAMPLER_PARAM. The PARAM is numeric and is interpreted by the specific sampler as necessary, however for even more flexibility PARAM can be treated as a string, which allows it to even be JSON if a sampler requires more complex configuration.
I think @carlosalberto has an action item to propose alternative PR(s).
Ready for review. @cijothomas I created #1193 as a follow-up 😉 |
I suggest to close this in favor of #1202 which has more approvals and IMO is a nicer solution. |
Fixes #1105
Opened on behalf of #1117 - a few of the pending issues to discuss/address:
OTEL_PROPAGATORS
) are meant to be primarily used by Auto-Instrumentation, and not by SDKs yet.cc @jkwatson @cijothomas @Oberon00