-
Notifications
You must be signed in to change notification settings - Fork 642
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix instrument types in system metrics #2865
Conversation
Don't have much clue about metrics but the generated semantic conventions package provides some helper functions to creates these metrics, and they return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From specs:
I want to measure something (by reporting an absolute value):
- If the measurement values are non-additive, use an Asynchronous Gauge.
If the measurement values are additive:
- If the value is monotonically increasing - use an Asynchronous Counter.
- If the value is NOT monotonically increasing - use an Asynchronous UpDownCounter.
So I think the UpDownCounter
is correct no?
@xrmx @emdneto Thanks for taking a look at the PR. The issue is not with UpDownCounter, but the temporality used by ObservableUpDownCounter. From the definition, ObservableUpDownCounter is cumulative, while these metrics are delta.
Changing them from ObservableUpDownCounter to UpDownCounter should also fix the issue. I actually want to take back my words on using UpDownCounter. By checking the specification of UpDownCounter, it's more appropriate to use when the metrics are emitted using I understand it deviates from the semantic conventions, but I also want to ask for a second thought based on the actual collection process. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taking a second look, the issue makes sense. According to the specs, UpDownCounter is suitable because the metrics are additive and not monotonically increasing, but in practice, for our asynchronous measurements (connections, memory, thread_count, etc), the cumulative temporality of the ObservableUpDownCounter doesn't seem to fit in that situation. We want to report an absolute value (current snapshot) that is not monotonic and non-cumulative.
In java it's using ObservableGauge.
@bjrara, would you be open to adding a failing test showing how this is fixing the issue with the cumulative temporality?
There's also another issue: The specs do not reference the gc_count
metric, but I totally agree with that change (from async counter to async gauge).
If there is already another language going out of spec we should open an issue in the semconv repo, no? |
Thanks for your feedback. Sure, checking how to add a failing test on the issue. |
As discussed in SIG (09/26) -- @bjrara will try using the LastValue aggregation in a View to configure the SDK to output this metric as a gauge. If it doesn't work, we should bring the issue to the semconv SIG. |
The proposed solution works for me. Thanks to @emdneto and @aabmass for the suggestions. I will update the issue with the summary why we believe the existing instrument types are right, and how to work around to report metrics to systems that only accept delta. |
Description
This PR fixes the instrument types when creating the system metrics.
Metrics that are impacted:
Collected from psutil.net_connections is a snapshot of the current state of network connections, thus should be delta and is not monotonic.
Collected from python gc is the current collection counts, thus should be delta and is not monotonic.
Collected from psutil.Process.memory_info is a snapshot of the process’s memory usage at the time of the call, thus should be delta and is not monotonic.
Collected from psutil.Process.num_threads is the number of threads currently used by this process (non cumulative).
Collected from psutil.Process.num_fds is the number of file descriptors currently opened by this process (non cumulative).
Change the instrument types for the metrics above to
observable_gauge
for correction.Fixes #2861
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration
delta
temporality for all the metric types.Does This PR Require a Core Repo Change?
Checklist:
See contributing.md for styleguide, changelog guidelines, and more.