Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(sdk-metrics-base): combine concurrent observable callback invocations #2785

Conversation

legendecas
Copy link
Member

Which problem is this PR solving?

As /~https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/api.md#asynchronous-counter-creation states:

OpenTelemetry API authors SHOULD define whether this callback function needs to be reentrant safe / thread safe or not.

This PR documents that the async observable callbacks SHOULD not be called concurrently in JavaScript.

When multiple (or single) metric readers initiate collect operation concurrently, the invocation of the callback of async observables is merged.

Fixes #2782

Short description of the changes

  • Add ObservableRegistry to combine observableCallbacks for multiple async metrics.
  • Call all ObservableCallbacks at once before metric collections.

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

  • Concurrently collect (2 collectors) with 1 observableCallback and 1 async metrics.
  • Concurrently collect (2 collectors) with 1 observableCallback and multiple async metrics (renamed with views).

Checklist:

  • Followed the style guidelines of this project
  • Unit tests have been added
  • Documentation has been updated

@legendecas legendecas requested a review from a team February 14, 2022 03:44
@codecov
Copy link

codecov bot commented Feb 14, 2022

Codecov Report

Merging #2785 (2116e45) into main (fd62fa4) will decrease coverage by 0.00%.
The diff coverage is n/a.

❗ Current head 2116e45 differs from pull request most recent head 2a2d3ff. Consider uploading reports for the commit 2a2d3ff to get more accurate results

@@            Coverage Diff             @@
##             main    #2785      +/-   ##
==========================================
- Coverage   93.43%   93.42%   -0.01%     
==========================================
  Files         159      161       +2     
  Lines        5452     5494      +42     
  Branches     1145     1149       +4     
==========================================
+ Hits         5094     5133      +39     
- Misses        358      361       +3     
Impacted Files Coverage Δ
...ckages/opentelemetry-sdk-metrics-base/src/utils.ts 79.41% <0.00%> (-16.59%) ⬇️
...y-sdk-metrics-base/src/state/AsyncMetricStorage.ts 95.45% <0.00%> (-4.55%) ⬇️
...ry-sdk-metrics-base/src/state/SyncMetricStorage.ts 100.00% <0.00%> (ø)
...metrics-base/src/state/MeterProviderSharedState.ts 100.00% <0.00%> (ø)
...try-sdk-metrics-base/src/state/MeterSharedState.ts 100.00% <0.00%> (ø)
...y-sdk-metrics-base/src/state/ObservableRegistry.ts 100.00% <0.00%> (ø)
...ckages/opentelemetry-sdk-metrics-base/src/Meter.ts 100.00% <0.00%> (+2.04%) ⬆️
...pentelemetry-sdk-metrics-base/src/MeterProvider.ts 62.50% <0.00%> (+5.00%) ⬆️
...emetry-core/src/platform/node/RandomIdGenerator.ts 93.75% <0.00%> (+6.25%) ⬆️

@@ -80,6 +80,9 @@ export interface Meter {

/**
* Creates a new `ObservableGauge` metric.
*
* The callback SHOULD not be called concurrently.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like something we are telling the user not to do when it is the SDK that actually calls the callback

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, maybe we can update the words to be like "The callback is been called only when it finishes the previous invocation or timeouts."

const viewDescriptor = createInstrumentDescriptorWithView(view, descriptor);
const aggregator = view.aggregation.createAggregator(viewDescriptor);
const storage = new SyncMetricStorage(viewDescriptor, aggregator, view.attributesProcessor);
// TODO: handle conflicts
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an issue for this TODO so it doesn't get forgotten

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is being worked at #2734


constructor(public resource: Resource) {}

getMeterSharedState(instrumentationLibrary: InstrumentationLibrary) {
// TODO: meter identity
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have an issue for meter identity that can be referenced here so it isn't forgotten?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll update the comment to include a reference to #2593.

const promise = Promise.all(Array.from(this._callbackMap.keys()).map(observableCallback => {
const observableResult = new ObservableResult();
this._observableResultMemo.set(observableCallback, observableResult);
// TODO: timeout with callback
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what this TODO means. Can you open an issue for it and reference it in the comment or make the comment more descriptive?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a clarification request on spec at open-telemetry/opentelemetry-specification#2295 for this problem.

I'll update the comment.

@legendecas
Copy link
Member Author

The spec has open-telemetry/opentelemetry-specification#2361 to define that:

Callback functions SHOULD be reentrant safe. The SDK expects to evaluate callbacks for each MetricReader independently.

We should follow the latest definition. Closing for now.

@legendecas legendecas closed this Mar 7, 2022
@legendecas legendecas deleted the metrics-ff/concurrent-collect branch May 13, 2022 02:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Define reentrant safety of JavaScript Metrics API callbacks
2 participants