Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Labels Optional for CounterMetric::add #1032

Merged
merged 12 commits into from
May 13, 2020

Conversation

astorm
Copy link
Contributor

@astorm astorm commented May 7, 2020

Which problem is this PR solving?

This solves the problem described in #1031 -- where not providing labels to a counter metric results in an error.

Short description of the changes

This PR changes the method signature of the add method on the CounterMetric class to provide a default value for the labels parameter. This is done so that the hashLabels utility function does not throw an error when trying to pass undefined to Object.keys(labels);

This PR also adds a new unit test to exercise the default option.

@@ -133,7 +133,7 @@ export class CounterMetric extends Metric<BoundCounter> implements api.Counter {
* @param labels key-values pairs that are associated with a specific metric
* that you want to record.
*/
add(value: number, labels: api.Labels) {
add(value: number, labels: api.Labels = {}) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nice, it would be nice if you can do the same thing to measure.record (/~https://github.com/astorm/opentelemetry-js/blob/astorm/labels-optional/packages/opentelemetry-metrics/src/Metric.ts#L166)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just realized, Prometheus exporter would fail after this change. Especially below line due to toString method on undefined:

...record.descriptor.labelKeys.map(k => record.labels[k].toString())

I would prefer to fix it in this same PR and do some end to end manual testing. WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, in order to pass the build you need to make the same change in the API package.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the guidance and patience @mayurkale22 -- I'm new to the repo and the project and I'm still getting my sea legs.

Fixing this for the MeasureMetric in this PR makes sense. Fixing the prometheus exporter in this PR also makes sense. I presume the change in opentelemetry-api would be to the interfaces, and that also makes sense to do in this PR.

I'll ping -- you? -- via a PR comment when that's done. If there's more to The Process™ please let me know.

Best wishes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM, just add a comment on this thread when new changes are ready to review. I usually use this example to test against prom. exporter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mayurkale22 not sure why you think the prometheus exporter would fail on that line. Can you explain? I'm hesitant to approve changes to the _registerMetric routine in the prometheus exporter because the prom-client has some surprising behavior, and that is the logic we use to destroy and recreate the counters with the new values.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mayurkale22 not sure why you think the prometheus exporter would fail on that line. Can you explain?

Looks like @astorm already mention the reason, here-> #1032 (comment) (due to toString method on undefined label value).

I'm hesitant to approve changes to the _registerMetric routine in the prometheus exporter because the prom-client ....

I would suggest to add new test in prom. exporter with some LabelKeys but w/o Labels (values). You can reuse existing test (like this one), just dont bind the labels to metric.

@mayurkale22 mayurkale22 added bug Something isn't working size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels May 7, 2020
@astorm astorm marked this pull request as draft May 7, 2020 01:15
@astorm
Copy link
Contributor Author

astorm commented May 7, 2020

@mayurkale22 (or others) -- I've

  • updated the MeasureMetric.record signature so labels is optional with a default value of {} -- existing test coverage seems sufficient here, so I did not add an additional unit test.

  • updated interface Counter and interface Measure to make labels an optional parameter there as well

Regarding Prometheus -- I left a question in the gitter room about this, but I can't get the prometheus exporter package to build cleanly. When running npm install from the root, I end up with the following results. TLDR -- is seems like the prometheus package at master can't compile right now.

test/prometheus.test.ts:527:5 - error TS7027: Unreachable code detected.

527     done();
        ~~~~~~~


Found 2 errors.

npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! @opentelemetry/exporter-prometheus@0.7.0 compile: `npm run version:update && tsc -p .`
npm ERR! Exit status 1
npm ERR! 
npm ERR! Failed at the @opentelemetry/exporter-prometheus@0.7.0 compile script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR!     /Users/alanstorm/.npm/_logs/2020-05-07T03_59_24_975Z-debug.log
lerna info lifecycle @opentelemetry/exporter-collector@0.7.0~prepare: @opentelemetry/exporter-collector@0.7.0
lerna info lifecycle @opentelemetry/exporter-jaeger@0.7.0~prepare: @opentelemetry/exporter-jaeger@0.7.0
lerna info lifecycle @opentelemetry/exporter-zipkin@0.7.0~prepare: @opentelemetry/exporter-zipkin@0.7.0
lerna info lifecycle @opentelemetry/node@0.7.0~prepare: @opentelemetry/node@0.7.0
lerna info lifecycle @opentelemetry/shim-opentracing@0.7.0~prepare: @opentelemetry/shim-opentracing@0.7.0
lerna info lifecycle @opentelemetry/test-utils@0.7.0~prepare: @opentelemetry/test-utils@0.7.0
lerna info lifecycle @opentelemetry/web@0.7.0~prepare: @opentelemetry/web@0.7.0
lerna info lifecycle @opentelemetry/exporter-prometheus@0.7.0~prepare: Failed to exec prepare script.  

Before I go off on some yak shave, does anyone reading have any advice on how I might proceed to a clean build. One I have a clean build I can make a pass at fixing the prometheus exporter so it doesn't assume counter metrics have labels.

@codecov-io
Copy link

codecov-io commented May 7, 2020

Codecov Report

Merging #1032 into master will decrease coverage by 16.47%.
The diff coverage is n/a.

@@             Coverage Diff             @@
##           master    #1032       +/-   ##
===========================================
- Coverage   95.09%   78.61%   -16.48%     
===========================================
  Files         212       72      -140     
  Lines        8903     1791     -7112     
  Branches      801      252      -549     
===========================================
- Hits         8466     1408     -7058     
+ Misses        437      383       -54     
Impacted Files Coverage Δ
packages/opentelemetry-api/src/metrics/Metric.ts 100.00% <ø> (ø)
packages/opentelemetry-core/test/utils/url.test.ts 0.00% <0.00%> (-100.00%) ⬇️
...ckages/opentelemetry-core/test/platform/id.test.ts 0.00% <0.00%> (-100.00%) ⬇️
...s/opentelemetry-core/src/platform/node/sdk-info.ts 0.00% <0.00%> (-100.00%) ⬇️
...pentelemetry-core/src/platform/node/performance.ts 0.00% <0.00%> (-100.00%) ⬇️
...pentelemetry-core/test/internal/validators.test.ts 0.00% <0.00%> (-100.00%) ⬇️
...ntelemetry-core/test/trace/NoRecordingSpan.test.ts 0.00% <0.00%> (-100.00%) ⬇️
...telemetry-core/test/platform/hex-to-base64.test.ts 0.00% <0.00%> (-100.00%) ⬇️
...elemetry-core/test/trace/spancontext-utils.test.ts 0.00% <0.00%> (-100.00%) ⬇️
...lemetry-core/test/trace/ProbabilitySampler.test.ts 0.00% <0.00%> (-100.00%) ⬇️
... and 142 more

@mayurkale22
Copy link
Member

test/prometheus.test.ts:527:5 - error TS7027: Unreachable code detected.

I tried repro'd it locally, seems working fine for me.

@astorm
Copy link
Contributor Author

astorm commented May 7, 2020

@mayurkale22 OK -- I was able to work around (but not fix) my repo/learna issues and fix the Prometheus exporter for the case where A metric's been defined with set of label keys, but values are not provided when recording the metric (either by passing a {} to add, or by using this new default value and passing nothing to add)

My main interest in this PR is being able to create a simple counter with no labels and increment it. As this PR sits right now, I think we've accomplished that, but if there's more to be done here please let me know.

Also -- just to say it out loud -- there seems to be a non-symmetry to the API design that exists regardless of this PR. The fact you can define a set of label keys for a metric, but then not define values for those keys when recording a metric creates some ambiguity as far as what the correct behavior should be (Record with undefined values for that key(s)? Throw an error because the keys are there? Log an error but still record? Log an error but don't record?) I'm not sure if anything can/should be done about this, but if there is it feels outside the scope of this small PR.

Thanks again for considering this and for the help getting oriented in the repo.

@astorm astorm marked this pull request as ready for review May 7, 2020 13:26
@dyladan
Copy link
Member

dyladan commented May 7, 2020

@astorm since you're in a weird state I would suggest a complete clean and rebootstrap

MAKE SURE YOU HAVE COMMITTED ANYTHING YOU WANT TO KEEP BEFORE DOING THIS

# from the root of the project
lerna clean -y
git clean -fdx
npm install # This will cause the project to be built and linked by npm scripts

@astorm
Copy link
Contributor Author

astorm commented May 7, 2020

@dyladan Thank you for the advice -- I gave that a try and still ran similar issues, and I tried a fresh git clone in a separate directory and ran into similar issues. The fresh checkout makes me think this isn't a problem of the repo being in some bad state, but rather there's some you'all take for granted about your environments that I'm not aware of .

@dyladan question -- I noticed in the sample command you gave me that you're running lerna globally. I don't have lerna installed globally and was relying on the version installed in the project. Could you let me know what version of lerna you have installed globally so I can attempt to match your environment? (also -- what version of node are you using?)

@dyladan
Copy link
Member

dyladan commented May 7, 2020

Oh sorry yes I do have lerna installed globally. The issue you're seeing looks like a compilation issue. It's possible that the @types/node package updated and you just happened to be the unlucky person to be affected first.

$ lerna --version
3.20.2
$ node --version
v12.14.1

@dyladan
Copy link
Member

dyladan commented May 7, 2020

I am able to reproduce your issue on a complete rebuild.

@dyladan dyladan mentioned this pull request May 7, 2020
@astorm
Copy link
Contributor Author

astorm commented May 7, 2020

I am able to reproduce your issue on a complete rebuild.

Yay, I am not any more insane that any other software engineer :) And I learned the project's CircleCI tests don't include a full compile cycle, so that's a good thing to know.

Regardless of these build issues, this PR is ready for review.

@dyladan
Copy link
Member

dyladan commented May 7, 2020

The CI uses a cached set of dependencies because the dep install takes so long and takes so many resources. It's unfortunate, but it is the way it is. Changing any package.json causes the cache key to change and a full build in CI.

@dyladan dyladan added the API label May 7, 2020
@@ -184,7 +184,9 @@ export class PrometheusExporter implements MetricExporter {
*/
if (metric instanceof Counter) {
metric.remove(
...record.descriptor.labelKeys.map(k => record.labels[k].toString())
...record.descriptor.labelKeys.map(k =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what cases would record.labels[k] be missing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my manual testing last night this change prevented an error being thrown when a user instantiates a counter with a label key

const monotonicCounter = meter.createCounter('monotonic_counter', {
  monotonic: true,
  labelKeys: ['pid'],
  description: 'Example of a monotonic counter',
});

but then fails to provide a value for that key.

monotonicCounter.add(1, {})
monotonicCounter.add(1)    

@dyladan With my build issues sorted/understood I can (depending on how your conversation with @mayurkale22 goes) either add tests that make the use-case here clear or revert this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I understand. Instead of falling back to an empty string, I think I would prefer a filter on the list.

      metric.remove(
        ...record.descriptor.labelKeys
          .filter(k => record.labels[k] != null)
          .map(k => record.labels[k].toString())
      );

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is clever solution but feel like it won't work as expected.

AFAIK prom-client will throw "Invalid number of arguments" exception if length of keys != length of values. See: /~https://github.com/siimon/prom-client/blob/master/lib/util.js#L34

One option is to fall back on "value missing or unknown" as a default value when the corresponding value is not defined in any given key(s), similar to the description field. WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Option 2: Instead of blindly attaching our Labelkeys to prom-client's labelNames, we can filter the keys based on valid value.

This is what python has implemented: /~https://github.com/open-telemetry/opentelemetry-python/blob/master/ext/opentelemetry-ext-prometheus/src/opentelemetry/ext/prometheus/__init__.py#L146-L150

Copy link
Contributor Author

@astorm astorm May 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your feedback @mayurkale22

WDYT?

IT I've been wrangled into an bug fix that is just outside the scope of my original PR -- but I can live with that :)

Re: option 2 -- when you say

Option 2: Instead of blindly attaching our Labelkeys to prom-client's labelNames, we can filter the keys based on valid value.

My interpretation of this is it's a suggestion that, when it comes time to create the Prometheus metric, we look at the keys and values of the passed in Open Telemetry metric, remove any keys with "non-valid" values (undefined, null, -- others?) and create the Prometheus metric with this new, smaller, set of keys.

If that's the suggestion -- I don't think this is the right course of action. Label keys seem like a fundamental part of a metric. That is -- a metric with the keys [foo, bar, baz] seems like a different metric than one with the keys [foo, bar]. Removing keys doesn't seems like the right course of action.

Give than, Option 1 seems like the better approach.

One option is to fall back on "value missing or unknown" as a default value when the corresponding value is not defined in any given key(s),

You didn't indicate whether you thought this value should be added when handling the Prometheus metric, or when recording the Open Telemetry metric. My assumption is the former, as having this extra work happen every time a metric is recorded would be excessive.

Given all that: I intend to change how Prometheus metrics are instantiated. If a value isn't present for a specific key, I will use a constant string of value missing for its value. I will add tests to cover this case. I'll check the spec for what constitutes a valid values, and make judgment call if there are ambiguities there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dyladan would like you know your thoughts on this?

IT I've been wrangled into an bug fix that is just outside the scope of my original PR

Perhaps we can merge this PR and address prometheus exporter issue in the separate PR (btw, this won't make master branch unstable but we need to apply fix before next release). If we decide to merge, I would suggest to revert prometheus related to changes from this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think I would rather see it as two separate PRs simply for the purpose of having better history later when we have to go spelunking through PRs to discover why decisions were made. Not a strong preference though. Agree the prom changes should be reverted if this is going to merge without a proper fix.

@@ -133,7 +133,7 @@ export class CounterMetric extends Metric<BoundCounter> implements api.Counter {
* @param labels key-values pairs that are associated with a specific metric
* that you want to record.
*/
add(value: number, labels: api.Labels) {
add(value: number, labels: api.Labels = {}) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mayurkale22 not sure why you think the prometheus exporter would fail on that line. Can you explain? I'm hesitant to approve changes to the _registerMetric routine in the prometheus exporter because the prom-client has some surprising behavior, and that is the logic we use to destroy and recreate the counters with the new values.

@astorm astorm marked this pull request as draft May 8, 2020 15:08
@astorm astorm marked this pull request as ready for review May 8, 2020 19:04
Copy link
Member

@dyladan dyladan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@mayurkale22
Copy link
Member

@open-telemetry/javascript-approvers We need more reviews for this one!

Copy link
Member

@obecny obecny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, one minor comment

packages/opentelemetry-metrics/test/Meter.test.ts Outdated Show resolved Hide resolved
@dyladan dyladan added the Merge:LGTM This PR is ready to be merged by a Maintainer (has enough valid approvals, successful build, etc.) label May 13, 2020
@dyladan dyladan removed the Merge:LGTM This PR is ready to be merged by a Maintainer (has enough valid approvals, successful build, etc.) label May 13, 2020
@mayurkale22 mayurkale22 merged commit 48de7ff into open-telemetry:master May 13, 2020
dyladan added a commit to dyladan/opentelemetry-js that referenced this pull request Sep 9, 2022
pichlermarc pushed a commit to dynatrace-oss-contrib/opentelemetry-js that referenced this pull request Dec 15, 2023
* fix: homepage links

* fix: add homepage for redis-4

Co-authored-by: Amir Blum <amir@aspecto.io>
martinkuba pushed a commit to martinkuba/opentelemetry-js that referenced this pull request Mar 13, 2024
* fix: homepage links

* fix: add homepage for redis-4

Co-authored-by: Amir Blum <amir@aspecto.io>
martinkuba pushed a commit to martinkuba/opentelemetry-js that referenced this pull request Mar 16, 2024
* fix: homepage links

* fix: add homepage for redis-4

Co-authored-by: Amir Blum <amir@aspecto.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants