Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Graceful Shutdown Support #1321

Merged
merged 69 commits into from
Aug 17, 2020
Merged
Show file tree
Hide file tree
Changes from 12 commits
Commits
Show all changes
69 commits
Select commit Hold shift + click to select a range
871d696
feat: graceful shutdown for tracing and metrics
jonahrosenblum Jul 16, 2020
da633ad
Merge remote-tracking branch 'upstream/master' into master
jonahrosenblum Jul 16, 2020
de161cc
fix: wording in test case
jonahrosenblum Jul 16, 2020
edd3b49
fix: typo
jonahrosenblum Jul 16, 2020
16c0faf
fix meterprovider config to use bracket notation
jonahrosenblum Jul 16, 2020
f0b9ef6
fix meterprovider config to use bracket notation
jonahrosenblum Jul 16, 2020
4bd85b5
fix: add callbacks to shutdown methods
jonahrosenblum Jul 17, 2020
86cc4e9
fix: merge conflict
jonahrosenblum Jul 17, 2020
5d93f6f
fix: merge conflict
jonahrosenblum Jul 17, 2020
9ad87e5
simplify meter shutdown code
jonahrosenblum Jul 17, 2020
1d3ea53
Merge remote-tracking branch 'upstream/master' into master
jonahrosenblum Jul 17, 2020
0a99520
fix: fix one-liner
jonahrosenblum Jul 17, 2020
791156b
private function name style fix
jonahrosenblum Jul 18, 2020
7d714b6
fix: naming of private member variables
jonahrosenblum Jul 18, 2020
b106b50
fix: graceful shutdown now works in browser
jonahrosenblum Jul 22, 2020
ba5be87
Merge remote-tracking branch 'upstream/master' into master
jonahrosenblum Jul 22, 2020
064c8e8
fix: window event listener will trigger once
jonahrosenblum Jul 22, 2020
54a82c5
fix: modify global shutdown helper functions
jonahrosenblum Jul 23, 2020
8fd53c8
fix: remove callback from remove listener args
jonahrosenblum Jul 23, 2020
b5af380
fix: change global shutdown function names and simplify functionality
jonahrosenblum Jul 23, 2020
e159546
fix: add rest of function refactoring and simplification
jonahrosenblum Jul 23, 2020
c0d3f5f
fix: remove unintended code snippet
jonahrosenblum Jul 23, 2020
9ceba0e
fix: refactor naming of listener cleanup function and fix sandbox issue
jonahrosenblum Jul 24, 2020
078802d
fix: make global shutdown cleanup local
jonahrosenblum Jul 24, 2020
adedac1
Merge remote-tracking branch 'upstream/master' into master
jonahrosenblum Jul 24, 2020
44a7275
fix: change interval of MeterProvider collection to ensure it does no…
jonahrosenblum Aug 3, 2020
a1bfb12
fix: resolve merge conflict
jonahrosenblum Aug 3, 2020
3dd51d2
Merge remote-tracking branch 'upstream/master' into master
jonahrosenblum Aug 3, 2020
d541837
chore: removing _cleanupGlobalShutdownListeners
obecny Aug 3, 2020
0163241
fix: remove unnecesary trace provider member function
jonahrosenblum Aug 4, 2020
95bb8bf
Merge remote-tracking branch 'upstream/master' into master
jonahrosenblum Aug 4, 2020
ef84611
Merge branch 'master' of /~https://github.com/open-telemetry/openteleme…
jonahrosenblum Aug 5, 2020
6f4de1e
Removing default span attributes (#1342)
aravinsiva Jul 27, 2020
5597957
feat: add baggage support to the opentracing shim (#918)
rubenvp8510 Jul 27, 2020
7c09806
Add nodejs sdk package (#1187)
dyladan Jul 27, 2020
b956e9c
feat: add OTEL_LOG_LEVEL env var (#974)
Jul 27, 2020
ab0de32
Proto update to latest to support arrays and maps (#1339)
obecny Jul 27, 2020
c4ffad7
chore: 0.10.0 release proposal (#1345)
dyladan Jul 27, 2020
32db1a9
fix: add missing grpc-js index (#1358)
dyladan Jul 28, 2020
db4745e
chore: 0.10.1 release proposal (#1359)
dyladan Jul 28, 2020
0a0a487
feat(api/context-base): change compile target to es5 (#1368)
markwolff Jul 31, 2020
802c5ad
Feat: Make ID generator configurable (#1331)
EdZou Jul 31, 2020
e6d49b4
fix: require grpc-js instead of grpc in grpc-js example (#1364)
reggiemcdonald Jul 31, 2020
5727e6e
chore(deps): update all non-major dependencies (#1371)
renovate-bot Jul 31, 2020
451c99c
chore: bump metapackage dependencies (#1383)
dyladan Aug 3, 2020
9336bfa
chore: 0.10.2 proposal (#1382)
dyladan Aug 3, 2020
bf46c1b
fix: remove unnecesary trace provider member function
jonahrosenblum Aug 4, 2020
62966b5
refactor(metrics): distinguish different aggregator types (#1325)
legendecas Aug 4, 2020
d07163a
Propagate b3 parentspanid and debug flag (#1346)
srjames90 Aug 4, 2020
f0323e6
feat: Export MinMaxLastSumCountAggregator metrics to the collector as…
davidwitten Aug 4, 2020
4d92d39
feat: Collector Metric Exporter for the Web (#1308)
davidwitten Aug 5, 2020
70bee58
Fix issues in TypeScript getting started example code (#1374)
mickdekkers Aug 5, 2020
ef046b9
chore: deploy canary releases (#1384)
dyladan Aug 6, 2020
5b14fb5
fix: addressing failed merge
jonahrosenblum Aug 7, 2020
73edc9a
Merge branch 'master' of /~https://github.com/jonahrosenblum/openteleme…
jonahrosenblum Aug 7, 2020
12b18fd
fix: protos pull
jonahrosenblum Aug 7, 2020
b0fdffd
fix: address marius' feedback
jonahrosenblum Aug 11, 2020
81f7dee
chore: merge with pr_1321
obecny Aug 11, 2020
93719b9
chore: deleting removeAllListeners from prometheus, fixing tests, cle…
obecny Aug 11, 2020
a0d309d
Merge branch 'master' into pr_1321_fix
obecny Aug 11, 2020
a28cfb0
Merge branch 'master' into master
obecny Aug 11, 2020
9c21dbb
Merge pull request #2 from obecny/pr_1321_fix
jonahrosenblum Aug 11, 2020
bfa1a0a
fix: add documentation and cleanup code
jonahrosenblum Aug 12, 2020
ec3ce12
fix: remove async label from shutdown and cleanup test case
jonahrosenblum Aug 12, 2020
49e1a69
fix: update controller collect to return promise
jonahrosenblum Aug 12, 2020
8afa9a1
fix: make downsides of disabling graceful shutdown more apparent
jonahrosenblum Aug 15, 2020
32439ab
Merge branch 'master' into master
jonahrosenblum Aug 17, 2020
529df9b
Merge branch 'master' into master
dyladan Aug 17, 2020
d5c430c
Merge branch 'master' into master
obecny Aug 17, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -184,15 +184,20 @@ describe('PrometheusExporter', () => {

describe('export', () => {
let exporter: PrometheusExporter;
let meterProvider: MeterProvider;
let meter: Meter;

beforeEach(done => {
exporter = new PrometheusExporter();
meter = new MeterProvider().getMeter('test-prometheus');
meterProvider = new MeterProvider();
meter = meterProvider.getMeter('test-prometheus', '1', {
exporter: exporter,
});
exporter.startServer(done);
});

afterEach(done => {
process.removeAllListeners('SIGTERM');
exporter.shutdown(done);
});

Expand Down Expand Up @@ -319,6 +324,69 @@ describe('PrometheusExporter', () => {
});
});

it('should export multiple labels on graceful shutdown', done => {
const counter = meter.createCounter('counter', {
description: 'a test description',
}) as CounterMetric;

counter.bind({ counterKey1: 'labelValue1' }).add(10);
counter.bind({ counterKey1: 'labelValue2' }).add(20);
counter.bind({ counterKey1: 'labelValue3' }).add(30);
process.once('SIGTERM', () => {
http
.get('http://localhost:9464/metrics', res => {
res.on('data', chunk => {
const body = chunk.toString();
const lines = body.split('\n');

assert.deepStrictEqual(lines, [
'# HELP counter a test description',
'# TYPE counter counter',
`counter{counterKey1="labelValue1"} 10 ${mockedTimeMS}`,
`counter{counterKey1="labelValue2"} 20 ${mockedTimeMS}`,
`counter{counterKey1="labelValue3"} 30 ${mockedTimeMS}`,
'',
]);

done();
});
})
.on('error', errorHandler(done));
});
process.kill(process.pid, 'SIGTERM');
});

it('should export multiple labels on manual shutdown', done => {
const counter = meter.createCounter('counter', {
description: 'a test description',
}) as CounterMetric;

counter.bind({ counterKey1: 'labelValue1' }).add(10);
counter.bind({ counterKey1: 'labelValue2' }).add(20);
counter.bind({ counterKey1: 'labelValue3' }).add(30);
meterProvider.shutdown(() => {
http
.get('http://localhost:9464/metrics', res => {
res.on('data', chunk => {
const body = chunk.toString();
const lines = body.split('\n');

assert.deepStrictEqual(lines, [
'# HELP counter a test description',
'# TYPE counter counter',
`counter{counterKey1="labelValue1"} 10 ${mockedTimeMS}`,
`counter{counterKey1="labelValue2"} 20 ${mockedTimeMS}`,
`counter{counterKey1="labelValue3"} 30 ${mockedTimeMS}`,
'',
]);

done();
});
})
.on('error', errorHandler(done));
});
});

it('should export a comment if no metrics are registered', done => {
exporter.export([], () => {
http
Expand Down Expand Up @@ -425,6 +493,7 @@ describe('PrometheusExporter', () => {

beforeEach(() => {
meter = new MeterProvider().getMeter('test-prometheus');
process.removeAllListeners('SIGTERM');
counter = meter.createCounter('counter') as CounterMetric;
counter.bind({ key1: 'labelValue1' }).add(10);
});
Expand Down
7 changes: 6 additions & 1 deletion packages/opentelemetry-metrics/src/Meter.ts
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,7 @@ export class Meter implements api.Meter {
private readonly _batcher: Batcher;
private readonly _resource: Resource;
private readonly _instrumentationLibrary: InstrumentationLibrary;
private readonly _controller: PushController;

/**
* Constructs a new Meter instance.
Expand All @@ -55,7 +56,7 @@ export class Meter implements api.Meter {
// start the push controller
const exporter = config.exporter || new NoopExporter();
const interval = config.interval;
new PushController(this, exporter, interval);
this._controller = new PushController(this, exporter, interval);
}

/**
Expand Down Expand Up @@ -309,6 +310,10 @@ export class Meter implements api.Meter {
return this._batcher;
}

async shutdown(): Promise<void> {
await this._controller.shutdown();
}

/**
* Registers metric to register.
* @param name The name of the metric.
Expand Down
18 changes: 18 additions & 0 deletions packages/opentelemetry-metrics/src/MeterProvider.ts
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,9 @@ export class MeterProvider implements api.MeterProvider {
logger: this.logger,
resource: this.resource,
});
if (this._config.gracefulShutdown) {
process.once('SIGTERM', this.shutdownAllMeters.bind(this));
}
}

/**
Expand All @@ -54,4 +57,19 @@ export class MeterProvider implements api.MeterProvider {

return this._meters.get(key)!;
}

shutdown(cb: () => void = () => {}) {
jonahrosenblum marked this conversation as resolved.
Show resolved Hide resolved
this.shutdownAllMeters().then(() => {
setTimeout(cb, 0);
});
}

private async shutdownAllMeters() {
jonahrosenblum marked this conversation as resolved.
Show resolved Hide resolved
if (this._config.exporter) {
this._config.exporter.shutdown();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this._config.exporter.shutdown() actually blocking, or is this a missing callback or missing async decoration?
It may be out of scope to fix, but an issue could be created to look into this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think exporter shutdown calls are async. Regardless I'm not sure if it matters if this is blocking?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the exporters don't do any batching so it should be ok to have the shutdown be sync. On the other hand, it may be worth adding to the API just to allow the flexibility in the future.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, the exporters could batch here: Say you have 1K unflushed spans, and the backend API supports 100 spans per batch call with a 500 ms limit between calls. -> shutdown would take 5 seconds.

I'm not sure it matters either. Perhaps something to potentially address in a future API design sprint.

}
await Promise.all(
jonahrosenblum marked this conversation as resolved.
Show resolved Hide resolved
Array.from(this._meters, ([_, meter]) => meter.shutdown())
);
}
}
20 changes: 14 additions & 6 deletions packages/opentelemetry-metrics/src/export/Controller.ts
Original file line number Diff line number Diff line change
Expand Up @@ -38,12 +38,20 @@ export class PushController extends Controller {
unrefTimer(this._timer);
}

private _collect() {
this._meter.collect();
this._exporter.export(this._meter.getBatcher().checkPointSet(), result => {
if (result !== ExportResult.SUCCESS) {
// @todo: log error
}
async shutdown() {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably add return value annotation

await this._collect();
}

private async _collect() {
await this._meter.collect().then(() => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're already using async-await, I find it more readable to un-nest then() calls:

    await this._meter.collect();
    await this._exporter.export(
        this._meter.getBatcher().checkPointSet(),
        result => {
          if (result !== ExportResult.SUCCESS) {
            // @todo: log error
          }
        }
    );

In general I find it confusing how some parts of OT uses promises and some parts use callbacks, but that's out of scope for this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I find it confusing how some parts of OT uses promises and some parts use callbacks, but that's out of scope for this PR.

This comes primarily from porting large parts of the implementation from open census, but writing other parts from scratch.

this._exporter.export(
this._meter.getBatcher().checkPointSet(),
result => {
if (result !== ExportResult.SUCCESS) {
// @todo: log error
}
}
);
});
}
}
4 changes: 4 additions & 0 deletions packages/opentelemetry-metrics/src/types.ts
Original file line number Diff line number Diff line change
Expand Up @@ -39,11 +39,15 @@ export interface MeterConfig {

/** Metric batcher. */
batcher?: Batcher;

/** Bool for whether or not graceful shutdown is enabled */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to highlight the pitfall of disabling the gracefulShutdown option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a little more emphasis about why it's good to enable graceful shutdown.

gracefulShutdown?: boolean;
}

/** Default Meter configuration. */
export const DEFAULT_CONFIG = {
logLevel: LogLevel.INFO,
gracefulShutdown: true,
};

/** The default metric creation options value. */
Expand Down
42 changes: 42 additions & 0 deletions packages/opentelemetry-metrics/test/MeterProvider.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -15,10 +15,15 @@
*/

import * as assert from 'assert';
import * as sinon from 'sinon';
import { MeterProvider, Meter, CounterMetric } from '../src';
import { NoopLogger } from '@opentelemetry/core';

describe('MeterProvider', () => {
afterEach(() => {
process.removeAllListeners('SIGTERM');
});

describe('constructor', () => {
it('should construct an instance without any options', () => {
const provider = new MeterProvider();
Expand Down Expand Up @@ -73,4 +78,41 @@ describe('MeterProvider', () => {
assert.notEqual(meter3, meter4);
});
});

describe('shutdown()', () => {
it('should call shutdown when SIGTERM is received', () => {
const meterProvider = new MeterProvider();
const sandbox = sinon.createSandbox();
const shutdownStub1 = sandbox.stub(
meterProvider.getMeter('meter1'),
'shutdown'
);
const shutdownStub2 = sandbox.stub(
meterProvider.getMeter('meter2'),
'shutdown'
);
process.once('SIGTERM', () => {
sinon.assert.calledOnce(shutdownStub1);
sinon.assert.calledOnce(shutdownStub2);
sandbox.restore();
});
process.kill(process.pid, 'SIGTERM');
});

it('should not trigger shutdown if graceful shutdown is turned off', () => {
const meterProvider = new MeterProvider({
gracefulShutdown: false,
});
const sandbox = sinon.createSandbox();
const shutdownStub = sandbox.stub(
meterProvider.getMeter('meter1'),
'shutdown'
);
process.once('SIGTERM', () => {
sinon.assert.notCalled(shutdownStub);
sandbox.restore();
});
process.kill(process.pid, 'SIGTERM');
});
});
});
11 changes: 11 additions & 0 deletions packages/opentelemetry-tracing/src/BasicTracerProvider.ts
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,9 @@ export class BasicTracerProvider implements api.TracerProvider {
logger: this.logger,
resource: this.resource,
});
if (this._config.gracefulShutdown) {
process.once('SIGTERM', this.shutdownActiveProcessor.bind(this));
}
}

getTracer(name: string, version = '*', config?: TracerConfig): Tracer {
Expand Down Expand Up @@ -99,4 +102,12 @@ export class BasicTracerProvider implements api.TracerProvider {
api.propagation.setGlobalPropagator(config.propagator);
}
}

shutdown(cb: () => void = () => {}) {
this.activeSpanProcessor.shutdown(cb);
}

private shutdownActiveProcessor() {
this.activeSpanProcessor.shutdown();
}
}
1 change: 1 addition & 0 deletions packages/opentelemetry-tracing/src/config.ts
Original file line number Diff line number Diff line change
Expand Up @@ -38,4 +38,5 @@ export const DEFAULT_CONFIG = {
numberOfLinksPerSpan: DEFAULT_MAX_LINKS_PER_SPAN,
numberOfEventsPerSpan: DEFAULT_MAX_EVENTS_PER_SPAN,
},
gracefulShutdown: true,
};
3 changes: 3 additions & 0 deletions packages/opentelemetry-tracing/src/types.ts
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,9 @@ export interface TracerConfig {

/** Resource associated with trace telemetry */
resource?: Resource;

/** Bool for whether or not graceful shutdown is enabled */
gracefulShutdown?: boolean;
}

/**
Expand Down
37 changes: 37 additions & 0 deletions packages/opentelemetry-tracing/test/BasicTracerProvider.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -27,13 +27,18 @@ import {
} from '@opentelemetry/core';
import { Resource } from '@opentelemetry/resources';
import * as assert from 'assert';
import * as sinon from 'sinon';
import { BasicTracerProvider, Span } from '../src';

describe('BasicTracerProvider', () => {
beforeEach(() => {
context.disable();
});

afterEach(() => {
process.removeAllListeners('SIGTERM');
});

describe('constructor', () => {
it('should construct an instance without any options', () => {
const provider = new BasicTracerProvider();
Expand Down Expand Up @@ -381,4 +386,36 @@ describe('BasicTracerProvider', () => {
assert.ok(tracerProvider.resource instanceof Resource);
});
});

describe('.shutdown()', () => {
it('should trigger shutdown when SIGTERM is recieved', () => {
const tracerProvider = new BasicTracerProvider();
const sandbox = sinon.createSandbox();
const shutdownStub = sandbox.stub(
tracerProvider.getActiveSpanProcessor(),
'shutdown'
);
process.once('SIGTERM', () => {
sinon.assert.calledOnce(shutdownStub);
sandbox.restore();
});
process.kill(process.pid, 'SIGTERM');
});

it('should not trigger shutdown if graceful shutdown is turned off', () => {
const tracerProvider = new BasicTracerProvider({
gracefulShutdown: false,
});
const sandbox = sinon.createSandbox();
const shutdownStub = sandbox.stub(
tracerProvider.getActiveSpanProcessor(),
'shutdown'
);
process.once('SIGTERM', () => {
sinon.assert.notCalled(shutdownStub);
sandbox.restore();
});
process.kill(process.pid, 'SIGTERM');
});
});
});
Loading