Adding (optional) async object lifecycle events + misc (#9)
* tidy up notation per comment from ofrobots

* tidy up wording

* adding some text on async object lifetime events

* added draft lifecycle events
mrkmarron authored and mike-kaufman committed Feb 21, 2018
1 parent cd26789 commit 4b91f1b
Showing 1 changed file with 144 additions and 81 deletions.
225 changes: 144 additions & 81 deletions Async-Context-Definitions.md
@@ -62,26 +62,25 @@ we begin by defining an _asynchronous function context_ (or context) as a
unique identifier. We only require that fresh instances of these
values can be generated on demand and compared for equality. In practice
monotonically increasing integer values provide a suitable representation.
For a given function _f_ we define the asynchronous context representation
of _f_ in context _i_ as _f<sub>i</sub>_.

Our definitions of asynchronous executions are based on four
binary relations over the executions of logically asynchronous JavaScript
functions:
- **execution** -- when a function _f_ is executed we create a unique
fresh context for it _c_ and use this as the `execution` context for
asynchronous events that happen during the execution of _f_.
- **link** -- when the execution of function _f_ in context _i_ stores a
second function _g_ in context _j_ for later asynchronous execution we say
_f<sub>i</sub>_ `links` _g<sub>j</sub>_.
- **causal** -- when the execution of a function _f_ in context _i_ is
logically responsible (according to the `host` API)
for causing the execution of a previously **linked** _g_ from context _j_
we say _f<sub>i</sub>_ `causes` _g<sub>j</sub>_.
- **happens before** -- when a function _f_ with execution context _i_
is asynchronously executed before a second function _g_ with execution
context _j_ then _i_ < _j_ and we say _f<sub>i</sub>_ `happens before`
_g<sub>j</sub>_.
- **execution** -- when a function _f_ is executed as an asynchronous
invocation we create a unique fresh context for it _c_ and use this as the
`execution` context for asynchronous events that happen during the execution
of _f_.
- **link** -- when the asynchronous execution of function _f_ in context _i_
stores a second function _g_ in context _j_ for later asynchronous execution
we say _f_ in context _i_ `links` _g_ with context _j_.
- **causal** -- when the asynchronous execution of a function _f_ in context
_i_ is logically responsible (according to the `host` API) for causing the
asynchronous execution of a previously **linked** _g_ from context _j_ we say
_f_ in context _i_ `causes` _g_ with context _j_.
- **happens before** -- when the asynchronous function _f_ with execution
context _i_ is asynchronously executed before a second asynchronous function
_g_ with execution context _j_ then _i_ < _j_ and we say _f_ in context _i_
`happens before` _g_ in context _j_.
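
As a small illustration of the context representation and the `happens before` relation, here is a minimal sketch. The names `makeContextGenerator`, `freshContext`, and `happensBefore` are hypothetical and are not part of the definitions in this document; the sketch only assumes that contexts are monotonically increasing integers, as suggested above.

```js
// Hypothetical helper: returns a function that hands out fresh contexts.
// Contexts are plain integers, so they can be compared for equality and order.
function makeContextGenerator() {
  let last = 0;
  return () => ++last;
}

const freshContext = makeContextGenerator();

// f is executed asynchronously first, so it receives the smaller context.
const i = freshContext();
// g is executed asynchronously later.
const j = freshContext();

// With monotonically increasing integers, `happens before` reduces to i < j.
const happensBefore = (a, b) => a < b;

console.log(i !== j, happensBefore(i, j));
```

Integers are only one possible representation; any value that can be generated fresh and compared would satisfy the definition.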

We define the following module code that provides the required functions to
explicitly mark APIs that expose asynchronous behavior from `host` code to
@@ -251,7 +250,8 @@ See a visualization of the above event stream [here](https://mike-kaufman.github
### Promise API
Similarly we can provide a basic promise API that supports asynchronous
context tracking by modifying the real promise implementation as follows:
**TODO** this is super rough.

**TODO** this needs to be checked carefully.

```js
function then(onFulfilled, onRejected) {
@@ -403,60 +403,114 @@ and the top of the desired long call-stack.

**TODO** add an example with sample code etc.

## Asynchronous Operation Metadata
In the previous sections we focused solely on tracking and emitting
information on the structure of the asynchronous call graph. However, most
applications, including our samples above, are interested in more than just the raw structure of this graph. While we cannot, and would not want to, identify
all possible data that could be needed and write it out to our log we can
(1) select a core of commonly useful information to include and (2) add a timestamp
that is shared with user logging code to allow the correlation of custom user
log data with the asynchronous event data we write.
## Asynchronous Object Lifecycle
In addition to the asynchronous execution behavior of an application, we are
often interested in the lifecycle events of the objects, such as promises or
event emitters, involved in this asynchronous execution. While this information
is critical for some applications, it can be expensive to compute and is not
universally required. Thus, we separate the tracking of this information out
into its own category and provide the following emit events, which can be
optionally enabled.
```js
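let idGen = 1; // next fresh id; must be `let` because createId increments it
const idMap = new WeakMap(); // object -> id, without keeping the object alive
function createId(obj) {
  idMap.set(obj, idGen++);
}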

In our definitions all events are emitted with a timestamp generated by
`generateNextTime`. To allow correlation between our emitted events and any
user logging we expose this method to user code so they can include correlated
timestamps in their logging.
function getId(obj) {
  return idMap.get(obj);
}

The other core metadata we track is split into two classes, `standard` and
`detailed`. The `standard` data is intended to include information that is
nearly universally useful and low cost to gather, while the `detailed` class
is for less universally relevant data or data that is expensive to capture.
* Standard:
  - Source/Line info for applicable events.
  - **TODO** other info?
* Detailed:
  - Callstack info for the applicable events.
  - **TODO** other info?
function createPromise(pobj) {
  createId(pobj);
  emit("createPromise", getId(pobj), generateNextTime());
}

## Enriched Terminology
The definitions in the _Terminology_ section provide basic asynchronous
lifecycle events but do not capture many important features, including
canceled events, failed rejections, and, for asynchronous events that
depend on environmental interaction, which external events may be relevant.
function resolvePromise(pobj) {
  emit("resolvePromise", getId(pobj), generateNextTime());
}

To support scenarios that require this type of information we extend the vocabulary
of events recorded in the asynchronous execution trace with the following hooks:
function rejectPromise(pobj, reason) {
  emit("rejectPromise", getId(pobj), reason, generateNextTime());
}

```
cancel(ctxf) {
emit("cancel", ctxf.linkCtx, generateNextTime());
function disposePromise(pobj, unhandled) {
  emit("disposePromise", getId(pobj), unhandled, generateNextTime());
}

externalCause(ctxf, data) {
emit("externalCause", ctxf.causeCtx, generateNextTime(), data);
function createEmitter(e) {
  createId(e);
  emit("createEmitter", getId(e), generateNextTime());
}

function disposeEmitter(e) {
  emit("disposeEmitter", getId(e), generateNextTime());
}
```

We first need a way to track the identity of promise and event emitter objects.
As a reference implementation we can use a `WeakMap` and a mechanism for
generating fresh ids (a monotonically increasing counter). However, JavaScript
engines, with knowledge of the underlying object representation and GC systems,
can provide optimized implementations of this function. For example, in runtimes
with non-moving collectors the underlying pointer address of an object is a
suitable and efficiently computable identifier.

A `create promise` event is emitted when a promise object is constructed,
whether via `Promise.reject`, `Promise.resolve`, or `new Promise`. This starts
the tracking of the lifecycle of the newly created promise.

A `resolve promise` event is emitted when a promise object is transitioned
from the pending state into the resolved state. For `Promise.resolve` this
happens immediately after the `create promise` event.

A `reject promise` event is emitted when a promise object is transitioned
from the pending state into the rejected state. For `Promise.reject` this
happens immediately after the `create promise` event. Since a promise can
reject for several reasons (it was created as a rejection, was explicitly
rejected, or hit an uncaught exception), this message also includes the cause
information.

A `dispose promise` event is emitted when a promise object will never be
involved in any other asynchronous execution. If the promise has an unhandled
rejection then we include this unhandled rejection information. This event will
always be emitted before the underlying object is collected but, if the
runtime/compiler can determine in advance that the promise is semantically
dead from an asynchronous viewpoint, then this message can be emitted earlier.

**Note:** This would alter the `unhandled reject` [semantics](https://nodejs.org/api/process.html#process_event_unhandledrejection)
which specify a turn of the event loop.

A `create emitter` event is emitted when an asynchronous event emitter
(an http source, a file, or any other object that can have listeners attached
to it) is created. This starts the tracking of the lifecycle of the newly
created emitter.

A `dispose emitter` event is emitted when an emitter object will never be
involved in any other asynchronous execution. This event will
always be emitted before the underlying object is collected but, if the
runtime/compiler can determine in advance that the emitter is semantically
dead from an asynchronous viewpoint, then this message can be emitted earlier.

failed(ctxf, reason) {
emit("fail", reason, generateNextTime());
## Enriched Terminology
The definitions in the previous sections provide basic asynchronous execution
and lifecycle events but do not capture many important features, including
canceled events or, for asynchronous events that
depend on environmental interaction, which external events may be relevant.

To support scenarios that require this type of information we extend the vocabulary
of events recorded in the asynchronous execution trace with the following hooks:
```js
function cancel(ctxf) {
  emit("cancel", ctxf.linkCtx, ctxf.causeCtx || -1, generateNextTime());
}

rejected(ctxf, reason, isException) {
ctxf.rejection = currentExecutingContext;
emit("rejected", reason, isException, generateNextTime());
function externalCause(ctxf, data) {
  emit("externalCause", ctxf.causeCtx, generateNextTime(), data);
}

unhandledReject(ctxf) {
emit("rejected", ctxf.rejection, generateNextTime());
function failedCallback(currentExecutingContext) {
  emit("failedCallback", currentExecutingContext, generateNextTime());
}
```

@@ -473,29 +527,38 @@ by registering them **and** by external data arriving to trigger their
execution. These entries provide additional context on this external data,
included in the `data` component of the message.

A `failed` event is emitted when a callback throws an uncaught exception
during its execution which results in an unhandled exception.

A `rejected` event is emitted when an [asynchronously executed]? promise
rejects and includes if the rejection was the result of an exception or
explicit reject.

An `unhandled rejection` event is emitted when an [asynchronously executed]?
promise rejects and is not handled within a turn of the event loop as per Node [semantics](https://nodejs.org/api/process.html#process_event_unhandledrejection).

**TODO** I think promise handling described here will miss something like a
naked `Promise.reject("nothing")` call which is an unhandled rejection but
just doesn't involve any asynchronous behavior. So, we may want to include
synchronous promise creation/execution somehow if this is important to us.
I think this just gets kicked back to sequential execution analysis but
may be worth discussing.

Using these definitions we can define the following states of an
asynchronous execution:
* A (sub)tree is in `active asynchronous execution` when there exists a child
node, link or causal, that has not completed.
* A (sub)tree has `retired asynchronous execution` when all child nodes,
link or causal, are in the completed state.
A `failedCallback` event is emitted when a callback throws an uncaught
exception during its execution which results in an unhandled exception.
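
One way a host might raise this event is sketched below. The wrapper `runCallback` is a hypothetical name introduced for this example, and `emit` and `generateNextTime` are stand-ins; the sketch assumes the host knows the currently executing context when it invokes the callback.

```js
// Stand-ins for illustration only: an in-memory log and a counter-based clock.
const log = [];
let clock = 0;
function generateNextTime() { return ++clock; }
function emit(...entry) { log.push(entry); }

// Hook as defined in this section.
function failedCallback(currentExecutingContext) {
  emit("failedCallback", currentExecutingContext, generateNextTime());
}

// Hypothetical host-side wrapper: run the callback, record a failure if it
// throws, and rethrow so the exception still surfaces as unhandled.
function runCallback(callback, executingContext) {
  try {
    return callback();
  } catch (err) {
    failedCallback(executingContext);
    throw err;
  }
}

let rethrown = null;
try {
  runCallback(() => { throw new Error("boom"); }, 7);
} catch (err) {
  rethrown = err;
}

console.log(log, rethrown.message);
```

Rethrowing keeps the observable behavior of the program unchanged; the event is purely an additional trace entry.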

## Asynchronous Operation Metadata
In the previous sections we focused solely on tracking and emitting
information on the structure of the asynchronous call graph and asynchronous
object lifecycles. However, most
applications, including our samples above, are interested in more than just the
raw structure of this graph. While we cannot, and would not want to, identify
all possible data that could be needed and write it out to our log we can
(1) select a core of commonly useful information to include and (2) add a timestamp
that is shared with user logging code to allow the correlation of custom user
log data with the asynchronous event data we write.

In our definitions all events are emitted with a timestamp generated by
`generateNextTime`. To allow correlation between our emitted events and any
user logging we expose this method to user code so they can include correlated
timestamps in their logging.
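
The correlation described here can be sketched as follows. The counter-based `generateNextTime`, the `emit` stand-in, and the `userLogLine` helper are assumptions made for this example; the document does not specify how timestamps are generated or how user logs are stored.

```js
// Stand-in clock shared by the tracing code and user logging.
let clock = 0;
function generateNextTime() { return ++clock; }

// Stand-in emit: for the events used below the timestamp is the last argument.
const asyncEvents = [];
function emit(name, ...args) {
  const time = args[args.length - 1];
  asyncEvents.push({ time, text: `${name} ${args.slice(0, -1).join(" ")}` });
}

// Hypothetical user logging helper that stamps entries with the shared clock.
const userLog = [];
function userLogLine(message) {
  userLog.push({ time: generateNextTime(), text: message });
}

emit("createPromise", 1, generateNextTime());
userLogLine("request received");
emit("resolvePromise", 1, generateNextTime());
userLogLine("response sent");

// Because both logs share one clock, they merge into a single ordered timeline.
const timeline = [...asyncEvents, ...userLog]
  .sort((a, b) => a.time - b.time)
  .map((entry) => entry.text);

console.log(timeline);
```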

The other core metadata we track is split into three classes: `standard`,
`detailed`, and `expensive`. The `standard` data is intended to include
information that is nearly universally useful and low cost to gather, the
`detailed` class is for less universally relevant data, and the `expensive`
class is for data that is expensive to capture.
* Standard:
  - Source/Line info for applicable events.
  - **TODO** other info?
* Detailed:
  - **TODO** other info?
* Expensive:
  - Callstack info for the applicable events.
  - **TODO** other info?
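
A rough sketch of how these classes could gate what is captured is shown below. The `LEVELS` table, the `collectMetadata` function, and the `sourceInfo` argument are hypothetical, and the sketch assumes the classes are cumulative (enabling `expensive` also includes `standard` data), which this document does not state.

```js
// Hypothetical ordering of the metadata classes, cheapest first.
const LEVELS = { standard: 0, detailed: 1, expensive: 2 };

// Hypothetical collector: `sourceInfo` is assumed to be supplied cheaply by the host.
function collectMetadata(level, sourceInfo) {
  // Standard: source/line info for the event.
  const meta = { source: sourceInfo.file, line: sourceInfo.line };

  // Expensive: capturing a call stack is only done when explicitly enabled.
  if (LEVELS[level] >= LEVELS.expensive) {
    meta.callstack = new Error().stack;
  }
  return meta;
}

const cheap = collectMetadata("standard", { file: "server.js", line: 42 });
const full = collectMetadata("expensive", { file: "server.js", line: 42 });

console.log(Object.keys(cheap), Object.keys(full));
```

The `detailed` class has no entries yet in the list above, so the sketch leaves it empty as well.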

## 5. Use Cases
Use cases for Async Context can be broken into two categories, **post-mortem** and