Adding (optional) async object lifecycle events + misc (#9)

* tidy up notation per comment from ofrobots
* tidy up wording
* adding some text on async object lifetime events
* added draft lifecycle events
mike-kaufman · Feb 21, 2018 · 4b91f1b
1 parent cd26789
commit 4b91f1b
Showing 1 changed file with 144 additions and 81 deletions.
diff --git a/Async-Context-Definitions.md b/Async-Context-Definitions.md
@@ -62,26 +62,25 @@ we begin by defining an _asynchronous function context_ (or context) as a
 unique identifier. We only require that fresh instances of these 
 values can be generated on demand and compared for equality. In practice 
 monotonically increasing integer values provide a suitable representation.
-For a given function _f_ we define the asynchronous context representation 
-of _f_ in context _i_ as _f<sub>i</sub>_. 
 
 Our definitions of asynchronous executions are based on four 
 binary relations over the executions of logically asynchronous JavaScript 
 functions:
- - **execution** -- when a function _f_ is executed we create a unique 
- fresh context for it _c_ and use this as the `execution` context for 
- asynchronous events that happen during the execution of _f_.
- - **link** -- when the execution of function _f_ in context _i_ stores a 
- second function _g_ in context _j_ for later asynchronous execution we say 
- _f<sub>i</sub>_ `links` _g<sub>j</sub>_. 
- - **causal** -- when the execution of a function _f_ in context _i_ is 
- logically responsible (according to the `host` API) 
- for causing the execution of a previously **linked** _g_ from context _j_ 
- we say _f<sub>i</sub>_ `causes` _g<sub>j</sub>_. 
- - **happens before** -- when a function _f_ with execution context _i_ 
- is asynchronously executed before a second function _g_ with execution 
- context _j_ then _i_ < _j_ and we say _f<sub>i</sub>_ `happens before` 
- _g<sub>j</sub>_.
+ - **execution** -- when a function _f_ is executed as an asynchronous 
+ invocation we create a unique fresh context for it _c_ and use this as the 
+ `execution` context for asynchronous events that happen during the execution 
+ of _f_.
+ - **link** -- when the asynchronous execution of function _f_ in context _i_ 
+ stores a second function _g_ in context _j_ for later asynchronous execution 
+ we say _f_ in context _i_ `links` _g_ with context _j_. 
+ - **causal** -- when the asynchronous execution of a function _f_ in context 
+ _i_ is logically responsible (according to the `host` API) for causing the 
+ asynchronous execution of a previously **linked** _g_ from context _j_ we say 
+ _f_ in context _i_ `causes` _g_ with context _j_. 
+ - **happens before** -- when the asynchronous function _f_ with execution 
+ context _i_ is asynchronously executed before a second asynchronous function 
+ _g_ with execution context _j_ then _i_ < _j_ and we say _f_ in context _i_
+ `happens before` _g_ in context _j_.
 
 We define the following module code that provides the required functions to
 explicitly mark API's that expose asynchronous behavior from `host` code to 
@@ -251,7 +250,8 @@ See a visualization of the above event stream [here](https://mike-kaufman.github
 ### Promise API
 Similarly we can provide a basic promise API that supports asynchronous 
 context tracking by modifying the real promise implementation as follows: 
-**TODO** this is super rough.
+
+**TODO** this needs to be checked carefully.
 
 ```js
 function then(onFulfilled, onRejected) {
@@ -403,60 +403,114 @@ and the top of the desired long call-stack.
 
 **TODO** add an example with sample code etc.
 
-## Asynchronous Operation Metadata
-In the previous sections we focused solely on tracking and emitting 
-information on the structure of the asynchronous call graph. However, most 
-applications, including our samples above, are interested in more than just the raw structure of this graph. While we cannot, and would not want to, identify 
-all possible data that could be needed and write it out to our log we can 
-(1) select core of commonly useful information to include and (2) add a timestamp 
-that is shared with user logging code to allow the correlation of custom user 
-log data with the asynchronous event data we write.
+## Asynchronous Object Lifecycle
+In addition to the asynchronous execution behavior of an application we are 
+often interested in the lifecycle events of the objects, such as promises or 
+event emitters, involved in this asynchronous execution. While this information 
+is critical for some applications it can be expensive to compute and is not 
+universally required. Thus, we separate the tracking of this information out 
+into its own category and provide the following emit events which can be 
+optionally enabled.
+```js
+let idGen = 1; // must be mutable: createId increments it
+const idMap = new WeakMap();
+function createId(obj) {
+  idMap.set(obj, idGen++);
+}
 
-In our definitions all events are emitted with a timestamp generated by 
-`generateNextTime`. To allow correlation between our emitted events and any 
-user logging we expose this method to user code so they can include correlated 
-timestamps in their logging. 
+function getId(obj) {
+  return idMap.get(obj);    
+}
 
-The other core metadata we track split it into two classes `standard` and 
-`detailed`. The `standard` data is intended to include information that is 
-nearly universally useful and low cost to gather while the `detailed` class 
-is for less universaly relevant data or data that is expensive to capture.
- * Standard:
-   - Source/Line info for applicable events.
-   - **TODO** other info?
- * Detailed:
-   - Callstack info the applicable events.
-   - **TODO** other info?
+function createPromise(pobj) {
+  createId(pobj);
+  emit("createPromise", getId(pobj), generateNextTime());
+}
 
-## Enriched Terminology
-The definitions in the _Terminology_ section provide basic asynchronous 
-lifecycle events but do not capture many important features including, 
-canceled events or failed rejections and, in cases of asynchronous events that 
-depend on environmental interaction, what external events may be relevant.
+function resolvePromise(pobj) {
+  emit("resolvePromise", getId(pobj), generateNextTime());
+}
 
-To support scenarios that require this type of information we extend the vocabulary 
-of events recorded in the asynchronous execution trace with the following hooks:
+function rejectPromise(pobj, reason) {
+  emit("rejectPromise", getId(pobj), reason, generateNextTime());
+}
 
-```
-cancel(ctxf) {
-    emit("cancel", ctxf.linkCtx, generateNextTime());
+function disposePromise(pobj, unhandled) {
+  emit("disposePromise", getId(pobj), unhandled, generateNextTime());
 }
 
-externalCause(ctxf, data) {
-    emit("externalCause", ctxf.causeCtx, generateNextTime(), data);
+function createEmitter(e) {
+  createId(e);
+  emit("createEmitter", getId(e), generateNextTime());
+}
+
+function disposeEmitter(e) {
+  emit("disposeEmitter", getId(e), generateNextTime());
 }
+```
+
+We first need a way to track the identity of promise and event emitter objects. 
+As a reference implementation we can use a `WeakMap` and a mechanism for 
+generating fresh ids (a monotonically increasing counter). However, JavaScript 
+engines, with knowledge of underlying object representation and GC systems, can 
+provide optimized implementations of this function. For example in runtimes with 
+non-moving collectors the underlying pointer address of an object is a suitable 
+and efficiently computable identifier.
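
As a usage sketch of the `WeakMap`-based identity scheme described above: the `makeIdTracker` wrapper, and its `idFor` and `peek` methods, are hypothetical names introduced only for illustration. It keeps the counter in a closure and assigns an id the first time an object is seen.

```javascript
// Hypothetical helper (not part of the proposal): wraps the WeakMap-based
// id scheme in a closure so the counter is not a shared global.
function makeIdTracker() {
  let nextId = 1;            // monotonically increasing counter
  const ids = new WeakMap(); // weak keys: tracking does not keep objects alive

  return {
    // Assign a fresh id on first sight; return the existing id otherwise.
    idFor(obj) {
      let id = ids.get(obj);
      if (id === undefined) {
        id = nextId++;
        ids.set(obj, id);
      }
      return id;
    },
    // Look up without assigning; undefined for untracked objects.
    peek(obj) {
      return ids.get(obj);
    },
  };
}

const tracker = makeIdTracker();
const p = Promise.resolve(1);
const q = Promise.resolve(2);
const idP = tracker.idFor(p);
const idQ = tracker.idFor(q);
```

Because the map holds its keys weakly, an id entry disappears along with the object it describes, so the tracker itself does not extend object lifetimes.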
+
+A `create promise` event is emitted when a promise object is constructed, 
+whether through `Promise.reject`, `Promise.resolve`, or `new Promise`. This 
+starts the tracking of the lifecycle of the newly created promise.
+
+A `resolve promise` event is emitted when a promise object is transitioned 
+from the pending state into the resolved state. For `Promise.resolve` this 
+happens immediately after the `create promise` event.
+
+A `reject promise` event is emitted when a promise object is transitioned 
+from the pending state into the rejected state. For `Promise.reject` this 
+happens immediately after the `create promise` event. Since a promise can 
+reject for several reasons (created as a reject, explicitly rejected, or 
+uncaught exception), this message also includes the cause information.
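
The event ordering for `Promise.reject` can be sketched as follows. This is a minimal illustration under stated assumptions: `emit` and `generateNextTime` are replaced by recording stand-ins, and `trackedReject` is a hypothetical instrumented counterpart of `Promise.reject`, not an API defined by this document.

```javascript
// Stand-ins for the document's `emit` and `generateNextTime`; both are
// assumptions made so the sketch runs on its own.
const events = [];
let clock = 0;
const generateNextTime = () => ++clock;
const emit = (...record) => events.push(record);

let nextId = 1;
const idMap = new WeakMap();
const createId = (obj) => idMap.set(obj, nextId++);
const getId = (obj) => idMap.get(obj);

function createPromise(pobj) {
  createId(pobj);
  emit("createPromise", getId(pobj), generateNextTime());
}

function rejectPromise(pobj, reason) {
  emit("rejectPromise", getId(pobj), reason, generateNextTime());
}

// Hypothetical instrumented counterpart of Promise.reject.
function trackedReject(reason) {
  const p = Promise.reject(reason);
  p.catch(() => {});        // attach a handler so the sketch does not trip
                            // the host's unhandled-rejection handling
  createPromise(p);         // lifecycle tracking starts here
  rejectPromise(p, reason); // follows the create event immediately
  return p;
}

trackedReject("nothing");
```

After the call, `events` holds a `createPromise` record followed directly by a `rejectPromise` record for the same id, with increasing timestamps.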
+
+A `dispose promise` event is emitted when a promise object will never be 
+involved in any other asynchronous execution. If the promise has an unhandled 
+rejection then we include this unhandled rejection information. This event 
+will always be emitted before the underlying object is collected, but if the 
+runtime/compiler can determine in advance that the promise is semantically 
+dead from an asynchronous viewpoint then this message can be emitted earlier.
+
+**Note:** This would alter the `unhandled reject` [semantics](https://nodejs.org/api/process.html#process_event_unhandledrejection) 
+which specify a turn of the event loop. 
+
+A `create emitter` event is emitted when an asynchronous event emitter 
+source (http, file, etc.), that is, an object that can have listeners 
+attached to it, is created. This starts the tracking of the lifecycle of the 
+newly created emitter.
+
+A `dispose emitter` event is emitted when an emitter object will never be 
+involved in any other asynchronous execution. This event will 
+always be emitted before the underlying object is collected, but if the 
+runtime/compiler can determine in advance that the emitter is semantically 
+dead from an asynchronous viewpoint then this message can be emitted earlier.
 
-failed(ctxf, reason) {
-    emit("fail", reason, generateNextTime());
+## Enriched Terminology
+The definitions in the previous sections provide basic asynchronous execution 
+and lifecycle events but do not capture many important features, including 
+canceled events or, in cases of asynchronous events that 
+depend on environmental interaction, what external events may be relevant.
+
+To support scenarios that require this type of information we extend the vocabulary 
+of events recorded in the asynchronous execution trace with the following hooks:
+```js
+function cancel(ctxf) {
+    emit("cancel", ctxf.linkCtx, ctxf.causeCtx || -1, generateNextTime());
 }
 
-rejected(ctxf, reason, isException) {
-    ctxf.rejection = currentExecutingContext;
-    emit("rejected", reason, isException, generateNextTime());
+function externalCause(ctxf, data) {
+    emit("externalCause", ctxf.causeCtx, generateNextTime(), data);
 }
 
-unhandledReject(ctxf) {
-    emit("rejected", ctxf.rejection, generateNextTime());
+function failedCallback(currentExecutingContext) {
+    emit("failedCallback", currentExecutingContext, generateNextTime());
 }
 ```
 
@@ -473,29 +527,38 @@ by registering them **and** by external data arriving to trigger their
 execution. These entries provide additional context on this external data, 
 included in the `data` component of the message.
 
-A `failed` event is emitted when a callback throws an uncaught exception 
-during its execution which results in an unhandled exception.
-
-A `rejected` event is emitted when an [asynchronously executed]? promise 
-rejects and includes if the rejection was the result of an exception or 
-explicit reject.
-
-An `unhandled rejection` event is emitted when an [asynchronously executed]? 
-promise rejects and is not handled within a turn of the event loop as per Node [semantics](https://nodejs.org/api/process.html#process_event_unhandledrejection).
-
-**TODO** I think promise handling described here will miss something like a 
-naked `Promise.reject("nothing")` call which is an unhandled rejection but 
-just doesn't involve any asynchronous behavior. So, we may want to include 
-synchronous promise creation/execution somehow if this is important to us. 
-I think this just gets kicked back to sequential execution analysis but 
-may be worth discussing.
-
-Using the these definitions we can define the following states of an 
-asynchronous execution:
- * A (sub)tree is in `active asynchronous execution` when there exists a child 
- node, link or causal, that has not completed.
- * A (sub)tree has `retired asynchronous execution` when all child nodes, 
- link of causal, are in the completed state.
+A `failedCallback` event is emitted when a callback throws an uncaught 
+exception during its execution which results in an unhandled exception.
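
One way a host might invoke the `failedCallback` hook is sketched below. The `runWithFailureTracking` wrapper is hypothetical, and `emit` and `generateNextTime` are recording stand-ins. The sketch also assumes that nothing further up the stack handles the exception, which is what makes it an unhandled one.

```javascript
// Stand-ins for the document's `emit` and `generateNextTime` (assumptions).
const events = [];
let clock = 0;
const generateNextTime = () => ++clock;
const emit = (...record) => events.push(record);

function failedCallback(currentExecutingContext) {
  emit("failedCallback", currentExecutingContext, generateNextTime());
}

// Hypothetical wrapper: runs `cb` under context `ctx`, emits failedCallback
// if it throws, then rethrows so the host still observes the exception.
function runWithFailureTracking(ctx, cb, ...args) {
  try {
    return cb(...args);
  } catch (err) {
    failedCallback(ctx);
    throw err;
  }
}

// Usage: the outer try/catch only keeps this sketch from terminating.
let rethrown = null;
try {
  runWithFailureTracking(7, () => {
    throw new Error("boom");
  });
} catch (err) {
  rethrown = err;
}
```

Rethrowing keeps the host's normal uncaught-exception behavior intact; the hook only records which context was executing when the failure occurred.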
+
+## Asynchronous Operation Metadata
+In the previous sections we focused solely on tracking and emitting 
+information on the structure of the asynchronous call graph and asynchronous 
+object lifecycles. However, most 
+applications, including our samples above, are interested in more than just the 
+raw structure of this graph. While we cannot, and would not want to, identify 
+all possible data that could be needed and write it out to our log we can 
+(1) select a core of commonly useful information to include and (2) add a timestamp 
+that is shared with user logging code to allow the correlation of custom user 
+log data with the asynchronous event data we write.
+
+In our definitions all events are emitted with a timestamp generated by 
+`generateNextTime`. To allow correlation between our emitted events and any 
+user logging we expose this method to user code so they can include correlated 
+timestamps in their logging. 
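
The correlation this enables can be sketched as follows. The `userLog` function and the two arrays are hypothetical, `generateNextTime` is modeled as a simple counter, and the sketch assumes the timestamp is the last field of each emitted record, as it is for the lifecycle events shown earlier.

```javascript
// Assumed shared clock: a simple counter stands in for generateNextTime.
let clock = 0;
function generateNextTime() {
  return ++clock;
}

const asyncEvents = []; // records written by the tracking hooks
const userLogs = [];    // records written by application code

function emit(...record) {
  asyncEvents.push(record);
}

// Hypothetical user-side logger that stamps entries with the shared clock.
function userLog(message) {
  userLogs.push({ time: generateNextTime(), message });
}

emit("createPromise", 1, generateNextTime());
userLog("request received");
emit("resolvePromise", 1, generateNextTime());

// Locate the async events that bracket a user log entry. This relies on the
// timestamp being the last field of each record.
const entry = userLogs[0];
const before = asyncEvents.filter((r) => r[r.length - 1] < entry.time);
const after = asyncEvents.filter((r) => r[r.length - 1] > entry.time);
```

Since both streams draw from the same clock, a user log entry can be placed unambiguously between the asynchronous events that surround it.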
+
+The other core metadata we track is split into three classes: `standard`, 
+`detailed`, and `expensive`. The `standard` data is intended to include 
+information that is nearly universally useful and low cost to gather, the 
+`detailed` class is for less universally relevant data, and the `expensive` 
+class is for data that is expensive to capture.
+ * Standard:
+   - Source/Line info for applicable events.
+   - **TODO** other info?
+ * Detailed:
+   - **TODO** other info?
+ * Expensive:
+   - Callstack info for applicable events.
+   - **TODO** other info?
 
 ## 5. Use Cases
 Use cases for Async Context can be broken into two categories, **post-mortem** and