Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

timers: run nextTicks after each immediate and timer #22842

Conversation

apapirovski
Copy link
Member

@apapirovski apapirovski commented Sep 13, 2018

In order to better match browser behaviour, run nextTicks (and subsequently the microtask queue) after each individual Timer and Immediate, rather than after the whole list is processed. The current behaviour is somewhat of a performance micro-optimization and also partly dictated by how timer handles were implemented.

This will have a significant impact on how promises are resolved/rejected (due to changes to the microtask queue timing) which is IMO a reasonably important change.

@Fishrock123 already started some work on this in #22316 but I spun this off into a separate PR as I had good understanding of what needed to happen to actually pass the tests.

I still likely need to update the docs and write a few tests. This is also definitely semver-major. This will also need a CitGM run.

Fixes: #22257

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines

@nodejs-github-bot nodejs-github-bot added the timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout. label Sep 13, 2018
@apapirovski apapirovski added the semver-major PRs that contain breaking changes and should be released in the next major version. label Sep 13, 2018
@apapirovski
Copy link
Member Author

apapirovski commented Sep 13, 2018

Ok, looks like test/parallel/test-domain-with-abort-on-uncaught-exception.js might be made flaky by this change. I'll have to dig into that test. Doesn't fail for me locally and it succeeded in the full CI hence the flaky assumption.

Stress test: https://ci.nodejs.org/job/node-stress-single-test/2014/ and against master: https://ci.nodejs.org/job/node-stress-single-test/2015/

Edit: Also parallel/test-child-process-ipc-next-tick might be made flaky by this. I'm guessing that these are both related to the change in timers.

@mscdex
Copy link
Contributor

mscdex commented Sep 13, 2018

Can we get some benchmarks to show just what kind of performance hit we should expect from this?

@apapirovski
Copy link
Member Author

@mscdex Yup, I'm hoping to first figure out if we can get all the tests to pass and CitGM to work before I go down the road of trying to figure out performance and maybe optimize that.

@joaocgreis
Copy link
Member

@apapirovski had to abort the CitGM run to restart Jenkins, two workers didn't finish. Can you restart it if needed? I started the master run after rebooting, link above is correct.

lib/timers.js Outdated
@@ -264,7 +264,7 @@ function listOnTimeout(list, now) {

debug('timeout callback %d', msecs);

var diff, timer;
var diff, ran, timer;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ran what? :)

Copy link
Member Author

@apapirovski apapirovski Sep 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kno, I kno 😆 I'll make it more descriptive. But to answer the question: an iteration of the loop right after it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

= false?

@apapirovski
Copy link
Member Author

apapirovski commented Sep 14, 2018

There are currently 3 test failures. They are:

  1. test/sequential/test-http-keepalive-maxsockets.js (see test: move test-http-keepalive-maxsockets to sequential #16777)
    This is a flaky test that was moved to sequential. I'll dig in but I assume this test has a weird interaction with when nextTicks are executed. I think this test is flaky and needs fixing, rather than it highlighting anything wrong with this PR.

  2. test/parallel/test-child-process-ipc-next-tick.js
    This one might be showcasing a real bug. I will need to confirm.

  3. test/parallel/test-domain-with-abort-on-uncaught-exception.js
    I don't even know where to start but again, seems like potentially a real bug.

@apapirovski apapirovski force-pushed the next-ticks-after-each-timer-and-immediate branch from e7b20ce to 82e1515 Compare September 14, 2018 02:23
@apapirovski
Copy link
Member Author

apapirovski commented Sep 14, 2018

CI: https://ci.nodejs.org/job/node-test-pull-request/17179/
CitGM: https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/1544/

I believe that all 3 test failures above were related to incorrect error resolution inside processImmediates. I had to make some changes for the way the new nextTicks are run.

@starkwang
Copy link
Contributor

Should we add some documents about this behavior change?

@apapirovski
Copy link
Member Author

apapirovski commented Sep 14, 2018

I think we're in the clear on tests now so here's benchmarks:

https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/238/

I will be working on new tests for this behaviour and documentation later today or tomorrow. I'll also follow up with perf improvements as necessary.

Results:

 timers/immediate.js type='breadth1' n=5000000                         *     -3.25 %       ±2.61%  ±3.44%  ±4.41%
 timers/immediate.js type='breadth4' n=5000000                        **     -2.74 %       ±1.74%  ±2.30%  ±2.95%
 timers/immediate.js type='breadth' n=5000000                        ***     -7.91 %       ±2.09%  ±2.75%  ±3.53%
 timers/immediate.js type='clear' n=5000000                           **     -2.48 %       ±1.64%  ±2.17%  ±2.78%
 timers/immediate.js type='depth1' n=5000000                         ***     -3.80 %       ±1.39%  ±1.84%  ±2.36%
 timers/immediate.js type='depth' n=5000000                            *     -1.19 %       ±1.18%  ±1.55%  ±1.99%
 timers/set-immediate-breadth-args.js n=5000000                      ***     -5.80 %       ±1.83%  ±2.41%  ±3.09%
 timers/set-immediate-breadth.js n=10000000                          ***     -6.23 %       ±1.88%  ±2.48%  ±3.18%
 timers/set-immediate-depth-args.js n=5000000                        ***     -2.43 %       ±0.90%  ±1.19%  ±1.53%
 timers/timers-breadth.js n=5000000                                  ***     -7.07 %       ±3.33%  ±4.39%  ±5.64%
 timers/timers-cancel-pooled.js n=5000000                                    -5.77 %       ±7.09%  ±9.35% ±12.01%
 timers/timers-cancel-unpooled.js direction='end' n=1000000                   9.24 %      ±10.52% ±13.87% ±17.81%
 timers/timers-cancel-unpooled.js direction='start' n=1000000                 2.14 %       ±5.43%  ±7.16%  ±9.19%
 timers/timers-depth.js n=1000                                         *     -0.06 %       ±0.06%  ±0.08%  ±0.10%
 timers/timers-insert-pooled.js n=5000000                                    -1.71 %       ±2.13%  ±2.81%  ±3.61%
 timers/timers-insert-unpooled.js direction='end' n=1000000                  -0.99 %       ±1.39%  ±1.84%  ±2.36%
 timers/timers-insert-unpooled.js direction='start' n=1000000                -2.18 %       ±2.78%  ±3.66%  ±4.70%
 timers/timers-timeout-pooled.js n=10000000                          ***    -38.68 %       ±2.94%  ±3.88%  ±4.99%
 timers/timers-timeout-unpooled.js n=1000000                         ***    -37.78 %       ±2.35%  ±3.11%  ±3.99%

Given the perf improvements here #20555, I think we can live with the timers-timeout regressions but I'll be doing some work to minimize it.

let refCount = 0;
if (queue !== outstandingQueue) {
queue.head = queue.tail = null;
immediateInfo[kHasOutstanding] = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could leave a comment here about how this = null thing works? it's... pretty difficult to read/understand.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should disallow that in the lint rules. please split it into two.

lib/timers.js Outdated
if (ran)
runNextTicks();
else
ran = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto on comments here and above where we use the same pattern, easy to misunderstand

Copy link
Contributor

@Fishrock123 Fishrock123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine to me, but it certainly confusing, implementation-wise.

We should probably CITGM this, it seems slightly ecosystem-risky to me? Don't think it's necessarily major by default though.

Edit: Re - benchmarks; maybe we can do some extra optimization there. I'll try to take profiles.

@Fishrock123
Copy link
Contributor

oof, this change is.. kinda expensive.

Here are the numbers I got for timers/timers-timeout-pooled.js n=10000000

  • master: 14,202,832.177390624
  • this patch: 5,709,151.565629168

Profiles:

@apapirovski
Copy link
Member Author

^ that seems worse than the benchmark CI run. But it's not generally surprising, it now has to run nextTicks on each timer which necessitates crossing over to C++ for each timer. The question is how frequent of an occurrence this is and whether this is a sacrifice we're willing to make for correctness.

@mscdex
Copy link
Contributor

mscdex commented Sep 18, 2018

Perhaps running these changes against some real-world scenarios (e.g. acmeair, etc.) might help answer some of this?

@Fishrock123
Copy link
Contributor

The question is how frequent of an occurrence this is

Promise-based code is the most likely candidate here. the question is, do people really do lots of async promise stuff in timeouts? I don't really know but I do feel like I usually don't do much async stuff in timeouts.

@devsnek devsnek mentioned this pull request Sep 19, 2018
3 tasks
@apapirovski
Copy link
Member Author

I'm planning to revisit this again over the weekend. Hopefully tidy up some of the perf and get some clearer practical results.

@apapirovski apapirovski force-pushed the next-ticks-after-each-timer-and-immediate branch from 82e1515 to ff24a20 Compare October 13, 2018 17:25
@apapirovski
Copy link
Member Author

@apapirovski apapirovski force-pushed the next-ticks-after-each-timer-and-immediate branch from 88792fe to b6e4f68 Compare October 13, 2018 18:11
@sam-github
Copy link
Contributor

I think this is a good idea, it simplifies the API and makes it much more consistent.

nodejs/nodejs.org#1804 for example describes some confusion about when the tick queue is cleared, and some attempts to make the docs describe the behaviour.

Does this PR clear the tick queue only after timer callbacks, or does it do so after immediates are run, too? Both would be useful, IMO.

@Trott
Copy link
Member

Trott commented Oct 18, 2018

Landed in e7af983

@Trott Trott closed this Oct 18, 2018
refack pushed a commit that referenced this pull request Oct 18, 2018
Check that $TRAVIS_COMMIT_RANGE is set before trying to lint commit
messages in Travis CI.

Refs: #23572 (comment)
Refs: #22842 (comment)

PR-URL: #23725
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Matheus Marchini <mat@mmarchini.me>
addaleax pushed a commit that referenced this pull request Oct 20, 2018
Check that $TRAVIS_COMMIT_RANGE is set before trying to lint commit
messages in Travis CI.

Refs: #23572 (comment)
Refs: #22842 (comment)

PR-URL: #23725
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Matheus Marchini <mat@mmarchini.me>
@apapirovski apapirovski mentioned this pull request Oct 21, 2018
4 tasks
jasnell pushed a commit that referenced this pull request Oct 21, 2018
In order to better match the browser behaviour, run nextTicks (and
subsequently the microtask queue) after each individual Timer and
Immediate, rather than after the whole list is processed. The
current behaviour is somewhat of a performance micro-optimization
and also partly dictated by how timer handles were implemented.

PR-URL: #22842
Fixes: #22257
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Gus Caplan <me@gus.host>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
jasnell pushed a commit that referenced this pull request Oct 21, 2018
Check that $TRAVIS_COMMIT_RANGE is set before trying to lint commit
messages in Travis CI.

Refs: #23572 (comment)
Refs: #22842 (comment)

PR-URL: #23725
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Matheus Marchini <mat@mmarchini.me>
MylesBorins pushed a commit that referenced this pull request Oct 30, 2018
Check that $TRAVIS_COMMIT_RANGE is set before trying to lint commit
messages in Travis CI.

Refs: #23572 (comment)
Refs: #22842 (comment)

PR-URL: #23725
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Matheus Marchini <mat@mmarchini.me>
MylesBorins pushed a commit that referenced this pull request Nov 26, 2018
Check that $TRAVIS_COMMIT_RANGE is set before trying to lint commit
messages in Travis CI.

Refs: #23572 (comment)
Refs: #22842 (comment)

PR-URL: #23725
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Matheus Marchini <mat@mmarchini.me>
rvagg pushed a commit that referenced this pull request Nov 28, 2018
Check that $TRAVIS_COMMIT_RANGE is set before trying to lint commit
messages in Travis CI.

Refs: #23572 (comment)
Refs: #22842 (comment)

PR-URL: #23725
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Matheus Marchini <mat@mmarchini.me>
rvagg pushed a commit that referenced this pull request Nov 28, 2018
Check that $TRAVIS_COMMIT_RANGE is set before trying to lint commit
messages in Travis CI.

Refs: #23572 (comment)
Refs: #22842 (comment)

PR-URL: #23725
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Matheus Marchini <mat@mmarchini.me>
MylesBorins pushed a commit that referenced this pull request Nov 29, 2018
Check that $TRAVIS_COMMIT_RANGE is set before trying to lint commit
messages in Travis CI.

Refs: #23572 (comment)
Refs: #22842 (comment)

PR-URL: #23725
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Matheus Marchini <mat@mmarchini.me>
wmfgerrit pushed a commit to wikimedia/mediawiki-services-parsoid that referenced this pull request Nov 29, 2018
Node.js v11.x introduced a difference with how timers are run that
results in some about ids in the blacklisted tests being off.  Needs
investigation but, regardless, should be fixed by T206222

See nodejs/node#22842

Bug: T206222
Change-Id: I9f0f906390d763ea52c61c9ae586da13172d687e
@OmarDarwish
Copy link

I tried to clarify how this behaves before this change landed in v11 nodejs/nodejs.org#1804

bughit referenced this pull request in nodejs/nodejs.org Mar 18, 2019
It was a little unclear for me when the nextTickQueue is processed (see this StackOverflow question). After some experimentation and more googling, re-write to make it more clear.
bughit added a commit to bughit/nodejs.org that referenced this pull request Mar 18, 2019
Deduplication section appears to be describing how things worked prior to nodejs/node#22842, so is no longer correct. The output in 11.12 is 

```
setImmediate 1
process.nextTick a
process.nextTick b
setImmediate 2
process.nextTick a
process.nextTick b
```
@ghost
Copy link

ghost commented Mar 19, 2019

@apapirovski:Would you mind reviewing the article nodejs/nodejs.org#2158, if you are free

ghost pushed a commit to nodejs/nodejs.org that referenced this pull request Apr 4, 2019
…-nexttick.md' (#2158)

Deduplication section appears to be describing how things worked prior to nodejs/node#22842, so is no longer correct.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver-major PRs that contain breaking changes and should be released in the next major version. timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MacroTask and MicroTask execution order