Reduce likelihood of race conditions on keep-alive timeout calculatio… #52653
lgtm
ping @mweberxyz
lib/_http_agent.js
Outdated
// Let the timer expires before the announced timeout to reduce
// the likelihood of ECONNRESET errors
let serverHintTimeout = ( NumberParseInt(hint) * 1000 ) - 1000;
serverHintTimeout = serverHintTimeout > 0 ? serverHintTimeout : 0;

if (serverHintTimeout < agentTimeout) {
This conditional needs to be fixed - agentTimeout defaults to 0, so the serverHintTimeout is never being set.
In this case the socket shouldn't be reused at all. Is setting socket.setTimeout(0) the right way to do it?
Take a look at the socket docs: socket.setTimeout(0) doesn't mean "time out immediately", it disables the idle timeout, so the socket never times out.
Sorry, I have limited understanding of the network internals. In undici (https://github.com/nodejs/undici/pull/291/files) if the keepAliveTimeout goes down to 0 they flag the connection as reset:
if (!keepAliveTimeout || keepAliveTimeout < 1e3) {
  client[kReset] = true
}
I don't know how to achieve the same result here. Maybe:
socket.setKeepAlive(false);
?
I also have a limited understanding of network internals, but I think:
if (!agentTimeout || serverHintTimeout < agentTimeout) {
is what you want.
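As a minimal sketch of the condition suggested above, the choice between the agent timeout and the server hint can be written as a pure function. The helper name and its parameters are hypothetical and are not part of lib/_http_agent.js; the sketch assumes both values are already in milliseconds and that an agent timeout of 0 means "not set".

```javascript
'use strict';

// Hypothetical helper for illustration only (not Node.js internals).
// agentTimeoutMs: the agent's own timeout option; 0 or undefined = not set.
// serverHintMs:   the server's Keep-Alive hint, already converted to ms.
function pickSocketTimeout(agentTimeoutMs, serverHintMs) {
  // The original check was `serverHintMs < agentTimeoutMs`. With an unset
  // agent timeout (0) that is never true, so the hint was ignored.
  if (!agentTimeoutMs || serverHintMs < agentTimeoutMs) {
    return serverHintMs;
  }
  return agentTimeoutMs;
}

console.log(pickSocketTimeout(0, 4000));     // agent timeout unset: hint is used
console.log(pickSocketTimeout(10000, 4000)); // hint is shorter: hint is used
console.log(pickSocketTimeout(2000, 4000));  // agent timeout is shorter: it is used
```

The point of the extra `!agentTimeoutMs` test is only to let the hint apply when no agent timeout was configured; the "shorter value wins" rule is unchanged.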
socket.setTimeout(server.keepAliveTimeout);
// Increase the internal timeout wrt the advertised value to reduce likeliwood of ECONNRESET errors
// due to race conditions between the client and server timeout calculation
socket.setTimeout(server.keepAliveTimeout + 1000);
I don't think this server change is necessary.
Might be the cause of all the test failures.
This fix protects clients that use the hint timeout as is (the current Node.js implementation) without adjusting it for network or CPU-load delays.
If server.keepAliveTimeout is set to 0 (never time out) this will change it to 1 second.
If server.keepAliveTimeout == 0, then the if condition on line 1012 is false. The timeout on the socket is set only if server.keepAliveTimeout != 0.
You're correct, sorry I missed that.
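The exchange above can be illustrated with a small sketch of the server-side rule: the internal idle timeout gets the extra second only when a keep-alive timeout is configured, so 0 still means "never time out". The helper and constant names are hypothetical, and the 1000 ms buffer is the value hard-coded in this PR.

```javascript
'use strict';

// Assumed buffer; the PR adds a fixed 1 second on the server side.
const SERVER_BUFFER_MS = 1000;

// Hypothetical helper for illustration only (not Node.js internals).
// Returns the idle timeout the server would apply to the socket, given the
// advertised keepAliveTimeout in milliseconds.
function internalIdleTimeout(keepAliveTimeoutMs) {
  if (!keepAliveTimeoutMs) {
    // Mirrors the `if (server.keepAliveTimeout && ...)` guard: with 0 the
    // timeout is never set, so the buffer is not applied either.
    return 0;
  }
  return keepAliveTimeoutMs + SERVER_BUFFER_MS;
}

console.log(internalIdleTimeout(0));    // stays disabled
console.log(internalIdleTimeout(5000)); // advertised 5 s, socket kept 6 s
```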
lib/_http_agent.js
Outdated
const serverHintTimeout = NumberParseInt(hint) * 1000;
// Let the timer expires before the announced timeout to reduce
// the likelihood of ECONNRESET errors
let serverHintTimeout = ( NumberParseInt(hint) * 1000 ) - 1000;
If the server responds with a 1-second keep-alive, this will disable the timeout because serverHintTimeout will be 0. Maybe go with - 500 in place of - 1000?
serverHintTimeout = serverHintTimeout > 0 ? serverHintTimeout : 0;
if (serverHintTimeout === 0) {
  // cannot safely reuse the socket because the server timeout is too short
  canKeepSocketAlive = false;
I have updated the PR to just skip keep-alive if the timeout is too short. In my local build the tests now pass.
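The client-side behavior described in this thread can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in lib/_http_agent.js: the helper name, the return shape, and the header parsing are hypothetical, and the 1000 ms margin is the value hard-coded in the PR.

```javascript
'use strict';

// Assumed safety margin; the PR subtracts a fixed 1 second from the hint.
const HINT_MARGIN_MS = 1000;

// Hypothetical helper for illustration only.
// keepAliveHeader: value of the response's Keep-Alive header, e.g. 'timeout=5'.
// agentTimeoutMs:  the agent's timeout option; 0 = not set.
// Returns { reuse, timeoutMs }: whether the socket may go back to the pool
// and which idle timeout to apply to it.
function planSocketReuse(keepAliveHeader, agentTimeoutMs) {
  // Simplified parsing (an assumption): only the timeout parameter is read.
  const match = /timeout=(\d+)/.exec(keepAliveHeader || '');
  if (!match) {
    // No hint from the server: keep the agent's own setting.
    return { reuse: true, timeoutMs: agentTimeoutMs };
  }

  // Expire before the announced timeout, clamped at 0.
  const hintMs = Math.max(Number.parseInt(match[1], 10) * 1000 - HINT_MARGIN_MS, 0);
  if (hintMs === 0) {
    // The server may close the socket at any moment: do not reuse it.
    return { reuse: false, timeoutMs: 0 };
  }

  // Honor the hint when no agent timeout is set or when the hint is shorter.
  const useHint = !agentTimeoutMs || hintMs < agentTimeoutMs;
  return { reuse: true, timeoutMs: useHint ? hintMs : agentTimeoutMs };
}

console.log(planSocketReuse('timeout=1', 0));    // too short: socket not reused
console.log(planSocketReuse('timeout=5', 0));    // reused with the adjusted hint
console.log(planSocketReuse('timeout=5', 2000)); // reused with the agent timeout
```

Treating an adjusted hint of 0 as "do not reuse" avoids passing 0 to socket.setTimeout, which would disable the timeout rather than expire the socket immediately.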
8407fd7
to
bc34680
Compare
Found the same issue in dotnet; they went with a 1-second offset as well, so that was a good choice. 👍
ba0d895
to
2124b55
Compare
LGTM!
reduce likelihood of race conditions on keep-alive timeout calculation between http1.1 servers and clients and honor server keep-alive timeout when agentTimeout is not set

Fixes: nodejs#47130
Fixes: nodejs#52649
To me, it looks like this should be a testable behavior change.
I had a short look, and I guess that if test-http-client-keep-alive-hint.js were a better test it would actually fail now, because the case it tests now behaves quite differently (there is no keep-alive because the hint is just 1 second).
I think a test somewhat along these lines would pass on main and fail on this branch:
'use strict';
const common = require('../common');
const assert = require('assert');
const http = require('http');

const server = http.createServer(
  { keepAliveTimeout: common.platformTimeout(60000) },
  function(req, res) {
    req.resume();
    res.writeHead(200, { 'Connection': 'keep-alive', 'Keep-Alive': 'timeout=1' });
    res.end('FOO');
  }
);

server.listen(0, common.mustCall(() => {
  let shouldStillBeAlive = true;
  setTimeout(() => {
    shouldStillBeAlive = false;
  }, common.platformTimeout(500)); // 500ms buffer

  const req = http.get({ port: server.address().port }, (res) => {
    assert.strictEqual(res.statusCode, 200);
    res.resume();
  });

  req.on('socket', (socket) => {
    socket.on('close', common.mustCall(() => {
      if (shouldStillBeAlive) {
        assert.fail('socket prematurely closed');
      }
      server.close();
    }));
  });
}));

// This timer should never go off as the agent will parse the hint and terminate earlier
setTimeout(common.mustNotCall(), common.platformTimeout(3000)).unref();
I guess it would be good to write a similar test (hopefully a version that relies less on timers) that verifies that we actually latency-adjust the hint-based keep-alive timeout (the adjustment is currently hard-coded to 1s)?
@@ -1010,7 +1010,9 @@ function resOnFinish(req, res, socket, state, server) {
    }
  } else if (state.outgoing.length === 0) {
    if (server.keepAliveTimeout && typeof socket.setTimeout === 'function') {
      socket.setTimeout(server.keepAliveTimeout);
      // Increase the internal timeout wrt the advertised value to reduce likeliwood of ECONNRESET errors
Suggested change:
- // Increase the internal timeout wrt the advertised value to reduce likeliwood of ECONNRESET errors
+ // Increase the internal timeout wrt the advertised value to reduce likelihood of ECONNRESET errors
const serverHintTimeout = NumberParseInt(hint) * 1000;

if (serverHintTimeout < agentTimeout) {
// Let the timer expires before the announced timeout to reduce
Suggested change:
- // Let the timer expires before the announced timeout to reduce
+ // Let the timer expire before the announced timeout to reduce
Hey, this valuable PR fixes many similar issues for |
Could you please add a test?
Hi, I'd like to help this land to resolve similar issues. Can I continue the work and add @zanettea as a co-author? To me, only test cases remain to be added.
Added a 1-second threshold to the keep-alive timeout on the client side (the socket timer expires 1 second before the timeout announced by the server).
Added a 1-second threshold to the keep-alive timeout on the server side (the socket timeout expires 1 second after the announced timeout).
It would probably be better to use a configurable threshold, like keepAliveTimeoutThreshold in undici (nodejs/undici#291).
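A configurable threshold could look roughly like the sketch below. The function is hypothetical, and the option name is borrowed from undici purely for illustration; this PR does not add such an option to Node's http.Agent, where the value is hard-coded to 1 second.

```javascript
'use strict';

// Hypothetical helper for illustration only. `keepAliveTimeoutThreshold`
// mirrors the undici option name; it is an assumption here, not an
// http.Agent option.
// hintSeconds: the timeout value from the server's Keep-Alive header.
// Returns the client-side idle timeout in ms, or 0 when the hint is too
// short for the socket to be reused safely.
function effectiveHintMs(hintSeconds, { keepAliveTimeoutThreshold = 1000 } = {}) {
  const ms = hintSeconds * 1000 - keepAliveTimeoutThreshold;
  return ms > 0 ? ms : 0;
}

console.log(effectiveHintMs(5));                                     // default threshold
console.log(effectiveHintMs(1));                                     // hint equals threshold
console.log(effectiveHintMs(1, { keepAliveTimeoutThreshold: 500 })); // smaller threshold
```

With a 500 ms threshold, as floated earlier in the review, a 1-second hint would still leave a usable window, whereas the default threshold reduces it to 0.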