Description
It appears that engine.io is double-decoding (and double-encoding) UTF8 strings for polling clients. In particular, if a polling client specifies any content-type besides 'application/octet-stream', engine.io calls req.setEncoding('utf8'), so the request data is decoded as UTF8 and the resulting string is passed on to engine.io-parser, which then attempts to call utf8.decode() on it. Since the string has already been decoded from UTF8, this fails*. This is what I've observed from tinkering with the server side; from talking to @nuclearace about his issues, I suspect messages being sent to polling clients are double-encoded too.
The following demonstrates the issue (note that I added a console.log(e) in the exception handler where utf8.decode() is called):
> p.decodePayload('20:42["stringTest","π"]')
[Error: Invalid continuation byte]
Interestingly, this alternative version works:
> b = new Buffer([0x32,0x31,0x3a,0x34,0x32,0x5b,0x22,0x73,0x74,0x72,0x69,0x6e,0x67,0x54,0x65,0x73,0x74,0x22,0x2c,0x22,0xcf,0x80,0x22,0x5d])
<Buffer 32 31 3a 34 32 5b 22 73 74 72 69 6e 67 54 65 73 74 22 2c 22 cf 80 22 5d>
> b.toString()
'21:42["stringTest","π"]'
> p.decodePayload(b.toString('binary'), function (x) { console.log(x) })
{ type: 'message', data: '2["stringTest","π"]' }
So the problem, as far as I can tell, is that the parser expects a raw string (one character per byte, as produced by b.toString('binary') above), but the string coming from node's HTTP server has already been decoded from UTF8. The decoding should only happen once.
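To make the failure mode concrete, here is a minimal sketch that uses only Node built-ins. Note that decodeByteString is a hypothetical stand-in for utf8.decode(), not the library's actual implementation: it assumes the parser treats each character of its input as one byte and decodes strictly. It also assumes a Node build where TextDecoder supports the fatal option. 'latin1' is the same encoding as the 'binary' alias used above.

```javascript
// Hypothetical stand-in for utf8.decode(): treat each character of the
// input as a single byte, then decode those bytes strictly as UTF-8.
function decodeByteString(byteString) {
  const bytes = Buffer.from(byteString, 'latin1');
  return new TextDecoder('utf-8', { fatal: true }).decode(bytes);
}

const wire = Buffer.from('π', 'utf8');     // bytes cf 80, as sent by a single-encoding client
const raw = wire.toString('latin1');       // raw string: one character per byte
const decodedOnce = wire.toString('utf8'); // what req.setEncoding('utf8') hands over

// Decoding the raw string once works.
console.log(decodeByteString(raw) === 'π');

// Decoding the already-decoded string a second time fails.
try {
  decodeByteString(decodedOnce + '"');
  console.log('second decode succeeded');
} catch (e) {
  console.log('second decode failed');
}
```

The second call fails for the same reason as the decodePayload example: U+03C0 is not a byte, so whatever byte value it is reduced to no longer forms a valid UTF8 sequence with the character that follows.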
*I assume the client is double-encoding; otherwise polling would be completely broken. This issue affects 3rd-party clients that single-encode UTF8 strings.
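The assumed double-encoding round trip can be sketched as follows, again with Node built-ins only. This models the assumption in the footnote rather than confirmed client behaviour, and the variable names are illustrative.

```javascript
const text = 'π';

// Step 1 (client, assumed): encode to a UTF8 byte string, one character per byte.
const byteString = Buffer.from(text, 'utf8').toString('latin1');

// Step 2 (client HTTP layer, assumed): the byte string is itself sent as UTF8.
const wire = Buffer.from(byteString, 'utf8');

// Server: req.setEncoding('utf8') undoes step 2.
const afterHttp = wire.toString('utf8');

// Parser: the utf8.decode() step undoes step 1.
const afterParser = Buffer.from(afterHttp, 'latin1').toString('utf8');

console.log(wire.toString('hex'));  // c38fc280
console.log(afterParser === text);  // true
```

Under this assumption the two decodes on the server exactly cancel the two encodes on the client, which would explain why double-encoding clients work while single-encoding ones hit the error.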
EDIT: Cleaned up some terminology.