-
-
Notifications
You must be signed in to change notification settings - Fork 31.7k
stream: socket.setEncoding(null) to receive binary Buffers rather than strings has no effect #6038
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Perhaps this might be a bug in ReadableStream based on a cursory examination of https://github.com/nodejs/node/blob/master/lib/_stream_readable.js#L193 I haven't contributed to Node before but if it seems like a correct approach to null out the encoder if |
Doc change happened in #5155. @TriAnMan @silverwind @jasnell From what I can tell passing |
Hmm, that doc change looks wrong, actually. string_decoder does fallback to |
I've using this workaround for v4.3.1 and for v0.10.36 var processServerResponse = function (data) {
data.setEncoding(null); // Fix the bug with braked utf8 symbols, converted into two unicode REPLACEMENT CHARACTER
var str = '';
data.on('data', function (chunk) {
str += chunk;
});
data.on('end', function () {
// do something with str
});
};
http.request({/* some valid http options */}, processServerResponse).end(); I don't know what is a |
If I use This is for v4.3.1, can't test v0.10.36 right now. |
@mscdex ... as far as I know, utf8 is the default when setEncoding is null. If calling |
@jasnell IIRC it does not return buffers using |
It doesn't work with 'buffer' because it's not a valid encoding. Buffer.isEncoding('buffer')
> false The problem is caused by the falsy check in string_decoder.js as @silverwind already said. 'use strict';
const cp = require('child_process');
const child = cp.spawn('ls', { stdio:['ignore','pipe','ignore'] });
// This works
child.stdout.on('readable', () => {
let chunk;
while (null !== (chunk = child.stdout.read())) {
console.log('read %s', typeof chunk);
}
});
// Type will always be a string
child.stdout.setEncoding(null);
child.stdout.on('data', (chunk) => {
console.log('onData %s', typeof chunk);
}); |
This workaround also works – pretty hacky but does the trick until this is fixed (and backported?): // Simulate socket.setEncoding(null):
if (socket._readableState) {
delete socket._readableState.decoder
delete socket._readableState.encoding
} EDIT: Sorry, just noticed this is basically in the issue description – my bad for not reading fully! |
Make `.setEncoding()` reset streams to their default state of not decoding data. Fixes: nodejs#6038
PR to fix: #12766 |
The PR to fix appears stalled. A doc-only update as at least an interim would be welcome. Labeling |
A year on, and I'm still having to apply the hack to read binary :( |
@nodejs/streams |
Ping @nodejs/streams ... @mafintosh @mcollina ... any thoughts here? |
This needs to be done. It would have to wait until December :/ |
This seems to be an issue also with the I had to use the hack to read binary body to Buffer for HTTP request. I was using Node v10.13.0. |
@jheusala As long as you don't call |
@mscdex That's what I thought, too. However I have nowhere in my code another |
Strange. It started to work also on my complete app. I could swear it was receiving strings, not Buffers, last evening. |
@nodejs/streams ... what's the status on this one? Should this remain open? |
Should this behaviour be documented at least? |
@dchest I don’t know if this can be fixed, though. There’s a specific reason why my PR wasn’t merged – the decoding step happens at the time the data is pushed into the readable stream but before it is actually read, that means, the buffered data inside the stream is potentially already decoded. We could re-encode it, but the problem is that the result is potentially different from the original input data in that case. So I think this is something that would need to be documented, yes. |
@addaleax isn't that true for any |
@mhart Right, for other encodings this is problematic as well. I think the way to fix this would be to move decoding to the point where the chunks are emitted, rather than the point where they are pushed into the stream. That would enable solving this for both |
That sounds like a way forward... though probably a bit difficult to implement. |
I've tried node 5.8.0, 5.10.0 and 4.4.2. I always receive string objects in my data handler unless I manually remove the decoder from the socket readableState:
The text was updated successfully, but these errors were encountered: