Allocating three times the size of the input seems excessive.

In some cases the `Encoding::decode_*` methods need to allocate a `String` and must decide how much capacity to give it. Other than for `*_without_replacement` (https://github.com/hsivonen/encoding_rs/commit/2984a8b0a310b52fe7112671c5fb94446a7f78f8#commitcomment-20990260), that capacity is based on `Encoding::max_utf8_buffer_length`, which assumes the worst case. For many encodings, that’s when every byte of the input is an error that emits a three-byte U+FFFD code point.
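To make the worst case concrete, here is a minimal sketch that uses only the standard library's `String::from_utf8_lossy` rather than encoding_rs itself. The helper name `lossy_decoded_len` is made up for illustration. It only shows the 3x expansion of the output when every input byte is an error; it says nothing about how encoding_rs computes its capacity.

```rust
/// Hypothetical helper (not part of encoding_rs): the number of UTF-8 bytes
/// produced when `input` is decoded as UTF-8 with U+FFFD replacement.
fn lossy_decoded_len(input: &[u8]) -> usize {
    // `from_utf8_lossy` replaces each invalid sequence with U+FFFD,
    // which takes three bytes in UTF-8 (EF BF BD).
    String::from_utf8_lossy(input).len()
}
```

With this helper, `lossy_decoded_len(&[0xFF; 4])` is 12, because each `0xFF` byte is an error on its own, while `lossy_decoded_len(b"abcd")` is 4.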

In short, as soon as there’s an error, these methods allocate *three times* the size of the (remaining) input. Assuming the worst case simplifies the code, which then only needs to allocate once, but it seems excessive that a single bit flip near the beginning of the input could triple memory usage.

So a more adaptive allocation scheme might be desirable, but admittedly there is no obvious answer as to what it should be.
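As one possible direction (not a proposal for the actual encoding_rs internals), the sketch below starts with capacity equal to the input length and lets `String` grow on demand when replacement characters are emitted. It handles UTF-8 input only and uses only the standard library. The function name `decode_lossy_adaptive` is hypothetical.

```rust
/// Hypothetical sketch: lossy UTF-8 decoding that reserves the input length
/// up front and relies on `String`'s amortized growth when errors occur,
/// instead of reserving three times the input length.
fn decode_lossy_adaptive(input: &[u8]) -> String {
    // Optimistic guess: valid UTF-8 input needs exactly `input.len()` bytes.
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    loop {
        match std::str::from_utf8(rest) {
            Ok(valid) => {
                out.push_str(valid);
                return out;
            }
            Err(err) => {
                let (valid, after) = rest.split_at(err.valid_up_to());
                // The prefix was just validated; re-checking it keeps the
                // sketch free of `unsafe` at the cost of a second pass.
                out.push_str(std::str::from_utf8(valid).expect("validated prefix"));
                // May reallocate, but only once an error has been seen.
                out.push('\u{FFFD}');
                match err.error_len() {
                    // Skip the invalid sequence and keep decoding.
                    Some(n) => rest = &after[n..],
                    // Input ended in the middle of a sequence.
                    None => return out,
                }
            }
        }
    }
}
```

The trade-off is the one described above: input with errors may trigger several reallocations instead of one, in exchange for not tripling memory usage on the first error.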

Allocating three times the size of the input seems excessive. #6
