Generate bytes with Uint8List #219

Open
truongsinh opened this issue Mar 12, 2019 · 15 comments
Labels
feature request perf Related to runtime performance

Comments

@truongsinh

truongsinh commented Mar 12, 2019

Ref: https://groups.google.com/a/dartlang.org/forum/#!topic/misc/JpNkRRLI9_w, daegalus/dart-uuid#35

More and more packages are explicitly switching to Uint8List

Right now, with the following proto

message BluetoothCharacteristic {
  bytes value = 6;
}

The generated Dart code (from protoc_plugin 16.0.1) is

import 'dart:core' show int, bool, double, String, List, Map, override;

import 'package:protobuf/protobuf.dart' as $pb;

class BluetoothCharacteristic extends $pb.GeneratedMessage {
  static final $pb.BuilderInfo _i = new $pb.BuilderInfo('BluetoothCharacteristic')
    ..a<List<int>>(6, 'value', $pb.PbFieldType.OY)
    ..hasRequiredFields = false
  ;

  BluetoothCharacteristic() : super();
  BluetoothCharacteristic.fromBuffer(List<int> i, [$pb.ExtensionRegistry r = $pb.ExtensionRegistry.EMPTY]) : super.fromBuffer(i, r);
  BluetoothCharacteristic.fromJson(String i, [$pb.ExtensionRegistry r = $pb.ExtensionRegistry.EMPTY]) : super.fromJson(i, r);
  BluetoothCharacteristic clone() => new BluetoothCharacteristic()..mergeFromMessage(this);
  BluetoothCharacteristic copyWith(void Function(BluetoothCharacteristic) updates) => super.copyWith((message) => updates(message as BluetoothCharacteristic));
  $pb.BuilderInfo get info_ => _i;
  static BluetoothCharacteristic create() => new BluetoothCharacteristic();
  BluetoothCharacteristic createEmptyInstance() => create();
  static $pb.PbList<BluetoothCharacteristic> createRepeated() => new $pb.PbList<BluetoothCharacteristic>();
  static BluetoothCharacteristic getDefault() => _defaultInstance ??= create()..freeze();
  static BluetoothCharacteristic _defaultInstance;

  List<int> get value => $_getN(0);
  set value(List<int> v) { $_setBytes(0, v); }
  bool hasValue() => $_has(0);
  void clearValue() => clearField(6);
}

We can see that we still have

  List<int> get value => $_getN(0);
  set value(List<int> v) { $_setBytes(0, v); }

Expected generated code:

  Uint8List get value => $_getN(0);
  set value(Uint8List v) { $_setBytes(0, v); }

One concern might be that this is more or less a breaking change in strong mode.

@sigurdm
Collaborator

sigurdm commented Mar 13, 2019

Can you be more specific?
GeneratedMessage.writeToBuffer already returns a Uint8List.

@truongsinh
Author

@sigmundch I updated the description to have current behavior and desired behavior.

@njskalski

Just hit the same wall.

@rajveermalviya

@sigurdm Any traction on this?

Most of the Dart core libraries have used Uint8List for bytes, instead of List<int>, since Dart v2.5.0.

ref: dart-lang/sdk#36900

@osa1 osa1 self-assigned this Apr 11, 2022
@osa1
Member

osa1 commented Jul 15, 2022

Relatedly, we currently do not convert List<int> values to Uint8List when a field is set. For example:

  @$pb.TagNumber(15)
  set optionalBytes($core.List<$core.int> v) {
    $_setBytes(14, v);
  }

which calls

/// For generated code only.
void $_setBytes(int index, List<int> value) => _fieldSet._$set(index, value);

So the value is stored as List<int>, which is wasteful because I suspect most subtypes won't be as compact as Uint8List. Also when serializing, we allocate a Uint8List every time we serialize a bytes field:

  case PbFieldType._BYTES_BIT:
    _writeBytesNoTag(
        value is TypedData ? value : Uint8List.fromList(value));
    break;

I think the "ideal" solution would be to take Uint8List values in setters and return Uint8List in getters, and store bytes values as Uint8List. Uint8List is the most precise type for what a "bytes" field is, and it's more efficient than storing each byte as an int.

That's a breaking change though, and I'm not sure if we can afford it internally. So a close second could be keeping the API as-is, but storing the values as Uint8List. I think technically this would not be considered a breaking change, though I wouldn't be surprised if it breaks some code, because someone sets a bytes field to an int array or something like that and downcasts the getter's return value.
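A minimal sketch of that "close second" option, assuming a hypothetical BytesHolder class standing in for a generated message (this is not the protobuf package's code): the public API keeps List<int>, but the setter normalizes whatever it receives to a Uint8List.

```dart
import 'dart:typed_data';

/// Hypothetical stand-in for a generated message with one bytes field.
class BytesHolder {
  Uint8List _value = Uint8List(0);

  // The public API still speaks List<int>, as the generated code does today.
  List<int> get value => _value;

  set value(List<int> v) {
    // Reuse the argument when it is already a Uint8List; otherwise copy it
    // into one. Uint8List.fromList keeps only the low 8 bits of each element.
    _value = v is Uint8List ? v : Uint8List.fromList(v);
  }
}

void main() {
  final holder = BytesHolder();
  holder.value = <int>[1, 2, 3];
  print(holder.value is Uint8List); // true
  print(holder.value); // [1, 2, 3]
}
```

One way this could still break callers: a Uint8List is fixed-length, so code that sets a growable list and later calls add on the getter's result would start throwing.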

@osa1 osa1 added the perf Related to runtime performance label Aug 10, 2022
@osa1
Member

osa1 commented Aug 11, 2022

> Also when serializing, we allocate a Uint8List every time we serialize a bytes field:

I just realized that this is not true, because Uint8List is a subtype of TypedData, so in the code

  case PbFieldType._BYTES_BIT:
    _writeBytesNoTag(
        value is TypedData ? value : Uint8List.fromList(value));
    break;

We don't allocate a Uint8List if the field value is already a Uint8List.

Another strange thing that I realized while reading the 4 lines shown above is that when we set a bytes field to a List<int> containing integers larger than the maximum value of a byte, truncation happens when serializing, not when setting the field. Example:

syntax = "proto3";

message M1 {
  bytes b = 1;
}

M1 m = M1();
m.b = <int>[99999];
print(m.b); // [99999]
print(m.writeToBuffer()); // [10, 1, 159]

This is strange because, if I send this message to another program, I would expect the local message value and the received message value to be the same. But if both the sender and the receiver use the bytes field for the same operation, they may do different things, as the bytes values are different.

Oh, and also, the JSON map and proto3 JSON serializers throw an exception when serializing this field. So the library is also inconsistent in how it handles such values.
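The truncation can be reproduced without the protobuf package. The sketch below uses a hypothetical helper, bytesAsSerialized, that mirrors the conversion in the quoted snippet (testing for Uint8List rather than TypedData, to keep the return type simple):

```dart
import 'dart:typed_data';

/// Hypothetical helper mirroring the quoted conversion: a value that is not
/// already a Uint8List is copied with Uint8List.fromList, which truncates
/// each element to its low 8 bits.
Uint8List bytesAsSerialized(List<int> value) =>
    value is Uint8List ? value : Uint8List.fromList(value);

void main() {
  final stored = <int>[99999];
  final onTheWire = bytesAsSerialized(stored);
  print(stored); // [99999]
  print(onTheWire); // [159], since 99999 & 0xFF == 159
}
```

The stored list keeps 99999 while the serialized payload carries 159, which matches the last byte of the writeToBuffer output shown above.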

@brandsimon

I understand that this could break someone's code and is not backwards compatible, but I think having a Uint8List for bytes is the correct way to do it. So how about adding a comment to the .proto file, or a command-line option for the protobuf compiler, that changes the resulting class? Then people can choose which type they want.

@osa1 osa1 removed their assignment Dec 30, 2022
@hagerf

hagerf commented Mar 19, 2023

Hi
Is there any development on this? I agree with everyone here that Uint8List is the proper type for bytes. But is it probable that these changes will be made anytime or are we probably stuck with current implementation for the foreseeable future? Thanks

@temeddix

This is indeed a serious issue. List<int> is known to be much slower than Uint8List. Any progress?

@xalanq

xalanq commented Nov 17, 2024

Any progress?

@markg85

markg85 commented Mar 2, 2025

I'm curious about progress here too. I'm using the Avif package to get AVIF images in Dart. It also made me profile a little, as loading half a dozen frames on screen gives a noticeable stutter. Now, if I'm reading the flame graph correctly, the conversion takes a very large part of the image processing. It seems that in this specific case converting the container formats takes more time than decoding the actual image.

This comes with a small caveat: I'm not entirely sure about this conclusion yet. This is a stack of tech that I'm not too familiar with.

Image

If my assumption is correct, which some testing does seem to confirm, then a Uint8List path would be very welcome.
I hacked the generated protobuf files a little (find/replace List<int> with Uint8List) and it works (partly) and makes the flame graph look like this now:

Image

However, another spot in the same code proves a bit trickier to fix in the same manner:

Image

This calls mergeFromBuffer (in the protobuf package) which looks like this:

  void mergeFromBuffer(List<int> input,
      [ExtensionRegistry extensionRegistry = ExtensionRegistry.EMPTY]) {
    final codedInput = CodedBufferReader(input);
    final meta = _fieldSet._meta;
    _mergeFromCodedBufferReader(meta, _fieldSet, codedInput, extensionRegistry);
    codedInput.checkLastTagWas(0);
  }

Replacing that with Uint8List does begin to break other packages I also have installed.

@osa1
Member

osa1 commented Mar 2, 2025

@markg85 There shouldn't be any performance issues caused by the Dart protobuf library accepting List<int> instead of Uint8List in bytes fields, as long as you always set bytes fields to Uint8Lists.

Can you show me what exactly you changed in the protobuf library that made a difference in the flame graphs?

Some notes:

  • The mergeFromBuffer code that you show is not an issue: CodedBufferReader decodes a Uint8List directly when it's passed a Uint8List. So if you pass it a Uint8List it won't copy the input:

    : _buffer = buffer is Uint8List ? buffer : Uint8List.fromList(buffer),

  • When decoding a bytes field you have to copy the bytes even when the input is a Uint8List, because otherwise changing the input contents will change the bytes field. That's why you see a Uint8List.fromList in your second flame graph. We can't get rid of this.

  • Avif may be unnecessarily copying frame data when decoding: https://github.com/yekeskin/flutter_avif/blob/2fa190d19d1732dd5725cfb08ebe8e1ff0c635a8/flutter_avif/lib/src/avif_image.dart#L1012

    I don't know what decodeImageFromPixels is doing, but if it's not keeping a reference to the Uint8List argument, you shouldn't need to copy frame.data before calling decodeImageFromPixels.

    Edit: I realize now that the reason why it's copying is because the bytes field getter is returning List<int> but the decoder is expecting Uint8List. Avif should do a type test to avoid copying: frame.data is Uint8List ? frame.data as Uint8List : Uint8List.fromList(frame.data).
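The type test and the defensive copy from the notes above can be sketched as follows. asUint8List is a hypothetical helper name, not something exported by the protobuf or Avif packages:

```dart
import 'dart:typed_data';

/// Hypothetical helper: returns [data] itself when it is already a
/// Uint8List, and only copies when it is some other kind of List<int>.
Uint8List asUint8List(List<int> data) =>
    data is Uint8List ? data : Uint8List.fromList(data);

void main() {
  final List<int> typed = Uint8List.fromList([1, 2, 3]);
  final List<int> plain = <int>[1, 2, 3];

  print(identical(asUint8List(typed), typed)); // true: no copy
  print(identical(asUint8List(plain), plain)); // false: copied

  // A defensive copy is independent of its source, which is why decoding
  // a bytes field copies out of the input buffer.
  final input = Uint8List.fromList([7, 8, 9]);
  final copy = Uint8List.fromList(input);
  input[0] = 0;
  print(copy[0]); // 7
}
```

Because the getter's static type is List<int>, the helper takes List<int>; when the runtime value is already a Uint8List, the call costs a type check and nothing else.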

@markg85

markg85 commented Mar 2, 2025

@osa1

Thank you so much for that detailed (educational!) response!

I was just about to grab the changes I made in the protobuf file when I noticed your edit :)
You're spot on!

> Edit: I realize now that the reason why it's copying is because the bytes field getter is returning List<int> but the decoder is expecting Uint8List. Avif should do a type test to avoid copying: frame.data is Uint8List ? frame.data as Uint8List : Uint8List.fromList(frame.data).

I did change the generated protobuf code to replace List<int> with Uint8List. After doing that, it indeed became possible to replace Uint8List.fromList(frame.data) with just frame.data in the call to decodeImageFromPixels. Sorry for not mentioning this before; that small additional change is probably what made the whole copy go away and changed the flame graph.

Regarding that last flame graph and the remark about mergeFromBuffer: what you say sounds logical. However, in the overall time spent within a frame, more time is spent in that function than in decoding the actual image. Is there perhaps a more efficient way of doing this?

For clarity, the flame graph showing this:

Image

In total, the time distribution is about this:
3/5 of the time is spent on layout (not shown)
2/5 on image-related things

The image shows that 2/5 part.
You can see the image decoding in there; it's about half the size of the mergeFromBuffer step. Now I know (well, from C++, that is) that image decoding should usually take the bulk of the time in rendering an image on screen; it's just the heaviest step. Memory-related operations (especially copies) should barely register compared to decoding. Yet here in Dart/protobuf the opposite is the case, which at least hints at something very inefficient going on.

@osa1
Member

osa1 commented Mar 2, 2025

> You can see the image decoding in there; it's about half the size of the mergeFromBuffer step. Now I know (well, from C++, that is) that image decoding should usually take the bulk of the time in rendering an image on screen; it's just the heaviest step. Memory-related operations (especially copies) should barely register compared to decoding. Yet here in Dart/protobuf the opposite is the case, which at least hints at something very inefficient going on.

My guess is that you have a lot of very small bytes fields, so the overhead of checking the length of the field, allocating a new Uint8List, and other cheap operations takes as much time as actually copying the bytes, because there are only a few bytes to copy. Could you check if this is the case?

Otherwise I don't know how else to explain this flame graph: readBytes literally just checks the field size, allocates a new Uint8List, and copies the bytes.
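For illustration, those three steps could look roughly like the sketch below. readBytesSketch and its offset/length parameters are hypothetical; this is not the protobuf package's actual readBytes implementation.

```dart
import 'dart:typed_data';

/// Hypothetical sketch of the three steps described above.
Uint8List readBytesSketch(Uint8List buffer, int offset, int length) {
  // 1. Check that the field fits inside the remaining input.
  if (length < 0 || offset < 0 || offset + length > buffer.length) {
    throw RangeError('bytes field of length $length overruns the buffer');
  }
  // 2. Allocate a new Uint8List of exactly the field's size.
  final result = Uint8List(length);
  // 3. Copy the bytes out of the input buffer.
  result.setRange(0, length, buffer, offset);
  return result;
}

void main() {
  // Tag 10 (field 1, length-delimited), length 3, then the payload.
  final buffer = Uint8List.fromList([10, 3, 1, 2, 3]);
  print(readBytesSketch(buffer, 2, 3)); // [1, 2, 3]
}
```

For a field of only a few bytes, the bounds check and the allocation are fixed costs that can rival the copy itself, which is the effect guessed at above.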

@markg85

markg85 commented Mar 2, 2025

Not that many, really.

This is loading "thumbnail-sized" images. With the device pixel ratio, a thumbnail still blows up to about 588x331, and there are about 40 images.

But the image decoding happens in that same time, which you can also see.
The images look like the one below, which is the actual size that would have been loaded. Mine are in AVIF format, though (say 20 KB per picture):

Image

The flame graph covers the span from not visible to fully rendered, so what you see includes the decoding of those images too.

10 participants