Skip to content

Breaking Change Request: Make typed data classes sealed platform types. #45115

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
lrhn opened this issue Feb 25, 2021 · 20 comments
Closed

Breaking Change Request: Make typed data classes sealed platform types. #45115

lrhn opened this issue Feb 25, 2021 · 20 comments
Assignees
Labels
area-meta Cross-cutting, high-level issues (for tracking many other implementation issues, ...). breaking-change-request This tracks requests for feedback on breaking changes

Comments

@lrhn
Copy link
Member

lrhn commented Feb 25, 2021

Proposal

We propose to make the typed data classes (all types exposed by dart:typed_data except the BytesBuilder class) sealed, meaning, similarly to classes like int and String, no user- defined class can implement, extend or mix in the classes.

Rationale

The purpose of typed data (as opposed to plain List<int>s or List<double>s) is predictable good performance.
When the types can also be implemented by user classes, some optimizations become harder or more expensive.

A read like Uint8List l = ...; var u = l[0]; can be optimized to a direct memory read of one byte when we know that l is an actual Uint8List. The VM has high-performance code doing just this, but they always have to add a check for whether the list actually is a platform provided Uint8List, and fallback code for when it isn't. They cannot assume that the value will always be in the 0..255 range for range analysis because some other implementation of Uint8List might return a non-byte value.
It's possible to do better with JIT compilation, but that doesn't help AoT compilation.
It's possible to do better for whole-program compilation, but only if no class implementing Uint8List is anywhere in the program, and it doesn't help modular compilation.

Especially the Uint8List type is often used for platform integration and for isolate communication. Passing any non-standard implementation to a system call is unpredictable. It will either have to convert it to an actual byte array (a real Uint8List) before passing it on, adding overhead to the call, or reject the user-defined type (still adding at least one check as overhead).
That makes it misleading to allow user implementations of Uint8List.

Impact

There are currently no known classes which implement type data types outside of tests.
In tests, the implemented typed data classes are mocks, and are expected to be easily replaceable with an actual typed data list.

The impact is expected to be low, with only a few tests needing updates.

Mitigation

If some code is defining their own typed data implementations anyway, then they are likely to be used instead of platform typed data. Such classes can often just not implement the typed data type, but retain the similar API.
This is what Uint8Buffer and similar classes in package:typed_data does - provide a similar API with conversions to actual typed data, but without implementing the type.

The alternative is to fall back on List<int>, and only convert to, e.g., Uint8List when necessary.

@lrhn lrhn added the breaking-change-request This tracks requests for feedback on breaking changes label Feb 25, 2021
@eernstg
Copy link
Member

eernstg commented Feb 25, 2021

Sounds good!

If we introduce sealed class as a language mechanism, it would be good if these classes can appear to be sealed in exactly the same sense as they would be if declared as sealed class ..., and the same goes for int and a few other corelib classes with this kind of constraint. But I don't see any reason why that wouldn't be true.

@mit-mit mit-mit added the area-meta Cross-cutting, high-level issues (for tracking many other implementation issues, ...). label Feb 25, 2021
@leafpetersen
Copy link
Member

cc @franklinyow

@franklinyow
Copy link
Contributor

CC: @mit-mit @vsmenon @kevmoo @Hixie @matanlurey. for approval

@Hixie
Copy link
Contributor

Hixie commented Mar 2, 2021

I wish we would do the language change first, but either way, this LGTM.

@matanlurey
Copy link
Contributor

LGTM

@mit-mit
Copy link
Member

mit-mit commented Mar 3, 2021

SGTM

@vsmenon
Copy link
Member

vsmenon commented Mar 3, 2021

LGTM

@franklinyow
Copy link
Contributor

Approved

@franklinyow franklinyow assigned lrhn and unassigned franklinyow Mar 3, 2021
@mraleph
Copy link
Member

mraleph commented Mar 4, 2021

I would like to clarify @lrhn's rationale, because it gives a false impression that we would be able to generate faster code when this lands.

VM's AOT compiler already has an optimisation which it applies to a[i] and a[i] = v accesses when static type of a is a typed data list type ({Uint,Int}{8,16,32,64}List), which is not extended or implemented by any other class in the program, and concrete type information about a is not available.

Making these types sealed is not going to improve this optimisation, but it guarantees that user code will not accidentally break this optimisation (and helps hypothetical modular compilation case). Currently, if some package implements or extends a typed data list that might negatively impact performance of many unrelated typed list accesses in the whole program (except those where TFA can infer concrete type of the receiver).

A read like Uint8List l = ...; var u = l[0]; can be optimised to a direct memory read of one byte when we know that l is an actual Uint8List.

It is worth noting that VM has multiple internal implementations of Uint8List (internal, external and views), however their memory layout is unified in a way which makes it possible to emit the same instruction sequence for a load and store even when the concrete type of the instance is not known. This instruction sequence is two loads: l[0] becomes (l.dataPtr)[0] (for internal typed data dataPtr would be an inner pointer pointing into the object l itself). One load instruction sequences can only be emitted when l is known to be an instance of internal typed data, in which case you can emit (addressOf(l) + headerSize)[0] for reads.

@lrhn
Copy link
Member Author

lrhn commented Mar 4, 2021

Thanks @mraleph

Yes, this does not enable any new optimizations that weren't already possible in JIT code, or even in AOT code. It just ensures that the optimization can always be applied, and nothing can interfere with it (by disabling it, or by adding extra necessary checks prior to using the optimized code).

This is about making the improved performance predictable, and hopefully accessible to modular compilation too.

lrhn added a commit to lrhn/flutter that referenced this issue Apr 8, 2021
The test `test/widgets/image_resolution_test.dart` has a test class implementing `ByteData`.
Dart will make it a compile-time error to extend/implement/mix-in most of the types of `dart:typed_data` (breaking change: dart-lang/sdk#45115, implementation: https://dart-review.googlesource.com/c/sdk/+/192186), so this test needs to stop doing so.

This is an attempt at a *minimal change* (because I am not familiar with the code at all). It's definitely possible to make other simplicifcations (like inlining the extension).
@franklinyow
Copy link
Contributor

@lrhn What is the status of this change?

@lrhn
Copy link
Member Author

lrhn commented May 7, 2021

This change has landed within the last few days. It seems like it will stick.

@mit-mit
Copy link
Member

mit-mit commented May 7, 2021

OK, closing then. Kindly re-open should it get reverted.

This should go out in the 2.14 stable release.

@ilap
Copy link

ilap commented May 24, 2021

My packages heavily use and depend on non-sealed Uint8List class, and will be completely useless without that feature.

Is there any way to overcome on this restriction somehow?

/// `ByteList` is the base of the PineNaCl cryptographic library,
/// which is based on the unmodifiable Uin8List class
class ByteList with ListMixin<int>, Encodable implements Uint8List {
  ByteList(Iterable<int> bytes, [int? bytesLength])
      : _u8l = _constructList(
            bytes, bytesLength ?? bytes.length, bytesLength ?? bytes.length);

  ByteList.fromList(List<int> list,
      [int minLength = _minLength, int maxLength = _maxLength])
      : _u8l = _constructList(list, minLength, maxLength);

  factory ByteList.decode(String data, [Encoder defaultDecoder = decoder]) {
    return defaultDecoder.decode(data);
  }

  static const _minLength = 0;

  // Maximum message/bytes' length
  static const _maxLength = 16384;

  final Uint8List _u8l;

  static Uint8List _constructList(
      Iterable<int> list, int minLength, int maxLength) {
    if (list.length < minLength || list.length > maxLength) {
      throw Exception(
          'The list length (${list.length}) is invalid (min: $minLength, max: $maxLength)');
    }
    return UnmodifiableUint8ListView(Uint8List.fromList(list.toList()));
  }

  // Default encoder/decoder is the HexCoder()
  static const decoder = HexCoder.instance;

  @override
  Encoder get encoder => decoder;

  // Original getters/setters
  @override
  set length(int newLength) => _u8l.length = newLength;

  @override
  int get length => _u8l.length;

  @override
  int operator [](int index) => _u8l[index];

  @override
  operator []=(int index, value) {
    _u8l[index] = value;
  }

  @override
  ByteBuffer get buffer => _u8l.buffer;

  @override
  int get elementSizeInBytes => _u8l.elementSizeInBytes;

  @override
  int get lengthInBytes => _u8l.lengthInBytes;

  @override
  int get offsetInBytes => _u8l.offsetInBytes;

  @override
  bool operator ==(Object other) {
    var isEqual = identical(this, other) ||
        other is ByteList &&
            runtimeType == other.runtimeType &&
            length == other.length;

    if (!isEqual) return false;

    for (var i = 0; i < length; i++) {
      if (this[i] != (other as List)[i]) return false;
    }
    return true;
  }

  @override
  ByteList sublist(int start, [int? end]) {
    final sublist = _u8l.sublist(start, end ?? _u8l.length);
    return ByteList(sublist, sublist.length);
  }
}

mixin Suffix on ByteList {
  late final int prefixLength;
  ByteList get prefix => ByteList(take(prefixLength), prefixLength);
  ByteList get suffix => ByteList(skip(prefixLength), length - prefixLength);
}

@kevmoo
Copy link
Member

kevmoo commented May 24, 2021

Have you looked at using extensions, @ilap ?

There are a few types that effectively NEED to be sealed, for performance reasons. int, String, bool were already on this list.

@ilap
Copy link

ilap commented May 24, 2021

Have you looked at using extensions, @ilap ?

There are a few types that effectively NEED to be sealed, for performance reasons. int, String, bool were already on this list.

I think I will implement ByteList as List<int>, which would work I think.

@mraleph
Copy link
Member

mraleph commented May 25, 2021

@ilap if you want your code to be performant then you should pass around your ByteList object, but whenever you want to work with underlying bytes you should unwrap it into underlying Uint8List without copying, but simply by returning underlying list (e.g. get asTypedList => _u8l;). I notice that one of the approaches you are considering is to do Uint8List.fromList(byteList) - this is not going to have good performance.

@ilap
Copy link

ilap commented May 25, 2021

@ilap if you want your code to be performant then you should pass around your ByteList object, but whenever you want to work with underlying bytes you should unwrap it into underlying Uint8List without copying, but simply by returning underlying list (e.g. get asTypedList => _u8l;). I notice that one of the approaches you are considering is to do Uint8List.fromList(byteList) - this is not going to have good performance.

Hi, I am not considering copying bytes, but only want to use the underlying Uint8List.
Until the breaking change, it worked perfectly as ByteList was handled as a list, Uint8List to be precise, in any functions/classes.

Also, I have already added a similar getter to access to it directly, but all functions/classes must be changed to reflect that change, as ByteList would not be a list anymore. Therefore, I was thinking/considering to extends ByteList w/ List<int> (implements List) instead, but not sure what are the drawbacks for that scenario. Though, it works, but not sure.

Performance wise I got similar benchmark result, but it seems to me a little bit dirty-hacking.

@yd4011439
Copy link

Why not just provide new classes like NativeUint8List and not break so many existing projects.

@mraleph
Copy link
Member

mraleph commented Nov 4, 2024

@yd4011439 you are commenting on a change that has been executed in 2021. It is done.

The rationale is explained above: we wanted types like Uint8List to be fast, which helps all the code already written using these types at the expense of breaking some small amount of packages which made a mistake subclassing these.

not break so many existing projects.

To the best of my knowledge this has not been very breaking. You might be actually thinking about a different breaking change in the same area, which was much more breaking: #53218

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-meta Cross-cutting, high-level issues (for tracking many other implementation issues, ...). breaking-change-request This tracks requests for feedback on breaking changes
Projects
None yet
Development

No branches or pull requests