Skip to content

Consider if List<double> and possibly List<int> should be backed by dart:typed_data #54503

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
mraleph opened this issue Jan 3, 2024 · 4 comments
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. triaged Issue has been triaged by sub team type-performance Issue relates to performance or code size

Comments

@mraleph
Copy link
Member

mraleph commented Jan 3, 2024

It's always surprising to users that List<double> has poor performance characteristics, because it uses boxed representation. We could consider changing List factories to return different unboxed representation for double and possible int lists, e.g.

  1. Allocating non-growable List<double> simply returns Float64List.
  2. Growable List<double> returns a (new type) _GrowableFloat64List.

Similarly List<int> will map to Int64List and _GrowableInt64List.

As a benefit users would no longer be surprised by performance.

Draw backs:

  • Allocation sites List<T> where we can't statically exclude possibility of T being int or double become result-type-polymorphic.
  • List<int> memory size increases on 32-bit platforms and in compressed pointer mode.
  • Code size increases due to new classes being pulled in.

Alternatively we could consider a performance oriented lint which recommends Float64List whenever it sees a List<double> because it is almost never correct to use List<double>.

/cc @alexmarkov @mkustermann

@mraleph mraleph added the area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. label Jan 3, 2024
@mkustermann
Copy link
Member

Generally in favor of doing something like this.

There's a few aspects I want to highlight

  • Increasing memory: The VM happens to be able to use tagged integers (aka Smis) and compressed pointers. Other backends may not be able to do the same (e.g. compiling to jvm) or would sacrifice speed/codesize when encoding top types with tagged integers (e.g. wasm i31). The argument isn't valid for doubles either, as all doubles have to be boxed and may actually reduce memory by moving to typed arrays.
  • Static handling of List<int> / List<double>: It seems a weird performance cliff iff List<T> with T==int doesn't behave the same. We could make it runtime dependent. Though how about a different idea:
  • We can expose growable equivalent types in the core libraries: Benefits
    • Users can decide whether they want packed representation or not
    • Users can use different non-int64/double variants: GrowableInt32List, GrowableFloat32List, ...
    • Somewhat of a generalization of our existing BytesBuilder class
    • We can decide if those should implement List (or not and possibly provide a asListView getter returning a wrapper)
  • There's the question of immutable / const lists: Doing it semi-automatically with existing types would allow making const <int>[1, 2, 3] a immutable Uint8List in the VM. Though it does have the other draw backs: Behaves differently on other backends, doesn't work for non-int/double variants, ... Ideally we'd support const typed data literals, see closed Const typed data (Int8List, et al.) #6378 - the main hold-back of that was JS which has heavy-weight typed data, but that may be improving with the move to wasm.

@eernstg
Copy link
Member

eernstg commented Jan 3, 2024

I'd recommend using an implicit approach (just ask for a List<double> and get a Float64List) in any case where this is a consistent improvement, or nearly so. Also, it should be OK for an implementation to use a run-time test to make that decision, if that makes sense.

Developers would then only need to make the choice explicitly in all the cases where there is a real trade-off. It should be possible for a lint to flag exactly those cases.

The main question would be whether there are enough "consistent improvement" cases to justify this approach, and whether it's sufficiently easy to determine which cases are consistent improvements, and which ones are trade-offs.

The justification would be that we already do such things (e.g., based on redirecting factory constructors), and also that there's no reason to omit an improvement if we know we can get it.

@mraleph
Copy link
Member Author

mraleph commented Jan 3, 2024

  • Static handling of List<int> / List<double>: It seems a weird performance cliff iff List<T> with T==int doesn't behave the same. We could make it runtime dependent. Though how about a different idea:

I am not suggesting static handling - though maybe it was not clear from my proposal. Effectively I suggest:

if (T == int) return ...
if (T == double) return ...

In all factories.

That being said, I realized that static handling has the benefit that it does not introduce result-type-polymorphism for List<T> case.

Doing it semi-automatically with existing types would allow making const <int>[1, 2, 3] a immutable Uint8List in the VM.

I like this (automatically making types small) - but it was a problem: it introduces result type polymorphism. Consider for example:

final dataA = <int>[1, 2, 3];
final dataB = <int>[0xffff, 0xffff, 0xffff];

final input = c ? dataA : dataB;

use(input); // now polymorphic Uint8List | Uint16List.

There is no way of reliably magically determining optimal storage size, so I would leave this out of scope.

I think it would be better if we supported something like:

final data = const <Uint8>[1, 2, 3];  // Uint8List

This is not a valid program today (because it's type error to do this), which means it is actually possible to change language to allow this.

@mkustermann
Copy link
Member

Anther aspect I haven't mentioned above is other collection types. They are unnecessarily slow (**) when used with primitive types. For example sets: <int>{1, 2, 3} or maps: <int, int>{1: 2, 2: 3}.

Analogous to above, there's also the const literal / immutable variants and the cases where we cannot see it statically.

(**) e.g. runtime/tools/heapsnapshot used to use Set<int> which can be made meaningfully faster (and more memory efficient) if specialized to integers.

@a-siva a-siva added type-performance Issue relates to performance or code size triaged Issue has been triaged by sub team labels Jan 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. triaged Issue has been triaged by sub team type-performance Issue relates to performance or code size
Projects
None yet
Development

No branches or pull requests

4 participants