Skip to content

Consider if List<double> and possibly List<int> should be backed by dart:typed_data  #54503

Open
@mraleph

Description

@mraleph

It's always surprising to users that List<double> has poor performance characteristics, because it uses boxed representation. We could consider changing List factories to return different unboxed representation for double and possible int lists, e.g.

  1. Allocating non-growable List<double> simply returns Float64List.
  2. Growable List<double> returns a (new type) _GrowableFloat64List.

Similarly List<int> will map to Int64List and _GrowableInt64List.

As a benefit users would no longer be surprised by performance.

Draw backs:

  • Allocation sites List<T> where we can't statically exclude possibility of T being int or double become result-type-polymorphic.
  • List<int> memory size increases on 32-bit platforms and in compressed pointer mode.
  • Code size increases due to new classes being pulled in.

Alternatively we could consider a performance oriented lint which recommends Float64List whenever it sees a List<double> because it is almost never correct to use List<double>.

/cc @alexmarkov @mkustermann

Activity

added
area-vmUse area-vm for VM related issues, including code coverage, and the AOT and JIT backends.
on Jan 3, 2024
mkustermann

mkustermann commented on Jan 3, 2024

@mkustermann
Member

Generally in favor of doing something like this.

There's a few aspects I want to highlight

  • Increasing memory: The VM happens to be able to use tagged integers (aka Smis) and compressed pointers. Other backends may not be able to do the same (e.g. compiling to jvm) or would sacrifice speed/codesize when encoding top types with tagged integers (e.g. wasm i31). The argument isn't valid for doubles either, as all doubles have to be boxed and may actually reduce memory by moving to typed arrays.
  • Static handling of List<int> / List<double>: It seems a weird performance cliff iff List<T> with T==int doesn't behave the same. We could make it runtime dependent. Though how about a different idea:
  • We can expose growable equivalent types in the core libraries: Benefits
    • Users can decide whether they want packed representation or not
    • Users can use different non-int64/double variants: GrowableInt32List, GrowableFloat32List, ...
    • Somewhat of a generalization of our existing BytesBuilder class
    • We can decide if those should implement List (or not and possibly provide a asListView getter returning a wrapper)
  • There's the question of immutable / const lists: Doing it semi-automatically with existing types would allow making const <int>[1, 2, 3] a immutable Uint8List in the VM. Though it does have the other draw backs: Behaves differently on other backends, doesn't work for non-int/double variants, ... Ideally we'd support const typed data literals, see closed Const typed data (Int8List, et al.) #6378 - the main hold-back of that was JS which has heavy-weight typed data, but that may be improving with the move to wasm.
eernstg

eernstg commented on Jan 3, 2024

@eernstg
Member

I'd recommend using an implicit approach (just ask for a List<double> and get a Float64List) in any case where this is a consistent improvement, or nearly so. Also, it should be OK for an implementation to use a run-time test to make that decision, if that makes sense.

Developers would then only need to make the choice explicitly in all the cases where there is a real trade-off. It should be possible for a lint to flag exactly those cases.

The main question would be whether there are enough "consistent improvement" cases to justify this approach, and whether it's sufficiently easy to determine which cases are consistent improvements, and which ones are trade-offs.

The justification would be that we already do such things (e.g., based on redirecting factory constructors), and also that there's no reason to omit an improvement if we know we can get it.

mraleph

mraleph commented on Jan 3, 2024

@mraleph
MemberAuthor
  • Static handling of List<int> / List<double>: It seems a weird performance cliff iff List<T> with T==int doesn't behave the same. We could make it runtime dependent. Though how about a different idea:

I am not suggesting static handling - though maybe it was not clear from my proposal. Effectively I suggest:

if (T == int) return ...
if (T == double) return ...

In all factories.

That being said, I realized that static handling has the benefit that it does not introduce result-type-polymorphism for List<T> case.

Doing it semi-automatically with existing types would allow making const <int>[1, 2, 3] a immutable Uint8List in the VM.

I like this (automatically making types small) - but it was a problem: it introduces result type polymorphism. Consider for example:

final dataA = <int>[1, 2, 3];
final dataB = <int>[0xffff, 0xffff, 0xffff];

final input = c ? dataA : dataB;

use(input); // now polymorphic Uint8List | Uint16List.

There is no way of reliably magically determining optimal storage size, so I would leave this out of scope.

I think it would be better if we supported something like:

final data = const <Uint8>[1, 2, 3];  // Uint8List

This is not a valid program today (because it's type error to do this), which means it is actually possible to change language to allow this.

mkustermann

mkustermann commented on Jan 3, 2024

@mkustermann
Member

Anther aspect I haven't mentioned above is other collection types. They are unnecessarily slow (**) when used with primitive types. For example sets: <int>{1, 2, 3} or maps: <int, int>{1: 2, 2: 3}.

Analogous to above, there's also the const literal / immutable variants and the cases where we cannot see it statically.

(**) e.g. runtime/tools/heapsnapshot used to use Set<int> which can be made meaningfully faster (and more memory efficient) if specialized to integers.

added
type-performanceIssue relates to performance or code size
triagedIssue has been triaged by sub team
on Jan 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    area-vmUse area-vm for VM related issues, including code coverage, and the AOT and JIT backends.triagedIssue has been triaged by sub teamtype-performanceIssue relates to performance or code size

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @mraleph@mkustermann@a-siva@eernstg

        Issue actions

          Consider if List<double> and possibly List<int> should be backed by `dart:typed_data` · Issue #54503 · dart-lang/sdk